
[TIGGE #CUA-629523]: Re: dataportal not receiving data from tigge-ldm.ecmwf.int



David & Manuel,

> Regarding some of your questions:
> - ping and traceroute are stopped by our firewall. tigge-ldm can only
> communicate via the LDM port, so the only option we can use is ldmping(1).

One way to test connectivity to an LDM besides using ldmping(1) is to use 
rpcinfo(1).  David, what does the following command show when executed on 
Dataportal?

    rpcinfo -n 388 -t tigge-ldm.ecmwf.int 300029

Because the rpcinfo(1) utility is non-standard, you might have to adjust the 
options slightly.
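
If rpcinfo(1) isn't available on Dataportal, a plain TCP connect to port 388 
is a rough substitute.  It only exercises the TCP handshake and not the RPC 
layer that rpcinfo(1) checks, so treat it as a partial test.  The following is 
a minimal Python sketch; the function name probe_tcp and the 5-second timeout 
are my own assumptions and are not part of the LDM package.

```python
import socket


def probe_tcp(host, port=388, timeout=5.0):
    """Attempt a TCP connection and report how the attempt ended.

    "connected" : the three-way handshake completed
    "refused"   : the host answered but nothing accepted on that port
    "timed out" : no answer arrived in time (often a sign of packet filtering)
    """
    try:
        # create_connection() resolves the host name and performs the handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timed out"
    except OSError as err:
        # Name-resolution failures, unreachable networks, and so on.
        return f"error: {err}"


# Hypothetical usage on Dataportal:
#     print(probe_tcp("tigge-ldm.ecmwf.int"))
```

A "timed out" result would be consistent with packets being dropped somewhere 
along the path, whereas "refused" would suggest that the host was reached but 
that no LDM server was listening.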

> - I agree that once we go live we will need operational procedures. Our
> operators will be aware of TIGGE and they will call us when necessary.
> But for now, we are not operational.

Understood.

> Now, how can I investigate what is going on ?

At 2006-04-13 18:46 UTC, when David was executing an ldmping(1) on Dataportal 
trying to connect to Tigge-ldm's LDM, the LDM log file on Tigge-ldm showed 
nothing regarding Dataportal except that the downstream LDM on Tigge-ldm was 
waiting for data from Dataportal.  Periodic IS_ALIVE inquiries from Tigge-ldm 
to Dataportal were occurring successfully.  This means that an LDM on Tigge-ldm 
could successfully connect to the LDM server on Dataportal but not vice versa 
(I'm thinking "out loud" here).  This means that an outgoing TCP connection to 
port 388 from Tigge-ldm could work but that an incoming one to port 388 
couldn't.  It appears, therefore, that the network that Tigge-ldm is on is 
preventing incoming connections to port 388.  The prevention is odd, however, 
because netstat(1) shows many such connection attempts from Dataportal in the 
SYN_RECVD state, which means that Dataportal attempted to connect to port 388 
on Tigge-ldm but that Tigge-ldm never received the subsequent ACK 
from Dataportal to complete the connection.  This could be because Tigge-ldm's 
SYN and ACK response to Dataportal's SYN was never received by Dataportal or 
because Dataportal's subsequent ACK was never received by Tigge-ldm.  Because 
the initial SYN from Dataportal to Tigge-ldm was received by Tigge-ldm, 
however, I'm inclined to believe that the problem is caused either by Tigge-ldm 
not sending out a SYN and ACK in response to Dataportal's SYN or by Tigge-ldm's 
network not allowing the SYN and ACK to reach the random origination port on 
Dataportal.
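
To keep an eye on the half-open connections described above, the netstat(1) 
output on Tigge-ldm could be summarized per remote host.  The following Python 
sketch is only illustrative.  It assumes Linux-style "netstat -an" columns 
(protocol, receive queue, send queue, local address, remote address, state), 
which might not match the output format on Tigge-ldm, and the function name 
half_open_by_peer is hypothetical.  Because the spelling of the state varies by 
platform, several spellings are accepted.

```python
import re
from collections import Counter

# Spellings of the "SYN received, final ACK still missing" state.
HALF_OPEN_STATES = {"SYN_RECV", "SYN_RCVD", "SYN_RECVD"}

# Matches "host:port" as well as the "host.port" form some platforms print.
ENDPOINT = re.compile(r"^(?P<host>.+)[:.](?P<port>\d+)$")


def half_open_by_peer(netstat_text, port=388):
    """Count half-open connections to the local port, keyed by remote host."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local remote state
        if len(fields) != 6 or fields[5] not in HALF_OPEN_STATES:
            continue
        local = ENDPOINT.match(fields[3])
        remote = ENDPOINT.match(fields[4])
        if local and remote and int(local.group("port")) == port:
            counts[remote.group("host")] += 1
    return counts


# Hypothetical usage on Tigge-ldm:
#     import subprocess
#     text = subprocess.run(["netstat", "-an"], capture_output=True,
#                           text=True).stdout
#     print(half_open_by_peer(text))
```

A count for Dataportal's address that stays above zero across repeated runs 
would agree with the handshake never completing.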

Manuel, verify that any firewall rules on Tigge-ldm will allow incoming 
connections to port 388 from an arbitrary, remote port.

I suspect, however, that network interference is more likely.  Unfortunately, 
I'm not an expert in diagnosing network problems.  I'll talk to some people who 
are, however, and communicate with you soon.

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: CUA-629523
Department: Support IDD TIGGE
Priority: Normal
Status: On Hold