20030624: HDS feed to/from seistan (cont.)



>From: Robert Leche <address@hidden>
>Organization: LSU
>Keywords: 200306161954.h5GJs2Ld016710 LDM-6 IDD


Bob,

>If it came down to it, is it feasible to change the packet length in
>the application programs to a smaller value.

One of the big benefits of LDM-6 was the change from blocking RPC
transactions to non-blocking RPC.  Along with this came the notion of
sending the entire product as a single "chunk" and letting the network
layers worry about breaking it into appropriately sized TCP packets.
This works very nicely (and much more efficiently) at every LDM-6
installation _except_ LSU.  Why there is a problem at LSU is the
mystery to be solved.  From the email I received from ULM, it appears
that their data feed problems date back almost a full year, and
certainly span the LSU and ULM upgrades from LDM-5 to LDM-6, so the
problem is not in the LDM-6 approach to moving data.

>Could wx data types  such
>as HDS or other large packet data types  operate with the packet size
>reduced.

One of the major improvements in LDM-6 was to stop worrying about
sending chunks of data no larger than 16K bytes and to simply send the
entire product as one unit, letting TCP worry about its packetization.
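
To make the contrast concrete, here is a minimal sketch in Python.  The
per-chunk acknowledgment is a hypothetical stand-in for the old blocking
RPC round trip; none of this is the actual LDM source:

    import socket

    CHUNK = 16 * 1024  # the old <= 16K ceiling on a single transfer unit

    def send_product_old_style(sock: socket.socket, product: bytes) -> None:
        # LDM-5 flavor: send a piece, then block until the receiver
        # acknowledges it before sending the next piece.
        for off in range(0, len(product), CHUNK):
            sock.sendall(product[off:off + CHUNK])
            sock.recv(1)  # stall for a 1-byte ack (hypothetical protocol)

    def send_product_ldm6_style(sock: socket.socket, product: bytes) -> None:
        # LDM-6 flavor: hand the entire product to the kernel in one call
        # and let TCP break it into segments as the path requires.
        sock.sendall(product)

The old style pays a network round trip for every 16K chunk; the new
style streams the whole product and pays essentially none.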

>In reality,  application data is always fragmented.  As an
>example, the Ethernet port that plugs into your computer has a 1500
>byte payload limit.

That is correct.  It is also the reason why the user application should
not need to worry about the size of the chunks it sends.

>So fragmentation is always part of the TCP/IP assembly-reassembly process.

Again, you are correct.  And the application should not have to worry
about sending smaller chunks, especially when the products happen to be
larger than the path MTU.
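
This is easy to demonstrate: a single send of a buffer far larger than
an Ethernet frame arrives intact, because the stack handles all of the
segmentation and reassembly.  A self-contained Python sketch (loopback,
so no real 1500-byte MTU is in play, but the principle is identical):

    import socket
    import threading

    PRODUCT = b"x" * 100_000          # far larger than one 1500-byte frame
    total = [0]

    def sink(listener: socket.socket) -> None:
        conn, _ = listener.accept()
        while True:
            data = conn.recv(65536)   # arrives reassembled and in order
            if not data:
                break
            total[0] += len(data)
        conn.close()

    listener = socket.create_server(("127.0.0.1", 0))
    t = threading.Thread(target=sink, args=(listener,))
    t.start()

    client = socket.create_connection(listener.getsockname())
    client.sendall(PRODUCT)           # one call; the kernel segments it
    client.close()
    t.join()
    assert total[0] == len(PRODUCT)   # the application never saw fragments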

We must not lose sight of the fact that the ability to move data to
and from the SRCC domain at LSU is not symmetric.  We are moving the
entire HDS datastream to seistan with latencies that are typically a
second or less.  Trying to feed that same data back to a host in the
unidata.ucar.edu domain results in latencies that routinely approach
6000 seconds.  This is clear proof that something is limiting the flow
out of LSU.  We have not been able to determine exactly where that
throttling occurs, but we can definitely say that it is at or near
LSU: in srcc.lsu.edu, in lsu.edu, or in the LSU connection to Abilene.
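
For scale: a 6000-second latency means a product is roughly 100 minutes
old when it arrives, versus about one second on the seistan-bound path.
A small sketch of how the two directions might be summarized; the
one-latency-per-line file format and the file names are assumptions for
illustration, not the actual ldmd log format:

    def mean_latency(path: str) -> float:
        # Average the per-product latencies (seconds, one per line).
        with open(path) as f:
            values = [float(line) for line in f if line.strip()]
        return sum(values) / len(values)

    inbound = mean_latency("seistan_from_unidata.lat")   # typically ~1 s
    outbound = mean_latency("unidata_from_seistan.lat")  # approaches 6000 s
    print(f"asymmetry factor: {outbound / inbound:.0f}x")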

Given the above, it is imperative that the LSU telecommunications folks
get seriously involved in finding where the problem is and fixing it.
Since we have absolutely no clout with LSU IT, the job of convincing
the appropriate folks unfortunately falls on your/your department's
shoulders.

We are hoping to have enough time tomorrow to run an scp test through
port 388 to see whether the throttling is port dependent.  Previous scp
tests do seem to show an asymmetry between a pull from
zero.unidata.ucar.edu to seistan.srcc.lsu.edu and a pull from
seistan.srcc.lsu.edu to zero.unidata.ucar.edu, but that asymmetry does
not come close to what we are seeing with LDM movement of data out of
srcc.lsu.edu.  Since scp normally uses a port other than 388, we are
actually hoping that an scp test using port 388 (with the LDM shut off,
of course) _will_ show the same throttling.  If it does, it will prove
that throttling is being done somewhere along the path, and that it is
probably intentional.  A positive scp result such as this should be
enough ammunition to get whoever is doing the throttling (packet
shaping) to finally admit that they are doing it and possibly turn it
off.
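
If arranging sshd on port 388 proves awkward, the same question can be
asked with a bare TCP probe.  Here is a hedged sketch (the host names
are from this thread; the payload size, control port, and everything
else are assumptions) that times a fixed-size pull over an arbitrary
port, to be run once with port 388 and once with a control port:

    import socket
    import time

    NBYTES = 50 * 1024 * 1024        # 50 MB test payload

    def serve(port: int) -> None:
        # Run on seistan.srcc.lsu.edu with the LDM stopped so port 388
        # is free (ports below 1024 require root).  Accept one
        # connection and send NBYTES of filler.
        with socket.create_server(("", port)) as listener:
            conn, _ = listener.accept()
            conn.sendall(b"\0" * NBYTES)
            conn.close()

    def fetch(host: str, port: int) -> float:
        # Run on zero.unidata.ucar.edu: pull NBYTES, return elapsed seconds.
        start = time.monotonic()
        with socket.create_connection((host, port)) as conn:
            remaining = NBYTES
            while remaining:
                data = conn.recv(65536)
                if not data:
                    break
                remaining -= len(data)
        return time.monotonic() - start

    # e.g. fetch("seistan.srcc.lsu.edu", 388) versus the same pull on a
    # control port such as 8388; a large difference implicates
    # port-based shaping.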

The struggle continues...

Tom