20020730: LDM latency problem - followup



>From: Mike Voss <address@hidden>
>Organization: SJSU
>Keywords: 200207302019.g6UKJL905705

Mike,

I am not the person who is actively looking at your problem, but I do
have some comments.

>This is  follow-up regarding my latency problem. Larry at UCSD was kind
>enough to allow me to feed another machine (methost8.met.sjsu.edu), which is
>beefier than rossby.met.sjsu.edu.

I believe you said that rossby is a Sun Ultra 60.  This is more than enough
machine to run the LDM ingesting any datastream you could imagine.

>However, that machine also has the same problems 
>keeping up, which leads me to believe that this problem is network related.

I believe that it is either related to your network connection, or to
how you are requesting the products.  It is well known that if you try
to ingest all products from all data feeds through a single ldmd.conf
request line, you will not be able to ingest as much data as you could
by splitting your feed requests.  Splitting the feed requests means that
you request different feeds with different ldmd.conf request lines that
specify different hosts, or different names for the same host.  The
simplest way to split feeds is to request some of the data using the
name of the upstream host in the request line, and other data using the
IP address of the upstream host.  For sites ingesting high-volume data
feeds like CONDUIT and NNEXRAD, further splitting is usually in order.
This is done by creating aliases for the upstream host in the site's
/etc/hosts file, and using those aliases to create additional request
lines for smaller amounts of data in ldmd.conf.
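
As a rough sketch of what such a split might look like (the alias names
aeolus-a and aeolus-b are made up for illustration, and the feed types
shown are only examples, since what you actually request depends on
what your upstream host allows you to have):

```
# /etc/hosts on the downstream machine (alias names are hypothetical)
132.239.114.58  aeolus.ucsd.edu  aeolus-a  aeolus-b

# ldmd.conf: one request line per feed, each naming the upstream
# host differently (name, IP address, alias)
request IDS|DDPLUS  ".*"  aeolus.ucsd.edu
request HDS         ".*"  132.239.114.58
request NNEXRAD     ".*"  aeolus-a
request CONDUIT     ".*"  aeolus-b
```

The LDM has to be restarted for changes to ldmd.conf to take effect.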

>After doing several more "traceroutes" I notice that it does hang once in a
>while as indicated by the *'s:
>
>Notice the *'s at stop 10 and 12 for example:
>------
>(su):~#traceroute aeolus.ucsd.edu
>traceroute to aeolus.ucsd.edu (132.239.114.58), 30 hops max, 40 byte packets
> 1  gateway-97.met.sjsu.edu (130.65.97.254)  3 ms  3 ms  2 ms
> 2  cc1omnicore.sjsu.edu (130.65.5.254)  7 ms  1 ms  1 ms
> 3  horst.sjsu.edu (130.65.11.1)  1 ms  1 ms  1 ms
> 4  cisco3620.sjsu.edu (130.65.11.5)  2 ms  2 ms  2 ms
> 5  QSV-SJSU-ATM.CSU.net (137.145.203.105)  3 ms  3 ms  3 ms
> 6  QSV-GSR--QSV-7513.CSU.net (137.145.202.161)  3 ms  3 ms  3 ms
> 7  QANH-GSR--QSV-GSR.CSU.net (137.145.202.118)  18 ms  19 ms  18 ms
> 8  QAnh-C2-GSR.CSU.net (137.145.11.26)  19 ms  19 ms  30 ms
> 9  USC--QAnh.POS.calren2.net (198.32.248.18)  28 ms  20 ms  20 ms
>10  UCSD--USC.POS.calren2.net (198.32.248.34)  18 ms *  18 ms
>11  198.32.248.186 (198.32.248.186)  23 ms  23 ms  25 ms
>12  * sio-rsm--ucsd-gw.ucsd.edu (132.239.255.145)  20 ms  19 ms
>13  aeolus.ucsd.edu (132.239.114.58)  24 ms  21 ms  28 ms
>
>The big question, does this in and of itself indicate a problem, which could
>be causing my latency?

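A '*' in the traceroute listing indicates that no reply to that probe
was received within the timeout period.  Routers often rate-limit or
give low priority to these replies, so an occasional '*' at an
intermediate hop is normal.  By itself, this should not be an
indication of a feed problem.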

I must say that the traceroutes from you to UCSD look very good.  You
should be able to feed a LOT of data with a connection this good.

So, I think that the solution to your problem is to split your feed
requests to your upstream host(s).

>Thanks again for you help,

Tom