
20030304: latencies on sunny and a small mod made to LDM 6.0



>From: Harry Edmon <address@hidden>
>Organization: University of Washington
>Keywords: 200303041956.h24Jus327228 IDD LDM-6

Hi Harry,

>Can you explain the very large latencies I see in the following graph:
>http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats?IDS|DDPLUS+sunny.atmos.washington.edu

Yes.

Oh, so you want the explanation ;-)

Here is what was going on:

o the LDM on the IDD relay node at SSEC, unidata2.ssec.wisc.edu, is
  running with a 2 GB queue, but it is not moving a lot of data.
  Therefore, there is some really old data in its queue.

o the LDM on thelma was not updating the timestamp for the latest product
  received for products that were rejected as being duplicates.

o unidata2.ssec.wisc.edu was rebooted last night (as per request from
  Mike Schmidt who had installed system patches)

o upon once again being able to get to unidata2, the LDM on thelma
  requested IDS|DDPLUS products newer than the last product from
  unidata2, for that feedtype, whose time it had recorded.  Since
  products rejected as duplicates had not been updating that time, the
  time used in the reconnect request was very old (like 20000
  seconds).  So, unidata2 started sending thelma very old products in
  IDS|DDPLUS even though thelma had received those products a long
  time ago.  The products were not rejected since their MD5 checksums
  did not match any product in the queue.  At the same time, thelma is
  receiving real-time products from our NOAAPORT ingest machine,
  jackie, so their latencies are near nothing.  The result of having
  old products from unidata2 and new products from jackie was the
  graph you note above, which looked like it was color filled.
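
To make the sequence above concrete, here is a rough Python sketch of
the bookkeeping.  It is a toy model and not the LDM source: the names
reconnect_time, signature, from_relay and already_in_queue are made up
for illustration, and the product times are invented so that the
backlog comes out at the 20000 seconds mentioned above.

```python
import hashlib

def signature(data: bytes) -> str:
    # Stand-in for a product's MD5 signature.
    return hashlib.md5(data).hexdigest()

def reconnect_time(offered, queue_signatures, update_on_duplicate):
    """Return the 'send me products newer than this' time that would be
    used on reconnect, after processing products offered by one upstream.

    offered: list of (data, creation_time) pairs, in arrival order.
    queue_signatures: signatures already in the local queue
                      (e.g. received earlier from another upstream).
    update_on_duplicate: False models the old behavior, True the mod.
    """
    last_time = 0
    for data, created in offered:
        sig = signature(data)
        duplicate = sig in queue_signatures
        if not duplicate:
            queue_signatures.add(sig)
        # Old behavior: the time only advances for accepted products.
        # With the mod: it also advances for rejected duplicates.
        if not duplicate or update_on_duplicate:
            last_time = created
    return last_time

now = 20100  # hypothetical "current time" in seconds
from_relay = [(b"product A", 100), (b"product B", 10000), (b"product C", now)]
# Products B and C already arrived from the real-time feed.
already_in_queue = {signature(b"product B"), signature(b"product C")}

old_backlog = now - reconnect_time(from_relay, set(already_in_queue), False)
new_backlog = now - reconnect_time(from_relay, set(already_in_queue), True)
print(old_backlog, new_backlog)  # prints: 20000 0
```

In the first call the recorded time stays at the last non-duplicate
product, so a reconnect would ask for about 20000 seconds of old data;
in the second call it tracks the newest product seen and nothing old
would be requested.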

Steve noticed the problem above this morning (before your email) from
latency plots for thelma:

http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats?IDS|DDPLUS+thelma.ucar.edu

He implemented a code modification and we rebuilt the LDM on thelma
(for good measure, we rebooted to install a boot time mod that Mike S.
had made).  After the LDM restart on thelma, the old products from
unidata2 were correctly not requested, so the latencies in the plots
dropped to zero.

On to another point:  The request lines you have in the ldmd.conf
file on sunny are incorrect.  You are apparently requesting products
from 'thelma.unidata.ucar.edu.' instead of 'thelma.ucar.edu'.  Two
things to note here:

1) leave the trailing '.' off of the upstream machine name
2) use thelma.ucar.edu and _not_ thelma.unidata.ucar.edu

Please make the above changes and stop and restart the LDM on sunny.
As soon as you have done this, let us know.  We want to disable
thelma.unidata.ucar.edu as a valid name, but we can't do this until you
have changed sunny (and any other machines you have that request data
from thelma.unidata.ucar.edu).  Thanks in advance.
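
If you have several machines to check, something like the following
Python sketch could flag both problems in a request line.  It is only
an illustration: check_request_line is a made-up helper, not part of
the LDM, and it assumes the usual whitespace-separated layout
"request <feedtype> <pattern> <host>" with no blanks inside the
pattern.  The sample lines are invented, not copied from sunny.

```python
def check_request_line(line, retired_hosts=("thelma.unidata.ucar.edu",)):
    """Return a list of problems found in one ldmd.conf request line."""
    fields = line.split()
    # Ignore comments, blank lines, and anything that is not a request.
    if len(fields) < 4 or fields[0].lower() != "request":
        return []
    host = fields[3]
    problems = []
    if host.endswith("."):
        problems.append("trailing '.' on upstream host")
    if host.rstrip(".").lower() in retired_hosts:
        problems.append("retired host name")
    return problems

bad = 'request IDS|DDPLUS ".*" thelma.unidata.ucar.edu.'
good = 'request IDS|DDPLUS ".*" thelma.ucar.edu'
print(check_request_line(bad))
print(check_request_line(good))
```

The first line should report both problems and the second none.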

One last thing.  There is a new ldm-6.0.tar.Z file in the pub/ldm/test
directory on ftp.unidata.ucar.edu.  This recut of LDM 6.0 contains the
mod to correctly update the time for duplicate products that are
rejected so that a request to an upstream site will not get really old
products after the upstream's LDM is restarted.  This mod shouldn't
affect anything on leaf node IDD machines, but it can have consequences
on relays as we learned from thelma this morning.  You may want to
regrab ldm-6.0.tar.Z and rebuild your distributions, but it is not
critical.

Tom

From: Harry Edmon <address@hidden>
Date: Tue, 4 Mar 2003 14:41:51 -0800 (PST)
To: address@hidden
Subject: Re: 20030304: latencies on sunny and a small mod made to LDM 6.0

sunny has been rebooted with the latest 6.0 and with the changes to ldmd.conf
you suggested.  I will now work on the other systems.

-- 
 Dr. Harry Edmon                        E-MAIL: address@hidden
 206-543-0547                                   address@hidden
 Dept of Atmospheric Sciences           FAX:    206-543-0308
 University of Washington, Box 351640, Seattle, WA 98195-1640