[ldm-users] 20080216: Nexrad Level 3 (cont.)

Hi Jeff,

re:
>I reset my time update to use ntpd I noticed the that at 0z my latency was 
>almost nil, then as the day progressed I would hit a high of almost 6
>seconds, and then fall right off to nil again at 0z

Excellent. This is how things should be.

>.. I found the root CRON and seen the ntpdate daemon running
>at 0z everyday so that solved that .. my system is now using ntpd.

Very good.
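
A note on why the once-a-day ntpdate run would produce that sawtooth:
reported latency is the difference between the time a product is
received locally and the creation time stamped on it upstream, so any
error in the local clock adds directly to the number. The sketch below
is only an illustration under assumed values: the function name and the
constant drift of 6 seconds per day are hypothetical, chosen to match
the figures quoted above.

```python
SECONDS_PER_DAY = 86400.0

def apparent_latency(true_latency, drift_per_day, seconds_since_sync):
    """Latency as it would be reported by a clock that drifts linearly.

    drift_per_day is how many seconds the local clock gains per day
    (an assumed value, not a measurement).
    """
    clock_error = drift_per_day * seconds_since_sync / SECONDS_PER_DAY
    return true_latency + clock_error

# Just after the 0z ntpdate run the clock is correct ...
just_synced = apparent_latency(0.0, 6.0, 0)
# ... and just before the next run it has drifted by the full amount.
end_of_day = apparent_latency(0.0, 6.0, 86400)
```

With ntpd disciplining the clock continuously, the clock-error term
stays near zero all day instead of ramping up and resetting.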

>I noticed that this arbitrary problem stopped happening at 21z the last time
>I have seen it.

Hmm...  This would explain why I saw nothing amiss during the
00:57-01:10Z timeframe when I was actively comparing ingest on your
machine against ingest on one of ours.

re: Are you relaying the Nexrad Level 3 data to other sites using the LDM?

>No I am not relaying Level 3 data, it is filed on this server and
>then I have 12 people that use a desktop Level 3 program to view the
>data.

OK, this was what I was trying to determine.  If your LDM was receiving
the data in a timely manner (i.e., very low latencies and no gaps),
this would imply that there was some sort of a problem in the
processing of the data out of your LDM queue.  The most typical reason
for slow processing out of an LDM queue is some sort of a bottleneck on
the machine like high disk I/O.

>I stop getting data, its like my upstream LDM's have stopped receiving it
>themselves, I get no error in the ldmd.log, just the data is not there,

The LDM log file, ldmd.log, will only have entries if there are
problems or if connections are switching.  If you see no entries in
ldmd.log, it means that your LDM/IDD connection is likely to be OK.  If
your impression of the "data is not there" is formed from looking at
data on disk (i.e., processed out of the LDM queue by an LDM FILE
action or LDM decoder), then it is not a direct indication that you are
not receiving the data.  In order to determine what your problem really
is, it is important to understand exactly where the hiccup is.
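
One way to locate the hiccup is to look for gaps separately in the
receipt times and in the times of the files on disk. The sketch below
assumes you already have the timestamps as a list; the function name,
the 10 minute threshold, and the sample times are all hypothetical.

```python
from datetime import datetime, timedelta

def find_gaps(timestamps, max_gap=timedelta(minutes=10)):
    """Return (start, end) pairs where consecutive times are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]

# Made-up times for illustration only
filed_times = [
    datetime(2008, 2, 15, 20, 50),
    datetime(2008, 2, 15, 20, 55),
    datetime(2008, 2, 15, 21, 15),
    datetime(2008, 2, 15, 21, 20),
]
gaps = find_gaps(filed_times)
```

If a gap shows up in the filed data but not in the receipt times, the
problem is in processing out of the queue, not in the feed.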

>the 
>only thing I see in the log for the 15-20 minute period is the EXP and
>DDPLUS data, and a handful of NEXRAD data, no where near the normal.

Please send a relevant snippet from the log file you are referring
to.

re: What is your pqact pattern for processing the Level
3 products you are receiving?

>NEXRAD ^SDUS[2357]. .... ([0-3][0-9])([0-2][0-9])([0-6][0-9]).*/p(...)(...)
>FILE -close data/gempak/nexrad/NIDS/\5/\4/\4_(\1:yyyy)(\1:mm)\1_\2\3

OK.  Please note that the LDM FILE action does NOT log the receipt
or filing of products.  This is one of the reasons that I adopted
the use of 'ldmfile.sh' for LDM FILEing.  'ldmfile.sh' FILEs a product
AND writes a log message of that filing.  'ldmfile.sh' can be FTPed
from the Unidata anonymous FTP server:

machine:   ftp.unidata.ucar.edu
user:      anonymous
pass:      admin@xxxxxxxxxxxxxxxxxxxx
directory: pub/ldm/scripts
file:      ldmfile.sh

The header of the script outlines how to use it.
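
To make the idea concrete, here is a minimal sketch of "file the
product and log that you did". It is NOT 'ldmfile.sh' and does not
reproduce its options or log format; the function name, the log
message, and the demo path are hypothetical.

```python
import io
import logging
import os
import tempfile

def file_and_log(data, path, logger):
    """Append data to path, then log the filing (hypothetical helper)."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "ab") as out:
        out.write(data)
    logger.info("filed %d bytes to %s", len(data), path)

# Demo using a throwaway directory and an in-memory log
log_text = io.StringIO()
logger = logging.getLogger("filing_sketch")
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(log_text))
logger.propagate = False

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "NIDS", "DTX", "N0R", "N0R_20080216_0057")
    file_and_log(b"fake product bytes", target, logger)
    filed_size = os.path.getsize(target)
```

With a log line per product, a quiet period in the log can be compared
directly against what was received.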

>of coarse the copy and paste took out the tabs but they are there.

OK.
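
The pattern quoted above can be sanity-checked outside of the LDM. The
sketch below uses Python's regular expressions, which may differ in
detail from the ones pqact uses. The sample product ID is hypothetical,
and the year and month are passed in by hand as a stand-in for the
(\1:yyyy) and (\1:mm) expansions.

```python
import re

# Regular expression copied from the pqact entry (pattern field only)
PATTERN = re.compile(
    r"^SDUS[2357]. .... ([0-3][0-9])([0-2][0-9])([0-6][0-9]).*/p(...)(...)"
)

def filed_path(product_id, year, month):
    """Return the path the FILE action would build, or None if there is no match."""
    m = PATTERN.search(product_id)
    if m is None:
        return None
    day, hour, minute, product, site = m.groups()
    return (f"data/gempak/nexrad/NIDS/{site}/{product}/"
            f"{product}_{year:04d}{month:02d}{day}_{hour}{minute}")

sample_id = "SDUS53 KDTX 160057 /pN0RDTX"   # hypothetical product ID
path = filed_path(sample_id, 2008, 2)
```

A product ID that returns None here would be expected to fall through
the pqact entry without being filed.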

>I doubt is its related or not, but when Gerry chimed in and
>said he would re-start bigbird, that is about the time the
>problem corrected itself.

Interesting, but: since you are redundantly requesting NNEXRAD data
from weather2.admin.niu.edu, and it gets its feed from
idd.unidata.ucar.edu, not bigbird.tamu.edu, it is unlikely that Gerry's
restart of bigbird.tamu.edu would have the effect seen.

>now that I look at this

>http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?NNEXRAD+level3.michiganwxsystem.net

>it is showing up the spikes in latency upwards of 800 seconds,
>as I compare it with Tylers machine

>http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?NNEXRAD+level3.allisonhouse.com

>he is showing the same odd spikes at the or close to the same times I am
>now its on bigbird as well
>http://www.unidata.ucar.edu/cgi-bin/rtstats/iddstats_nc?NNEXRAD+bigbird.tamu.edu
>so it looks like it was very hit and miss, and coming from before my 
>upstream hosts ...

The spikes you are seeing are superimposed on a background of
consistently low latencies from other upstream IDD nodes.  It is most
likely that those spikes are a result of "recirculation": a product
that has already been received is received again from a different,
slower upstream IDD node, and it is accepted as new because the
original copy has already been scoured out of your LDM queue.
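
A toy model of how recirculation produces a latency spike is sketched
below. It is a simplification under stated assumptions: the queue is a
plain dictionary, products are scoured purely by age, and the function
name and the 600 second residency time are hypothetical. The real LDM
queue is managed differently.

```python
def receive(queue, signature, created, now, max_age):
    """Accept a product unless an identical one is still in the queue.

    Returns the latency in seconds if accepted, or None if the product
    is rejected as a duplicate.
    """
    # Scour: drop products that have been in the queue longer than max_age
    for sig in [s for s, arrived in queue.items() if now - arrived > max_age]:
        del queue[sig]
    if signature in queue:
        return None
    queue[signature] = now
    return now - created

queue = {}
# Product created at t=0, delivered by the fast upstream at t=2
first = receive(queue, "prodA", created=0, now=2, max_age=600)
# Slow upstream delivers it at t=300: still in the queue, so rejected
second = receive(queue, "prodA", created=0, now=300, max_age=600)
# Slow upstream delivers it at t=800: original was scoured, so accepted
third = receive(queue, "prodA", created=0, now=800, max_age=600)
```

The third delivery is logged with a latency of 800 seconds even though
the product was first received within 2 seconds of its creation.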

>Or am I not reading these correctly ??

The high latency spikes are usually caused by a "recirculation"
phenomenon, and so are not associated with the "burstiness" you were
reporting.

Cheers,

Tom
--
+----------------------------------------------------------------------------+
* Tom Yoksas                                            UCAR Unidata Program *
* (303) 497-8642 (last resort)                                 P.O. Box 3000 *
* yoksas@xxxxxxxxxxxxxxxx                                  Boulder, CO 80307 *
* Unidata WWW Service                            http://www.unidata.ucar.edu/*
+----------------------------------------------------------------------------+

