[Support #HQC-205846]: ldm bug?



> I seem to have found an LDM bug, or at least a performance problem.  I am
> trying to connect to a UTAH server that is not responding.   In
> addition, I currently have none of their products in my queue.  The two
> processes that are connecting to this server are chewing up a lot of
> CPU.  I believe what is happening is that every time it tries to connect
> it is spinning through my 4 GB queue to find out what the last product
> is.  This is slowing down the rest of my ldm processing on this machine
> (freshair).  I "niced" the two processes to +19 so they would have a
> smaller effect. Attached are the log entries from my server.  Here are
> the request lines from the ldmd.conf file:
> 
> request EXP             mesowest.dat.gz         cirp.met.utah.edu
> request EXP             mesowest_csv.tbl.gz     cirp.met.utah.edu
> 
> Feb 28 07:52:25 freshair2 cirp[6329] NOTE: Starting Up(6.4.4): 
> cirp.met.utah.edu:388 20060228125223.800 TS_ENDT {{EXP,  "mesowest.dat.gz"}}

The above shows the initial selection-criteria.  Note that the start-time
is 12:52:23.800 on 2006-02-28 and that the message was printed at 07:52:25
on the same day.  The initial start-time SHOULD be one hour before the
current time rather than about five hours later.  Is something unusual
going on with your clock or with the logging timestamp?  Alternatively,
are you using a negative time-offset ("-o" option) or maximum-latency
("-m" option) when the LDM is started?
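
As a quick check of the gap between the two timestamps quoted above, here
is a minimal Python sketch.  The helper names are made up for illustration
and are not part of the LDM; the year is supplied by hand because the
syslog-style timestamp does not include one.  An offset that is very close
to a whole number of hours might point at a time-zone difference between
the two clocks, but that is only a guess.

```python
from datetime import datetime, timedelta


def parse_ldm_time(stamp: str) -> datetime:
    """Parse a product-class timestamp such as '20060228125223.800'."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S.%f")


def parse_syslog_time(stamp: str, year: int) -> datetime:
    """Parse a syslog-style timestamp such as 'Feb 28 07:52:25'.

    The year is an assumption supplied by the caller; it is not logged.
    """
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


start = parse_ldm_time("20060228125223.800")      # start-time in the request
logged = parse_syslog_time("Feb 28 07:52:25", 2006)  # time the NOTE was logged
offset = start - logged

print("start-time minus log time:", offset)
print("start-time is in the future relative to the log:", offset > timedelta(0))
```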

> Feb 28 07:53:56 freshair2 cirp[6329] INFO: No matching data-product in 
> product-queue

The downstream LDM process then tried to find the most-recent product in
the product-queue that matched the selection criteria.  It found nothing;
consequently, the initial start-time remained unmodified.

If the selection-criteria start-time is in the future, then no matching
product will be found in the product-queue and the downstream LDM will,
consequently, search the entire queue.  I think this is a problem with how
the start-time was set rather than with the LDM per se (unless the LDM is
setting the start-time incorrectly).

I'll see if I can modify the code so that if it's impossible to find a
matching product, then the search will terminate immediately.
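
To illustrate the idea (not the actual LDM code, which is written in C and
organized differently), here is a rough Python sketch.  The Product class,
the latest_match() function, and the early_exit flag are all hypothetical,
and the queue is modeled as a simple list assumed to be ordered from oldest
to newest insertion time.  With the guard disabled, a start-time in the
future causes every product to be examined; with it enabled, the search
returns without examining any.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple


@dataclass
class Product:
    """Hypothetical stand-in for a product-queue entry."""
    inserted: datetime  # time the product was inserted into the queue
    feedtype: str
    ident: str


def latest_match(queue: Sequence[Product], start: datetime, feedtype: str,
                 pattern: str, early_exit: bool = True
                 ) -> Tuple[Optional[Product], int]:
    """Return (newest matching product or None, number of products examined).

    `queue` is assumed to be ordered from oldest to newest insertion time.
    """
    # Guard: if the start-time is later than the newest product, nothing in
    # the queue can satisfy the time constraint, so skip the scan entirely.
    if early_exit and (not queue or queue[-1].inserted < start):
        return None, 0

    regex = re.compile(pattern)
    examined = 0
    for product in reversed(queue):  # newest first
        examined += 1
        if (product.inserted >= start and product.feedtype == feedtype
                and regex.search(product.ident)):
            return product, examined
    return None, examined


now = datetime(2006, 2, 28, 7, 52, 25)
# Five non-matching products, oldest first (50, 40, ..., 10 minutes old).
queue = [Product(now - timedelta(minutes=m), "EXP", f"other_{m}.dat")
         for m in range(50, 0, -10)]
future_start = now + timedelta(hours=5)

print(latest_match(queue, future_start, "EXP", "mesowest.dat.gz",
                   early_exit=False))
print(latest_match(queue, future_start, "EXP", "mesowest.dat.gz"))
```

On a multi-gigabyte queue, the difference between those two cases would be
the difference between walking every product on each reconnection attempt
and returning right away.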

> Feb 28 07:53:56 freshair2 cirp[6329] NOTE: LDM-6 desired product-class: 
> 20060228125356.284 TS_ENDT {{EXP,  "mesowest.dat.gz"}}
> Feb 28 07:53:56 freshair2 cirp[6329] ERROR: Terminating due to LDM failure; 
> nullproc_6 failure to cirp.met.utah.edu; RPC: Unable to receive; errno = 
> Connection reset by peer

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: HQC-205846
Department: Support LDM
Priority: Normal
Status: On Hold