
[LDM #WOM-274153]: Reproducible bug in LDM 6.13.10



Hi Gilbert,

[I merged your emails into the relevant support-email thread.]

> OK, due to a satellite dish issue, we had to move our NOAAport data feeds
> to backups
> on ldm01.allisonhouse.com this Saturday morning. Here is a step-by-step
> process of what I did:
> 
> Edited ldmd.conf on ldm01.allisonhouse.com to point away from
> bird01.allisonhouse.com to backups using pico.
> #NotaVIfan
> 
> # Text data. This will also grab the NWWS feed from bird01 if the
> # Novra config is set to receive it.
> #####request IDS|DDPLUS "(.*)" bird01.allisonhouse.com
> request IDS|DDPLUS ".*" idd.aos.wisc.edu
> request IDS|DDPLUS "(.*)" atlas.niu.edu
> 
> Normally, on ldm01, bird01 is uncommented, and idd.aos.wisc.edu is our
> second primary.

OK.
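
A quick way to double-check that a new upstream is willing and able to feed
you is notifyme(1). Run as the LDM user on ldm01, something like the following
should list matching products as the upstream site receives them (the one-hour
offset is just an example):

    # Ask idd.aos.wisc.edu which IDS|DDPLUS products it has seen in the last hour
    notifyme -vl- -h idd.aos.wisc.edu -f "IDS|DDPLUS" -o 3600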

> As user LDM on ldm01:
> 
> ldmadmin stop
> ldmadmin delqueue
> ldmadmin clean

I take it there were no problems with these commands.
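
If you want to sanity-check the product-queue after a delete-and-restart
cycle, pqmon(1) should print a short usage summary of the (re)created queue;
this assumes the default queue path:

    # As the LDM user: print product-queue usage statistics and exit
    pqmon -l-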

> As user root:
> systemctl start ldm

OK.
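
It can also be worth verifying that systemd and the LDM agree after a start;
for example (the unit name "ldm" is taken from your command above):

    # Confirm the systemd unit is active
    systemctl status ldm
    # As the LDM user: print the LDM configuration parameters
    ldmadmin config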

> I log into nfs01.allisonhouse.com and note no log entries showing anything
> happened to the feed from ldm01.

There were no messages from the downstream LDM processes on nfs01 about the 
connection to the upstream LDM process being broken?
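
If there were, they should show up in the LDM log on nfs01. Assuming the
default log-file location for LDM 6.13, something like this should surface
any recent connection-related messages:

    # Last few connection-related messages in the LDM log (default location assumed)
    grep -iE 'disconnect|connection|refused|reset' ~ldm/var/logs/ldmd.log | tail -n 50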

> But, at the time ldm01's LDM either shuts down or comes back online, the LDM
> on NFS01 continues to ingest, but immediately stops writing to disk.

What are the pqact(1) actions that aren't writing to disk? FILE or PIPE?
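
For reference, the two forms look roughly like this in pqact.conf (generic
illustration only: the fields must be tab-separated, and the output path and
decoder path here are hypothetical):

    # FILE action: append matching text products to a file
    IDS|DDPLUS    ^(.*)    FILE    data/text/all_text_products
    # PIPE action: feed matching products to a decoder's standard input
    IDS|DDPLUS    ^(.*)    PIPE    /usr/local/bin/some-decoder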

> There are NO log entries on nfs01 showing anything out of the
> ordinary; everything appears to be normal. But, nothing is writing to disk,
> even though an "ldmadmin watch" on nfs01 shows all feeds apparently coming
> in just fine.

Including IDS|DDPLUS?
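
You can see what the local LDM is actually receiving with notifyme(1) on
nfs01; with no -h option it talks to the local server:

    # On nfs01: list IDS|DDPLUS products received in the last 5 minutes
    notifyme -vl- -f "IDS|DDPLUS" -o 300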

> But, this command in LDM's crontab on nfs01:
> 
> 1,6,11,17,21,26,32,36,41,47,51,56 * * * * /bin/bash -l -c 'wasReceived -f
> "WMO|NIMAGE|NGRID|NEXRAD3" -o 180' || /bin/mail -s 'NOAAPORT data has not
> been received in the last 3 minutes on nfs01' address@hidden,
> address@hidden,address@hidden,address@hidden
> </dev/null
> 
> Gets me the dreaded alert via text and emails:
> 
> NOAAPORT data has not been received in the last 3 minutes on nfs01

This is inconsistent with your assertion that an "ldmadmin watch" on nfs01 
shows all feeds continuing to arrive.
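
The next time this happens, it might help to run the same check by hand while
"ldmadmin watch" is claiming data is arriving, using the same flags as your
cron entry:

    # Manual cross-check of the cron test (exit 0 means a matching product arrived)
    wasReceived -f "WMO|NIMAGE|NGRID|NEXRAD3" -o 180 && echo received || echo "nothing in the last 180 s"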

> And the other feeds eventually chirp as well, as they are on longer alert
> times to allow for NWS feed issues and glitches. Again, no log entries on
> NFS01 show anything wrong or even that it cannot connect to ldm01's LDM. I
> double checked...this is just weird, Steve. Both are running CentOS 7, and
> are fully patched. Here is a partial NOAAport entry on NFS01.
> 
> # Text data. Will also receive NWWS when Novra is configured on bird01 to
> do so.
> request IDS|DDPLUS "(.*)" ldm-central1-b.c.tough-volt.internal
> #request IDS|DDPLUS ".*" bird01.allisonhouse.com
> request IDS|DDPLUS ".*" idd.aos.wisc.edu
> 
> ldm-central is an internal network name for ldm01. So, the feed obviously
> successfully switches over to idd.aos.wisc.edu, but the queue is corrupt.
> So odd.

So the downstream LDM on nfs01 that requests IDS|DDPLUS from idd.aos.wisc.edu 
stops inserting such products into the queue?
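
One way to check the next time this happens is pqcat(1) on nfs01; if matching
products are being inserted, something like this should log one line per
product (the product data itself is discarded):

    # Log any IDS|DDPLUS products inserted into the queue in the last 10 minutes
    pqcat -vl- -f "IDS|DDPLUS" -o 600 > /dev/null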

> I noticed you have a beta out. Do you think the bugfixes might patch this
> issue?

Possibly, but I don't see how.

Are the clocks on all the systems correct?
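
On CentOS 7, either of these should show whether the clocks are synchronized
(chronyd is the usual time daemon there):

    # Check the system clock and NTP synchronization status
    timedatectl
    chronyc tracking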

> One more thing: Knowing this was likely going to happen, I was logged onto
> NFS01 and looked for unusual things. This time, the pqact requests were
> one-to-one; no duplicates that I saw, anyway.

When this happens, can you attach gdb(1) to one of the pqact(1) processes that 
isn't writing anything to disk and send me a stack trace?
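
Something along these lines should do it (the "ldm" account name is from your
description; replace <PID> with an actual process ID):

    # On nfs01, while the problem is occurring
    pgrep -u ldm pqact            # list the pqact process IDs
    gdb -p <PID>                  # attach to one of the stuck pqact processes
    (gdb) thread apply all bt     # capture the stack trace(s)
    (gdb) detach
    (gdb) quit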

Regards,
Steve Emmerson

Ticket Details
===================
Ticket ID: WOM-274153
Department: Support LDM
Priority: High
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.