
[LDM #HLR-124311]: ldm drops



Hi Waldenio,

Steve is at home today (and tomorrow is a holiday: US Independence Day), so I
will try to help out.

re:
> Here I have frequent LDM errors,

The version of the LDM that you are running, LDM-6.4.4, was overly verbose in
some of its logging.  The amount of 'notice' (informational) logging was reduced in
newer versions like LDM-6.4.5 and the beta release LDM-6.4.6.4.  If you decide
to build, install, and run a newer version of the LDM, please use the beta.  It
is located in the pub/ldm/beta subdirectory of anonymous FTP:

machine:   ftp.unidata.ucar.edu
user:      anonymous
pass:      address@hidden
directory: pub/ldm/beta
file:      ldm-6.4.6.4.tar.Z

> and the server drops with no explanation

If you mean that your LDM exits, then this is a problem.  Your log file
listing shows that someone/something told the LDM to exit; please read on...

> Some SIGINTs appears on the logs.

The SIGINTs I see in the log file listing you sent are expected since someone
issued a SIGTERM to the lead rpc.ldmd process.  A SIGTERM is only sent to the
lead rpc.ldmd process when someone stops the LDM (using 'ldmadmin stop'), or
when someone/something sends it the TERM signal directly
(using 'kill -TERM <pid of lead rpc.ldmd>').  Our experience in these types
of cases is that someone there ran an 'ldmadmin stop' or 'ldmadmin restart'.

> This is happening every month. I didnt had this problem with earlier
> versions of LDM like 6.3

One cause of the new log messages you are seeing is LDM-6.4.4's verbosity
in logging.

Another, which you will see with LDM-6.4.x, is feed requests exiting in order
to change state:

ALTERNATE -> PRIMARY
 or
PRIMARY -> ALTERNATE

This occurs when one is requesting the exact same set of data from more than
one upstream feed host.  It is normal and should be expected.

By the way, toplevel IDD relay nodes should have enough bandwidth to request
all data _except_ CONDUIT redundantly.  We believe it is best for the
toplevel relays to force those data requests to act in PRIMARY mode.  This
is done by making the regular expression for the data pattern unique.  Here
is an example of what we recommend:

request IDS|DDPLUS   "(.*)"  idd.unidata.ucar.edu
request IDS|DDPLUS   ".*"    idd.cise-nsf.gov

The parentheses ('(' and ')') around the '.*' in the first request line make
the request different from the second.  This is how we do data ingest on
the toplevel relay nodes that we maintain (e.g., idd.unidata.ucar.edu and
idd.cise-nsf.gov).
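
To illustrate why the parentheses trick works, here is a minimal sketch in
Python.  It only shows that the two patterns differ as text while selecting the
same products.  Python's 're' module stands in for the LDM's own regular
expression handling, and the sample product identifiers and the helper
same_selection() are made up for illustration; they are not taken from your
logs or from the LDM source.

```python
import re

# Patterns copied from the two request lines above.
PATTERN_A = "(.*)"   # used in the request to idd.unidata.ucar.edu
PATTERN_B = ".*"     # used in the request to idd.cise-nsf.gov

# Hypothetical product identifiers, for illustration only.
SAMPLE_IDS = [
    "SAUS70 KWBC 031200",
    "FTUS43 KBOU 031130",
    "",
]

def same_selection(pat_a, pat_b, ids):
    """Return True if both patterns accept exactly the same identifiers."""
    a = re.compile(pat_a)
    b = re.compile(pat_b)
    return all(bool(a.search(i)) == bool(b.search(i)) for i in ids)

# The pattern strings are not identical ...
patterns_differ = PATTERN_A != PATTERN_B
# ... yet both accept every sample identifier.
selection_same = same_selection(PATTERN_A, PATTERN_B, SAMPLE_IDS)

print("patterns differ as text:", patterns_differ)
print("same products selected: ", selection_same)
```

In other words, the grouping parentheses change nothing about what is matched;
they only make the two request lines textually distinct.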

> There is an explanation for these errors

Some are ignorable due to LDM-6.4.4's logging verbosity.  Some are a reflection
of the state change in feed requests to upstream hosts.  Again, the SIGINTs that
follow the SIGTERM are expected when one stops/restarts the LDM.

Cheers,

Tom
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: HLR-124311
Department: Support LDM
Priority: Normal
Status: Closed