
[IDD #OMZ-874415]: GOESR ldm feed



Hi Carol,

Steve, Mike and I just finished discussing the situation that you ran
into on typhoon... very strange indeed, and something that we have
never seen before.

The following is more "for the files" than anything else:

- the problem encountered was an ldmd process trying to write a
  new product into the LDM queue but being unable to do so because
  all of the products in the queue were locked

  The 6.13.6 ldmd process exited when it was unable to write the
  product, and the LDM kept running.  Steve assured us that this
  was valid behavior.

- we (I) recommended increasing the LDM queue size from 500M to
  8G, but noted that this was to keep in line with the recommendation
  that about an hour of data be kept in the local queue

  This recommendation allows the LDM to do better duplicate product
  detection and elimination.

  NB: we do NOT think that using a 500M queue was the cause of the
  problem seen!

- the only scenario that makes any sense to us given the symptoms
  we see represented by LDM log file messages is that somehow the
  500M queue got corrupted, and this was manifested in locks on
  products in the queue not being released

  The creation of a new queue, 8G in your case, should have remedied
  the problem in the 500M queue if there was one.  Your reporting that
  you ran into the same problem sometime around 15:46 yesterday afternoon
  _after_ making a new, 8G queue, does not support the guesstimate that
  the problem was a corrupted queue.

  Question:

  - is it possible that the order in which the new queue was made was
    different from what you reported?

    I.e., is it possible that the LDM was restarted while the existing 500M
    queue was still being used, and the problem was run into again at more
    or less 15:46?  We ask this since we are trying to reconcile the LDM
    restarts we see reflected in the LDM log files in ~ldm/var/logs.
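
For the files, the "about an hour in the local queue" guideline mentioned
above can be turned into a rough sizing estimate.  The following is only a
back-of-the-envelope sketch in Python: the ingest rate used is a made-up
placeholder, not a measurement from typhoon, the function names are
hypothetical, and the arithmetic ignores any per-product overhead in the
queue.

```python
import math


def queue_size_bytes(rate_bytes_per_sec, retention_sec=3600):
    """Bytes needed to hold `retention_sec` seconds of data at a steady rate."""
    return math.ceil(rate_bytes_per_sec * retention_sec)


def retention_seconds(queue_bytes, rate_bytes_per_sec):
    """Approximate age of the oldest product a queue of this size can hold."""
    return queue_bytes / rate_bytes_per_sec


if __name__ == "__main__":
    # ASSUMPTION: 2 MB/s is a hypothetical ingest rate chosen for illustration.
    rate = 2_000_000

    needed = queue_size_bytes(rate)
    print(f"about {needed / 1e9:.1f} GB needed for one hour at {rate} B/s")

    held = retention_seconds(500_000_000, rate)
    print(f"a 500M queue would hold about {held / 60:.1f} minutes at that rate")
```

Substituting the ingest rate actually observed on the machine gives a
first-order check on whether a given queue size holds roughly an hour of
data.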

Closing comment:

- the LDM registry entry for <datadir-path> in the <pqact> section
  is currently:

  /usr/local/ldm/var/data

  <datadir-path> will be the current working directory for 'pqact'
  invocations.  This is why the log file for the 'grbfile.sh' process
  is located in /usr/local/ldm/var/data/logs given that the
  log file specified in the ~ldm/etc/pqact_satellite.conf pattern-action
  file is the relative 'logs/grbfile.log'.  If you want the log files
  for the 'grbfile.sh' process to be put in the same directory as the
  LDM log files, ~ldm/var/logs, you will need to either change the
  <datadir-path> directory to /usr/local/ldm/var, or change the values
  specified in pqact_satellite.conf.

  I find it most useful if all of the LDM related log files are located
  in the same directory, so I recommend modifying either the actions
  in pqact_satellite.conf or changing <datadir-path> in the LDM registry.

  NB: if you change definitions in the LDM registry, the LDM will need
  to be restarted for the change(s) to become active.
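
To illustrate how the relative log path above ends up where it does, here
is a small Python sketch of the path resolution.  It is not LDM code: the
helper name is hypothetical, and it only mimics how a relative path is
interpreted against a process's current working directory.

```python
import posixpath


def resolve_log_path(working_dir, log_path):
    """Return where `log_path` lands for a process running in `working_dir`.

    Relative paths are joined to the working directory; absolute paths
    are returned unchanged (apart from normalization).
    """
    return posixpath.normpath(posixpath.join(working_dir, log_path))


if __name__ == "__main__":
    relative_log = "logs/grbfile.log"  # as given in pqact_satellite.conf

    # Current <datadir-path> setting from the registry
    print(resolve_log_path("/usr/local/ldm/var/data", relative_log))

    # After changing <datadir-path> to /usr/local/ldm/var
    print(resolve_log_path("/usr/local/ldm/var", relative_log))

    # Alternative: an absolute path in the pattern-action file
    print(resolve_log_path("/usr/local/ldm/var/data",
                           "/usr/local/ldm/var/logs/grbfile.log"))
```

The last case shows the other option mentioned above: specifying an
absolute path in pqact_satellite.conf makes the log location independent
of <datadir-path>.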

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: OMZ-874415
Department: Support IDD
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.