
20010212: ldm - low priority



>From: "Kopp, Fred" <address@hidden>
>Organization: South Dakota School of Mines
>Keywords: 200102121843.f1CIhRL01301 McIDAS LDM ldmfail cron

Fred,

>I have noticed that when our primary site dies and ldmfail switches us to
>the failover site, the mcidas data crashes.  The log files before and after
>are enclosed below. The failover occurred at 15:20:10. I don't know if I
>have done something wrong or what. Anyway, to cure the problem, the product
>queue must be deleted and remade. (our upstream site is papagayo.unl.edu and
>the failover is weather.admin.niu.edu)

>Feb 11 07:16:07 5Q:squall pnga2area[12248]: Starting Up
>Feb 11 07:16:07 5Q:squall pnga2area[12248]: unPNG::   135459    309200 2.2826
>Feb 11 07:16:07 5Q:squall pnga2area[12248]: Exiting
>Feb 11 07:17:30 5Q:squall pnga2area[12295]: Starting Up
>Feb 11 07:17:30 6Q:squall pnga2area[12295]: output file pathname: /opt/ldm/data/gempak/images/sat/GOES-10/8km/WV/WV_20010211_0700
>Feb 11 07:17:30 5Q:squall pnga2area[12295]: unPNG::   144492    613376 4.2451
>Feb 11 07:17:30 5Q:squall pnga2area[12295]: Exiting
 ...

>Feb 11 15:20:16 5Q:squall DCMETR[2115]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCUAIR[2117]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCMSFC[12164]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCGRIB2[18010]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCACFT[5162]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCLSFC[10499]: Terminate Signal
>Feb 11 15:20:33 5Q:squall rpc.ldmd[18079]: Starting Up (built: Jan 23 2001 15:31:39)
>Feb 11 15:20:33 5Q:squall weather[18164]: run_requester: Starting Up: weather.admin.niu.edu
>Feb 11 15:20:33 5Q:squall weather[18164]: run_requester: 20010211151523.484 TS_ENDT {{IDS|DDPLUS,  ".*"},{HDS,  "^[YZ].[RQU].*/mETA"},{HDS,  "^Y.[AI]... KWBH"},{HDS,  "^[YZ].Q.*/mRUC"},{MCIDAS,  "^pnga2area Q[01]"}}
>Feb 11 15:20:34 5Q:squall weather[18164]: FEEDME(weather.admin.niu.edu): OK
>Feb 11 15:20:34 5Q:squall pqact[18206]: Starting Up
>Feb 11 15:20:34 5Q:squall pqbinstats[18166]: Starting Up (18079)
>Feb 11 15:20:36 5Q:squall localhost[17885]: Connection from localhost
>Feb 11 15:20:36 5Q:squall localhost[17885]: Connection reset by peer
>Feb 11 15:20:36 5Q:squall localhost[17885]: Exiting
>Feb 11 15:20:42 3Q:squall pqact[18206]: pbuf_flush 10: time elapsed 4.179174
>Feb 11 15:22:33 3Q:squall pqact[18206]: pbuf_flush (12) write: Broken pipe
>Feb 11 15:22:33 3Q:squall pqact[18206]: pipe_dbufput: -closepnga2area-aSATANNOT-bSATBAND/opt/ldm/data/gempak/images/sat/SOUNDER/14km/CTP/CTP_20010211_1400 write error
 ...

What is going on is that the PATH in effect after ldmfail runs does not
contain the directory where the ldm-mcidas decoder, pnga2area, "lives".
Notice how the first failure above does not mention 'pnga2area'.
Presumably, you are running ldmfail the way other users do: through cron.
The set of environment variables in effect for jobs run through cron
is sparse.  Here is a snippet of the man page for crontab from a Sun
Solaris system:

     The shell is invoked from your $HOME directory with an  arg0
     of  sh.   Users  who  desire to have their .profile executed
     must explicitly do so in the crontab file.  cron supplies  a
     default environment for every shell, defining HOME, LOGNAME,
     SHELL(=/bin/sh), TZ, and PATH.  The default  PATH  for  user
     cron  jobs  is  /usr/bin;  while  root  cron jobs default to
     /usr/sbin:/usr/bin.   The  default  PATH  can  be   set   in
     /etc/default/cron; see cron(1M).
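
One way to see the effect for yourself is to look up the decoder under a
stripped-down environment that approximates what cron provides.  This is
only a sketch: the PATH value is the user default quoted in the man page
excerpt above, cron_lookup is a made-up helper name, and it assumes that
env -i and a /bin/sh that understands "command -v" are available on your
system.

```shell
# Approximate cron's sparse environment: clear every variable, then set
# only HOME and the default user PATH quoted in the man page excerpt.
# cron_lookup is a hypothetical helper, not part of the LDM.
cron_lookup() {
    env -i HOME="$HOME" PATH=/usr/bin /bin/sh -c \
        'command -v "$1" >/dev/null 2>&1 && echo found || echo "not found"' \
        sh "$1"
}

# Ask whether the decoder would be found by a job started from cron.
cron_lookup pnga2area
```

If the decoder is found from your login shell but not here, the
difference is the PATH, which matches the failure described above.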

So, you have a couple of options:

o edit ldmfail to more fully set the PATH to include all directories
  needed to run decoders; this is not a great idea, since a new version
  of ldmfail comes out with each new version of the LDM and your edits
  would have to be redone

o forcibly set PATH in your crontab entry.  Here are skeletal examples for
  doing this, first for C shell users and then for Bourne shell users:

  C shell (cron runs entries under sh, so csh has to be invoked explicitly)

* * * * * csh -c 'source .cshrc; ldmfail...'

  Bourne shell

* * * * * . ./.profile; ldmfail...

o finally, you could edit your ~ldm/etc/pqact.conf file and be more
  explicit about where to find decoders.  For instance, if pnga2area
  can be found in the ~ldm/decoders directory, you could change
  pnga2area invocations from:

        PIPE    -close
        pnga2area ...

  to

        PIPE    -close
        decoders/pnga2area

The choice of tactic is up to you, but I would recommend the last
option as the easiest to live with, since pqact.conf entries do
not usually need to be changed when installing new LDM releases.
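
Before relying on the pqact.conf change, it may be worth checking that
the decoder resolves the way the entry is written.  The sketch below is
an illustration only: decoder_resolves is a made-up helper, and it
assumes pqact is started from the LDM home directory, so that a relative
entry such as decoders/pnga2area is looked up there.

```shell
# Report whether a decoder, as written in pqact.conf, would resolve.
# decoder_resolves is a hypothetical helper, not part of the LDM.
#   $1 = directory pqact is started from (assumed: the LDM home directory)
#   $2 = decoder as written in pqact.conf, e.g. decoders/pnga2area
decoder_resolves() {
    case "$2" in
        */*) ( cd "$1" && [ -x "$2" ] ) ;;       # contains a slash: PATH is not searched
        *)   command -v "$2" >/dev/null 2>&1 ;;  # bare name: depends on PATH
    esac
}

if decoder_resolves "$HOME" decoders/pnga2area; then
    echo "decoders/pnga2area resolves without help from PATH"
else
    echo "decoders/pnga2area was not found under $HOME"
fi
```

Run it as the ldm user so that $HOME is the LDM home directory.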

Tom Yoksas

>From address@hidden Mon Feb 12 13:31:24 2001
>Subject: RE: 20010212: ldm - low priority 

Tom,
   Thank you. I stole the pqact.conf files from Unidata and made only
a few changes. I see that decoders/Xdecoder is what the other decoders
use, so I made that change. If this doesn't work I will let you know.

Fred