
20010212: ldm - low priority



>From: "Kopp, Fred" <address@hidden>
>Organization: South Dakota School of Mines
>Keywords: 200102121843.f1CIhRL01301 McIDAS LDM ldmfail cron

Fred,

>I have noticed that when our primary site dies and ldmfail switches us to
>the failover site, the mcidas data crashes.  The log files before and after
>are enclosed below. The failover occurred at 15:20:10. I don't know if I
>have done something wrong or what. Anyway, to cure the problem, the product
>queue must be deleted and remade. (our upstream site is papagayo.unl.edu and
>the failover is weather.admin.niu.edu)

>Feb 11 07:16:07 5Q:squall pnga2area[12248]: Starting Up
>Feb 11 07:16:07 5Q:squall pnga2area[12248]: unPNG::   135459    309200 2.2826
>Feb 11 07:16:07 5Q:squall pnga2area[12248]: Exiting
>Feb 11 07:17:30 5Q:squall pnga2area[12295]: Starting Up
>Feb 11 07:17:30 6Q:squall pnga2area[12295]: output file pathname: /opt/ldm/data/gempak/images/sat/GOES-10/8km/WV/WV_20010211_0700
>Feb 11 07:17:30 5Q:squall pnga2area[12295]: unPNG::   144492    613376 4.2451
>Feb 11 07:17:30 5Q:squall pnga2area[12295]: Exiting
 ...

>Feb 11 15:20:16 5Q:squall DCMETR[2115]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCUAIR[2117]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCMSFC[12164]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCGRIB2[18010]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCACFT[5162]: Terminate Signal
>Feb 11 15:20:16 5Q:squall DCLSFC[10499]: Terminate Signal
>Feb 11 15:20:33 5Q:squall rpc.ldmd[18079]: Starting Up (built: Jan 23 2001 15:31:39)
>Feb 11 15:20:33 5Q:squall weather[18164]: run_requester: Starting Up: weather.admin.niu.edu
>Feb 11 15:20:33 5Q:squall weather[18164]: run_requester: 20010211151523.484 TS_ENDT {{IDS|DDPLUS,  ".*"},{HDS,  "^[YZ].[RQU].*/mETA"},{HDS,  "^Y.[AI]... KWBH"},{HDS,  "^[YZ].Q.*/mRUC"},{MCIDAS,  "^pnga2area Q[01]"}}
>Feb 11 15:20:34 5Q:squall weather[18164]: FEEDME(weather.admin.niu.edu): OK
>Feb 11 15:20:34 5Q:squall pqact[18206]: Starting Up
>Feb 11 15:20:34 5Q:squall pqbinstats[18166]: Starting Up (18079)
>Feb 11 15:20:36 5Q:squall localhost[17885]: Connection from localhost
>Feb 11 15:20:36 5Q:squall localhost[17885]: Connection reset by peer
>Feb 11 15:20:36 5Q:squall localhost[17885]: Exiting
>Feb 11 15:20:42 3Q:squall pqact[18206]: pbuf_flush 10: time elapsed 4.179174
>Feb 11 15:22:33 3Q:squall pqact[18206]: pbuf_flush (12) write: Broken pipe
>Feb 11 15:22:33 3Q:squall pqact[18206]: pipe_dbufput: -closepnga2area-aSATANNOT-bSATBAND/opt/ldm/data/gempak/images/sat/SOUNDER/14km/CTP/CTP_20010211_1400 write error
 ...

What is going on is that the PATH in effect after ldmfail runs does not
contain the directory where the ldm-mcidas decoder, pnga2area, "lives".
Notice how the first failure above does not include mention of 'pnga2area'.
Presumably, you are running ldmfail like other users: through cron.
The set of environment variables in effect for jobs run through cron
is sparse.  Here is a snippet of the man page for crontab from a Sun
Solaris system:

     The shell is invoked from your $HOME directory with an  arg0
     of  sh.   Users  who  desire to have their .profile executed
     must explicitly do so in the crontab file.  cron supplies  a
     default environment for every shell, defining HOME, LOGNAME,
     SHELL(=/bin/sh), TZ, and PATH.  The default  PATH  for  user
     cron  jobs  is  /usr/bin;  while  root  cron jobs default to
     /usr/sbin:/usr/bin.   The  default  PATH  can  be   set   in
     /etc/default/cron; see cron(1M).
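
One way to confirm this diagnosis is to repeat the command lookup under
the same sparse PATH that cron supplies.  The following is only a sketch:
the helper name resolves_under is made up for illustration, and the
decoder location $HOME/decoders is an assumption that should be replaced
with wherever pnga2area is actually installed.

```shell
#!/bin/sh
# Sketch: check whether a command name resolves under a given PATH.
# Usage: resolves_under PATH_VALUE COMMAND_NAME
# The lookup runs in a subshell, so the caller's PATH is left untouched.
resolves_under() {
    ( PATH="$1"; command -v "$2" >/dev/null 2>&1 )
}

# Default PATH for user cron jobs on Solaris, per the man page excerpt.
CRON_PATH=/usr/bin
# Assumption: the ldm-mcidas decoders are installed in $HOME/decoders.
DECODER_DIR="${DECODER_DIR:-$HOME/decoders}"

for name in pnga2area; do
    if resolves_under "$CRON_PATH" "$name"; then
        echo "$name: found with PATH=$CRON_PATH"
    elif resolves_under "$DECODER_DIR:$CRON_PATH" "$name"; then
        echo "$name: found only after adding $DECODER_DIR to PATH"
    else
        echo "$name: not found in $CRON_PATH or $DECODER_DIR"
    fi
done
```

If the decoder is reported as found only after the extra directory is
added, then an LDM restarted from cron would not be able to start it.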

So, you have a couple of options:

o edit ldmfail to more fully set the PATH to include all directories
  needed to run decoders; this is not a real great idea since a new
  version of ldmfail comes out with each new version of the LDM, and
  your edits would have to be redone every time

o forcibly set PATH in your crontab entry.  Here are skeletal examples for
  doing this, first for C shell users and then for Bourne shell users:

  C shell

* * * * * source .cshrc; ldmfail...

  Bourne shell

* * * * * . .profile; ldmfail...

o finally, you could edit your ~ldm/etc/pqact.conf file and be more
  explicit about where to find decoders.  For instance, if pnga2area
  can be found in the ~ldm/decoders directory, you could change
  pnga2area invocations from:

        PIPE    -close
        pnga2area ...

  to

        PIPE    -close
        decoders/pnga2area ...
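
A variation on the second option is to keep the PATH setup in a small
wrapper script and call that from cron, so that neither ldmfail nor the
login files need to be touched.  This is a minimal sketch, not something
shipped with the LDM: the function name run_with_decoders, the script
name ldmfail_cron mentioned below, and the default decoder directory
$HOME/decoders are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. saved as $HOME/bin/ldmfail_cron.
# Usage: run_with_decoders DIR COMMAND [ARGS...]
# Prepends DIR to PATH, exports it so child processes inherit it,
# and then runs COMMAND.  The subshell keeps the caller's PATH intact.
run_with_decoders() {
    dir=$1
    shift
    (
        PATH="$dir:$PATH"
        export PATH
        "$@"
    )
}

# When the script is given arguments, treat them as the command to run.
# Assumption: decoders live in $HOME/decoders unless DECODER_DIR is set.
if [ "$#" -gt 0 ]; then
    run_with_decoders "${DECODER_DIR:-$HOME/decoders}" "$@"
fi
```

The crontab entry would then invoke the wrapper with the usual ldmfail
command line as its arguments, and the LDM processes that ldmfail
restarts would inherit the extended PATH.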

The choice of tactic is up to you, but I would recommend the last
option as the easiest to live with, since pqact.conf entries do
not usually need to be changed when installing new LDM releases.

Tom Yoksas

>From address@hidden Mon Feb 12 13:31:24 2001
>Subject: RE: 20010212: ldm - low priority 

Tom,
   Thank you. I stole the pqact.conf files from Unidata and made only
a few changes. I see that decoders/Xdecoder is what the other decoders
use, so I made that change. If this doesn't work I will let you know.

Fred

NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.