
20020401: ** Env variables for ldmfail and ldm at boot



>From: James T Brown <address@hidden>
>Organization: Michigan State University
>Keywords: 200204011731.g31HVHa18383 LDM ldmfail cron

Jim,

>I have encountered what is hopefully a simple problem.
>Recently, I have added some of the "NAWIPS" decoders
>to my "pqact.conf" file and everything seemingly works
>properly until "ldmfail" is invoked.  There have been
>some apparent connection problems recently which have
>brought the following problem to light and even after
>browsing your support archives, I am still having some
>troubles.
>
>All the decoders and LDM processes are setup on my system
>to run as user "ldm".  No problems normally, but I notice
>if "ldmfail" is invoked by "cron" and new "ldmd" processes 
>are started, the "NAWIPS" decoders fail to start.  Apparently
>there is some trouble with the initial environment settings:
>
>> ld.so.1: /soft/nawips/bin/sol/dcmetr: fatal: libF77.so.4: open failed: No such file or directory
>
>If I run the "ldmfail" command from the command-line as
>user "ldm", there are no troubles,

Here is what is going on.  Cron runs its jobs under the Bourne shell
with only a minimal set of environment variables in effect.  Those
jobs do not pick up the settings in the LDM user's shell startup
file(s) (typically .cshrc, since most sites set up the 'ldm' account
to use the C shell).

What you need to do is make the cron-invoked ldmfail set the environment
variables it needs in order to run.  From your description above and further
down in this email, that means at least PATH and LD_LIBRARY_PATH.
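
A quick way to see the difference is to start a Bourne shell with an
empty environment, which roughly approximates what cron hands to a job.
This is only a sketch: it assumes an env that accepts -i, and real cron
does supply a few variables of its own (HOME, LOGNAME, SHELL and a short
PATH), so treat it as an approximation rather than an exact reproduction.

```shell
#!/bin/sh
# Approximate cron's sparse environment: start /bin/sh with nothing inherited
# and report whether LD_LIBRARY_PATH is visible to the child shell.
cron_like=`env -i /bin/sh -c 'echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"'`
echo "$cron_like"
```

Anything set only in .cshrc will show up as unset here, just as it does
for a decoder started from a cron-invoked ldmfail.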

>but the appropriate
>paths are not being setup when it is invoked using the
>"crontab" entry:
>
>> # 
>> # Failover Test
>> #
>> 0,20,40 * * * * bin/ldmfail -p "sunset.meteor.wisc.edu" -f "squall.atmos.uiuc.edu"
>
>
>I saw reference to using "source .cshrc" within the "crontab"
>entry and that didn't seem to work for me either.  

You would have to have the cron entry run ldmfail from the C shell,
otherwise you would not be able to source .cshrc since the cron
invocations are run from the Bourne shell.

>Most of my "ldm" environment variables are set using a global
>resource file from within its own ".cshrc" file:
>
>> % more ~ldm/.cshrc
>> 
>> # if possible, source system-wide cshrc file
>> #
>> if ( -e /usr/local/lib/csh.cshrc ) then
>>      source /usr/local/lib/csh.cshrc
>> endif
>
>
>Again, if user "ldm" runs "ldmfail" from the command-line,
>everything is fine as all the environment variables and
>library paths can be located.  Can you help me with the
>proper method that is needed to setup my cron job so the
>settings stored in my (C-shell) resource files are read
>properly by "ldmfail"?

One way to do this is to edit ~ldm/bin/ldmfail and have it set up
the PATH and LD_LIBRARY_PATH values that it needs to run.  The
other way is to set the environment variables in the cron entry
itself.  This would be something on the order of:

PATH=<fill in the needed path as a colon-separated list> LD_LIBRARY_PATH=<fill in the needed LD_LIBRARY_PATH as a colon-separated list> ldmfail

Since a new version of ldmfail will get installed each time you upgrade
your LDM, you may try a different approach:  create a shell script
that does the work of setting needed environment variables and then
runs ldmfail; and run this shell script from cron.
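
A wrapper along those lines might look like the sketch below.  The file
name, the NAWIPS and Fortran library directories, and the PATH contents
are assumptions pieced together from the paths quoted in this message;
substitute whatever your csh.cshrc actually sets.

```shell
#!/bin/sh
# Hypothetical wrapper (e.g. ~ldm/bin/ldmfail.cron): set the environment that
# cron does not provide, then hand all arguments to the real ldmfail.

LDMHOME=${LDMHOME:-/usr/local/ldm}      # LDM home, as in the boot script
NAWIPS_BIN=/soft/nawips/bin/sol         # decoder directory from the error message
FORTRAN_LIB=/opt/SUNWspro/lib           # assumed location of libF77.so.4

PATH=$LDMHOME/bin:$LDMHOME/decoders:$LDMHOME/util:$NAWIPS_BIN:/bin:/usr/bin
# Put the Fortran library directory first, keeping any value already set.
LD_LIBRARY_PATH=$FORTRAN_LIB${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
export PATH LD_LIBRARY_PATH

LDMFAIL=$LDMHOME/bin/ldmfail
if [ -x "$LDMFAIL" ]; then
    "$LDMFAIL" "$@"
else
    echo "ldmfail not found or not executable: $LDMFAIL" >&2
fi
```

The crontab entry would then call the wrapper in place of bin/ldmfail,
with the same -p and -f arguments, and the wrapper survives LDM upgrades
because it lives outside the files the upgrade replaces.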

>Also, a related problem occurs when "ldm" is started at
>boot time using the following start-up script:
>
>> #! /bin/sh
>> # $Id$
>> #
>> PATH=/bin:/usr/bin:/usr/etc:/usr/ucb; export PATH
>> LDMHOME=/usr/local/ldm
>> LDMBIN=$LDMHOME/bin
>> MANPATH=$LDMHOME/man
>> DECODEBIN=$LDMHOME/decoders
>> UTILBIN=$LDMHOME/util
>>
>> case "$1" in
>> 
>> 'start')
>>         if [ -x $LDMBIN/ldmadmin ] ; then
>>                 PATH=$PATH:$LDMBIN:$DECODEBIN:$UTILBIN:/usr/local/bin; export PATH
>>                 MANPATH=$MANPATH; export MANPATH
>>                 echo "starting $LDMBIN/rpc.ldmd using ldmadmin start."
>>                 /bin/su - ldm -c "$LDMBIN/ldmadmin delqueue"
>>                 /bin/su - ldm -c "$LDMBIN/ldmadmin mkqueue"
>>                 /bin/su - ldm -c "$LDMBIN/ldmadmin start"
>>         fi
>>         ;;
>
>
>When the above is started at boot time, all is well until
>the "NAWIPS" decoders try to run.  The libraries once again
>can't be located:
>
>> ld.so.1: /soft/nawips/bin/sol/dcmetr: fatal: libF77.so.4: open failed: No such file or directory

I am guessing that the reason for this is that the libF77.so.4 library is
not installed in a "standard" place on your machine.  If this is the
case, the runtime linker (ld.so.1) has no way of knowing where to find
the library, so any executable that needs it will fail to start.  There
are two possible solutions:

o "install" the library in the "standard" location.  For Sun Solaris,
  this would be in /opt/SUNWspro/lib
o define LD_LIBRARY_PATH along with LDMHOME, etc. in the script that
  is being used for LDM startup at boot so that the shared Fortran
  library can be found:

PATH=/bin:/usr/bin:/usr/etc:/usr/ucb; export PATH
LDMHOME=/usr/local/ldm
LDMBIN=$LDMHOME/bin
MANPATH=$LDMHOME/man
DECODEBIN=$LDMHOME/decoders
UTILBIN=$LDMHOME/util
LD_LIBRARY_PATH=<set your LD_LIBRARY_PATH as a colon-separated list>

and in the startup itself:

'start')
        if [ -x $LDMBIN/ldmadmin ] ; then
                PATH=$PATH:$LDMBIN:$DECODEBIN:$UTILBIN:/usr/local/bin; export PATH
                LD_LIBRARY_PATH=$LD_LIBRARY_PATH; export LD_LIBRARY_PATH
                MANPATH=$MANPATH; export MANPATH
                echo "starting $LDMBIN/rpc.ldmd using ldmadmin start."
                /bin/su - ldm -c "$LDMBIN/ldmadmin delqueue"
                /bin/su - ldm -c "$LDMBIN/ldmadmin mkqueue"
                /bin/su - ldm -c "$LDMBIN/ldmadmin start"
        fi
        ;;
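
To work out which directory belongs in LD_LIBRARY_PATH, it can help to
ask which shared libraries a decoder cannot resolve.  The sketch below
assumes ldd is available and that it flags unresolved libraries with the
words "not found"; the function name missing_libs is made up for
illustration.

```shell
#!/bin/sh
# missing_libs FILE: print the shared libraries that FILE needs but that
# the runtime linker cannot locate in the current environment.
# Prints nothing when every library resolves (or when ldd cannot read FILE).
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found' || true
}

# Example, using the decoder path from the error message:
missing_libs /soft/nawips/bin/sol/dcmetr
```

A library that is reported here but resolves in an interactive 'ldm'
session lives in a directory that only the interactive environment
knows about.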

>If the above script is run as "ldm" from the command-line after the
>system has started, no troubles.

That works because, in that case, the necessary definitions of PATH and
LD_LIBRARY_PATH are already in place.

>The problem occurs only at boot
>time.  The paths and variables in my (C-Shell) resource file that
>are read (source) from within "ldm"'s .cshrc file are not being
>set.
>
>I have tried both:
>
>   /bin/su - ldm -c "$LDMBIN/ldmadmin delqueue"  
>
>and 
>
>   /bin/su ldm -c "$LDMBIN/ldmadmin delqueue" 
>
>
>and neither seem to work at boot time.
>
>
>I suppose I can possibly edit "ldmadmin" and/or "ldmfail" to 
>ensure the paths to my "NAWIPS" decoders are located, but I 
>would rather not.  It seems like there would be a better fix
>than that.  
>
>Do you have any suggestions?

See above.

>Do I need to use a "/bin/csh"
>start-up script at boot-time as opposed to "/bin/sh" so the
>"source" command can be used?

No.

>Just seems like there would 
>be a nice easy fix to this problem -  it is a pretty significant
>problem since the decoders no longer are able to start when 
>using "ldmfail" from within the "cron" entry.  I am likely
>overlooking something simple...
>
>My "ldm" user account and the ldm processes (5.1.4) are 
>running on a Solaris 2.7 (SPARC edition).
>
>Thanks,

Please let us know if the above doesn't get you going.

Tom Yoksas