
20040727: LDM Decoder Problem (cont.)



>From: Sameka Cook <address@hidden>
>Organization:  Office of Internet Services
>Keywords:  200407231853.i6NIrsaW019058 LDM ldmadmin

Hi Sameka,

>I work for the National Weather Service, I am running LDM on my web 
>farm.  I have two production servers that receive data.  One day last 
>week, I lost the root partition on one of the production servers.  I 
>reinstalled Linux and began fixing problems on the server.  When I began 
>to test LDM, I received many errors.  I have resolved most of the errors 
>but I could not resolve the errors I mentioned in my first email.

Your first email asked about the output from 'ldmadmin check'.  To
troubleshoot the problems you are seeing, we need representative
examples of the error messages in ~ldm/logs/ldmd.log.
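
One way to pull those examples is sketched here.  This is only a
minimal helper, assuming it is run as the 'ldm' user so that $HOME is
~ldm; the function name and the LDM_LOG override variable are made up
for illustration and are not part of the LDM.

```shell
# Print the last N lines (default 50) of the LDM log file.
# LDM_LOG is a hypothetical override; by default ~ldm/logs/ldmd.log is
# assumed to be reachable as $HOME/logs/ldmd.log.
show_recent_log() {
    count="${1:-50}"
    logfile="${LDM_LOG:-$HOME/logs/ldmd.log}"
    if [ ! -r "$logfile" ]; then
        echo "cannot read $logfile" >&2
        return 1
    fi
    tail -n "$count" "$logfile"
}
```

Running 'show_recent_log 100' right after a failed start should capture
the messages worth including in an email.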

>Over 
>the weekend I deleted the queue and created a new queue.  This  fixed 
>the errors I stated in the first email but I continue to experience 
>those same errors if I reboot the server.

Are you rebooting the server without first shutting down the LDM?

>If I manually try to start 
>ldmadmin after a reboot I get the following error: Queue corruption has 
>occurred. ldmd.pid file exists.

It does indeed sound like you are rebooting your machine without
shutting down the LDM, or without a shutdown process being installed
in /etc/init.d.  More on this below.
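
If you do end up with a leftover ldmd.pid after an unclean reboot, it
is worth confirming that it really is stale before removing it by hand.
The following is a rough sketch, assuming that ldmd.pid holds a single
numeric process ID on its first line; the function name is hypothetical.

```shell
# Succeed (return 0) if the PID file exists but names no live process.
# Assumption: the first line of the file is a single numeric process ID.
pid_file_is_stale() {
    pidfile="$1"
    [ -f "$pidfile" ] || return 1      # no file, so nothing is stale
    pid=$(head -n 1 "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) return 0 ;;       # no usable PID, cannot refer to a live process
    esac
    # kill -0 sends no signal; it only tests whether the process can be
    # signalled.  It also fails for processes owned by another user, so
    # run this as the LDM's owner or as root.
    if kill -0 "$pid" 2>/dev/null; then
        return 1                       # process is alive, file is not stale
    fi
    return 0
}
```

For example, assuming the file lives in the LDM home directory as in
your report: 'pid_file_is_stale ~ldm/ldmd.pid && rm ~ldm/ldmd.pid'.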

>Now, by deleting the queue I am assuming the ldmd.pid is deleted

This is not the case if you simply remove the queue using something
like 'rm data/ldm.pq'.   If you use 'ldmadmin delqueue' followed
by 'ldmadmin mkqueue', then yes, ldmd.pid should be removed.
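
Put together, remaking the queue might look like the sketch below.  The
'stop' and 'start' steps are my assumption about how you would wrap the
two commands above, and the LDMADMIN variable is only there so the
sequence can be tried against a stand-in command; run it as the 'ldm'
user.

```shell
# LDMADMIN may be overridden for a dry run; by default the ldmadmin on
# the PATH is used.
LDMADMIN="${LDMADMIN:-ldmadmin}"

remake_queue() {
    # Assumption: a failure here just means the LDM was not running.
    "$LDMADMIN" stop || true
    # Delete and recreate the queue, then restart; stop at the first failure.
    "$LDMADMIN" delqueue &&
    "$LDMADMIN" mkqueue &&
    "$LDMADMIN" start
}
```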

>and I 
>can restart ldmadmin manually.  I tested this by deleting the queue, 
>then starting ldmadmin manually...LDM working fine.  Reboot the server, 
>and again received the error as I figured I would after rebooting.  I 
>browsed to the /user/local/ldm directory and deleted the ldmd.pid file 
>and started ldmadmin without error.  I also noticed after rebooting the 
>server the rpc.ldmd services are not running. 
>
>So, the new question is, is there a startup script  that kicks off 
>rpc.ldmd and deletes the ldmd.pid file?  Again, any assistance would be 
>greatly appreciated. 

It sounds like you have not installed an LDM startup/shutdown script
in /etc/init.d.  The instructions can be reached from the LDM web pages:

Unidata HomePage
http://my.unidata.ucar.edu
  LDM
  http://my.unidata.ucar.edu/content/software/ldm
    LDM-6.0.14
    http://my.unidata.ucar.edu/content/software/ldm/ldm-6.0.14/index.html
      LDM Basics
      http://my.unidata.ucar.edu/content/software/ldm/ldm-6.0.14/basics/index.html

In the Table of Contents on that last page (LDM Basics), you will see the link:

Configuring an LDM Installation
http://my.unidata.ucar.edu/content/software/ldm/ldm-6.0.14/basics/configuring.html

Step 7 is:

7. Ensure that the LDM is started at boot-time 
http://my.unidata.ucar.edu/content/software/ldm/ldm-6.0.14/basics/configuring.html#boot

You should create, as 'root', the file /etc/init.d/ldmd with the
contents listed in Step 7.  You need to make sure that this script is
executable:

chmod +x /etc/init.d/ldmd

Since you are running under Linux, I encourage you to add a bit to the
beginning of /etc/init.d/ldmd:

#! /bin/sh

# chkconfig: 5 95 05
# description: Start the Unidata Local Data Manager (LDM)

After adding this to /etc/init.d/ldmd, you can use the Linux 'chkconfig'
utility to 'install' ldmd in the various run levels:

chkconfig --add ldmd

Once the script is installed, you can exercise it to ensure that there
are no errors/typos:

/etc/init.d/ldmd stop
/etc/init.d/ldmd start

After all is working, any clean shutdown of your system will result
in the LDM being stopped 'gracefully'.  Upon reboot, the existence
and integrity of the LDM queue are checked.  If the queue does not exist,
it is created.  If it exists, it is checked for problems.  If problems
are found, it is deleted and remade.  Finally, an 'ldmadmin clean'
is run followed by an 'ldmadmin start'.
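
The boot-time sequence just described can be sketched roughly as
follows.  This is not the script from Step 7, only an illustration of
the order of operations: the function names, the LDMADMIN and QUEUE
variables, and in particular the queue_ok check are placeholders of
mine.  The real script uses the LDM's own tools to test the queue and
runs the commands as the 'ldm' user.

```shell
LDMADMIN="${LDMADMIN:-ldmadmin}"
# Assumption: the queue is data/ldm.pq under the LDM home directory.
QUEUE="${QUEUE:-$HOME/data/ldm.pq}"

# Hypothetical stand-in for a real integrity check: it only tests that
# the queue file is non-empty.
queue_ok() {
    [ -s "$1" ]
}

boot_start() {
    if [ ! -f "$QUEUE" ]; then
        # No queue at all: create one.
        "$LDMADMIN" mkqueue || return 1
    elif ! queue_ok "$QUEUE"; then
        # Queue exists but fails the check: delete it and remake it.
        "$LDMADMIN" delqueue && "$LDMADMIN" mkqueue || return 1
    fi
    "$LDMADMIN" clean
    "$LDMADMIN" start
}
```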

>Thank you,

No worries.

>Sameka Cook
>Office of Internet Services
>301-713-1384 x 109
>address@hidden

Cheers,

Tom
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.