
20030729: McIDAS: Cyclone Issues (Again!)



>From: Michael Keables <address@hidden>
>Organization: DU
>Keywords: 200307291613.h6TGDeLd024449 McIDAS-XCD LDM

Hi, Mike,

>I'm having problems again with cyclone...no data updates since June 18 (been
>away from the office for a couple of months.) I suspect it is once again a
>disk storage issue, but I've looked at the disk utilization and can't find
>the problem file(s)/directories. Would you mind taking a look at the setup
>again when you get a chance?

Sure.

>Sorry for the repeated calls for assistance,
>but I don't have any Unix support here at DU that I can call on for issues
>such as this.

I jumped onto cyclone this morning and found that the problem was that the
LDM was not running.

I logged in as 'ldm' and did the following:

- check the integrity of the LDM queue

% pqcat -s > /dev/null

  The output from this tells us if the LDM product queue is in an inconsistent
  state.  It wasn't.

- start the LDM with

% ldmadmin start

  This failed because the previous LDM instance had not been shut down
  cleanly.  I cleaned this up with:

% ldmadmin stop
% ldmadmin clean

  and then I started the LDM:

% ldmadmin start

The LDM started up with no errors and is busily ingesting data.  New McIDAS
datafiles are being decoded, so things appear to be working.

Also, I tried to take the opportunity to upgrade cyclone to the latest
LDM release, LDM-6.0.14, and I got most of the way through the
installation.  What I couldn't finish is the final step in the setup,
the step that needs to be done as 'root' (I used to have 'root's
password, but it has evidently been changed).  So, can you do the
following:

<login as 'ldm'>
cd ldm-6.0.14/src
su
  <pass>
/usr/ccs/bin/make install_setuids
exit

cd ~ldm
ldmadmin stop
rm runtime && ln -s ldm-6.0.14 runtime
ldmadmin start

That will get you up to the current release of the LDM.

>P.S. Can you give me a quick tutorial (or point me to one) on how to find
>the offending files/directories that fill up disk space? This seems to be an
>on-going issue with cyclone.

In looking through the log file for the previous LDM invocation,
~ldm/logs/ldmd.log.1, I see that your system _did_ run out of space:

May 15 00:07:06 cyclone.natnet.du.edu pqbinstats[9148]: fflush: No space left on device
May 15 00:07:29 cyclone.natnet.du.edu last message repeated 98 times
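
A quick way to check any LDM log for this symptom is to search for the
error string.  The following is only a sketch: the function name
count_nospace is made up for illustration, and it assumes the logs are
in ~ldm/logs as above.  Since syslog collapses repeats into "last
message repeated N times" lines, the count is a lower bound.

```shell
# count_nospace: hypothetical helper that counts log lines reporting
# a full filesystem.  Takes one or more log files as arguments.
count_nospace() {
    # cat joins the files so grep -c prints one total, not per-file counts
    cat "$@" 2>/dev/null | grep -c 'No space left on device'
}

# Example (as 'ldm'):
#   count_nospace ~ldm/logs/ldmd.log*
```

A non-zero count in a recent log is a good hint that the disk filled up
before the LDM stopped.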

Since your 'ldm' crontab file is correctly set up to scour data, I am a
little at a loss to point a finger at one thing that may be a problem.
Given the volume of model data in NOAAPORT and the HDS stream, it could
be the case that you simply don't have enough disk space to decode all
of the model data.  _IF_ this is the problem, it will only get worse as
the NWS sends more and more model data through NOAAPORT.  Again _if_
this is the problem, then you will need to decide what model output
data you want to decode, and what you can live without.  Before going
there, however, let's let the LDM run for a day and see if you run out
of space again.  If you run out of space in /data, then do the
following as 'ldm':

cd ~ldm/data
du -k .

This will show us where all of the disk space is going.

One other idea... your current LDM queue is 1 GB.  Since you are an IDD
leaf node and given what data you are ingesting, your queue size could
be pared down to something like 500 MB.  This would give you an extra
500 MB of disk into which you can decode data which might be enough
right at the moment.  Here are the steps to take if you want to try
decreasing your LDM queue size:

<done as 'ldm'>

ldmadmin stop
ldmadmin delqueue

<edit ~ldm/bin/ldmadmin and change>

$pq_size = 1000000000;

to:

$pq_size = 500000000;

ldmadmin mkqueue -f
ldmadmin start

You would do all of this _after_ finishing the LDM-6.0.14 installation
outlined above and after changing the runtime link to point at
ldm-6.0.14.
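
If you want to confirm how much space the smaller queue frees up, you
can compare the free space on the data filesystem before and after the
change.  This is a sketch only: free_kb is a made-up name, and it
assumes a 'df' that accepts the POSIX -P option (on some systems that
may be a different 'df' than the default one in your path).

```shell
# free_kb: hypothetical helper that prints the free space, in KB, on
# the filesystem holding the given directory (default: current directory).
free_kb() {
    # -k reports 1 KB blocks; -P requests the POSIX one-line-per-filesystem
    # layout, where column 4 of the second line is the available space
    df -kP "${1:-.}" | awk 'NR == 2 { print $4 }'
}

# Example (as 'ldm'), run once before and once after remaking the queue:
#   free_kb ~ldm/data
```

The difference between the two numbers should be roughly the 500 MB
taken off the queue size.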


One other thing you could do is buy more disk.  Even though your LDM
machine is not a PC (where disk prices are _really_ low), you can
probably add a new disk to your system for not too much money.

>Thanks. again.

No worries.

Tom