
20030312: waldo at stc (cont.)

Subject: 20030312: waldo at stc (cont.)
Date: Wed, 12 Mar 2003 12:17:05 -0700

>From: "Anderson, Alan C. " <address@hidden>
>Organization: St. Cloud State
>Keywords: 200303111741.h2BHfFB2012143 McIDAS-X v2002 setup

Hi Alan,

>I came in this morning to find waldo with its disk full.  I was able to
>log in as root and deleted a couple of the largest files in
>/var/data/mcidas.  They were DD... .XCD files of over 400MB each.
>That gave a little breathing room.

Each .XCD file contains all of the textual data from your NOAAPORT
feeds (DDPLUS|IDS|PPS) for an entire day.  Scouring should be set up to
keep only one of these.
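
To illustrate what "keep only one of these" amounts to, here is a
minimal Python sketch.  It is not the scouring that McIDAS-XCD or the
LDM actually ships with; the function name, the "*.XCD" pattern and the
dry-run default are assumptions made for the example.

```python
from pathlib import Path


def prune_old_files(directory, pattern="*.XCD", keep=1, dry_run=True):
    """Keep the `keep` newest files matching `pattern`; return the rest.

    Hypothetical helper for illustration only. With dry_run=True nothing
    is deleted, so the returned list can be reviewed first.
    """
    files = sorted(
        (p for p in Path(directory).glob(pattern) if p.is_file()),
        key=lambda p: p.stat().st_mtime,  # newest first after reverse
        reverse=True,
    )
    doomed = files[keep:]
    if not dry_run:
        for path in doomed:
            path.unlink()
    return doomed
```

Calling prune_old_files("/var/data/mcidas") with the defaults would only
list what would be removed; passing dry_run=False performs the deletion.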

>Next, I did what probably should have come first.   Stopped the ldm.
>When this ran, I got the message
>
>product queue flushed to disk
>the ldm is terminating
>the ldm is terminating
>(repeated for several lines)

These are the new messages that ldmadmin emits when shutting down the
LDM.  As the messages say, the memory-mapped product queue is flushed
back to disk, and then ldmadmin waits for all LDM processes to exit
before it returns you to the Unix command prompt.  The LDM-5 ldmadmin
did not wait until all LDM processes had exited, so we had to keep
telling folks to "wait until all LDM processes have exited" before
restarting the LDM.
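
The "wait until everything has exited" behavior is an ordinary polling
loop.  The Python sketch below shows the general pattern only; it is not
taken from ldmadmin, and the wait_until name and its parameters are
assumptions.  The caller supplies the check (for example, "no LDM
processes remain") as the predicate.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Poll `predicate` until it returns True or `timeout` seconds pass.

    Returns True if the condition was met, False on timeout.
    Hypothetical helper; the predicate is whatever "all processes
    have exited" check is appropriate on the system in question.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A restart script would call wait_until(...) after the stop step and
start the LDM again only if it returned True.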

>Did an  ldmadmin clean,  and then the ldm showed as being stopped.

The 'ldmadmin stop' should have done the cleanup that 'ldmadmin clean'
does, so the additional 'ldmadmin clean' was most likely not necessary.

>Disk use now down to about  50 %

That is probably due to your deleting a couple of XCD-created
files, and not to the LDM actually running.

>Noted that my ldm.pq was  just over  1 GB.

I noted that your LDM-5 ldmadmin was set up like this, and your LDM
queue was 1 GB before I installed LDM-6.  Given this, I did not
change anything.

>Ran  ldmadmin  delqueue
>and then  ldmadmin  mkqueue.   When it finished, the new ldm.pq seemed
>about the same size,  just over  1 GB.

That is right.  The script 'ldmadmin' contains the size of the queue
to create when running 'ldmadmin mkqueue'.  I set this to be 1 GB
because this is how your LDM-5 version of 'ldmadmin' was set up.

>Started the ldm and it seems to be running ok,   disk is now at about 
>50 %

Right, there was no change in disk space after deleting and remaking the
queue.

>Some questions
>
>Any idea what my problem is ?   Is the use of the new ldm (v6) and
>latest McIDAS (2002)  creating  considerably more and/or larger data
>files ?

The use of LDM-6 should have nothing to do with the size of data files
being created/kept on your system.  The LDM's job is to move data
to your machine and run applications you tell it to run (through
pqact.conf entries).

There is a possibility that the new McIDAS-XCD version would result in
creation of larger decoded data files, but since you are not
decoding GRID data, this should only be a few tens of MB maximum.

>Disk size is only about  8 GB  but that has served us ok in the past.

I think I need to log on to waldo to snoop around.  If, for instance,
GRID data decoding got turned on, it is easy to understand why you
ran out of disk space.  A full day's worth of decoded GRID data in McIDAS
consumes upwards of 6.5 GB of disk.  I didn't even look at this on
waldo since you were not decoding GRID data before.
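
A quick back-of-the-envelope check shows why GRID decoding alone would
fill the disk.  The Python sketch below uses the figures mentioned in
this exchange (a disk of about 8 GB, usage of about 70% before the
problem, about 6.5 GB per day of decoded GRID data); the function name
is made up for the example and the numbers are approximate.

```python
def projected_usage_pct(disk_gb, used_pct, extra_gb):
    """Percent of the disk in use after adding `extra_gb` of new data."""
    used_gb = disk_gb * used_pct / 100.0
    return 100.0 * (used_gb + extra_gb) / disk_gb


# About 8 GB disk, about 70% full, plus one day of decoded GRID data.
pct = projected_usage_pct(disk_gb=8.0, used_pct=70.0, extra_gb=6.5)
print(f"projected usage: {pct:.0f}% of disk")
```

Anything over 100% means the data cannot fit, which is the case here by
a wide margin.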

>Was it just a fluke of my queue being corrupted or some other transient
>fault ?

From your comments, I haven't seen any indication that your queue was
corrupted.  Did you delete and remake the queue because of something
you saw, or because you were trying things to determine what could
be wrong?

>Do I need to reset my scour.conf files,

You shouldn't have to.

>and also the one that
>'prunes'  the some other files ( I think you set that up a while back
>when I was requesting a larger no. of image files to be retained.).

Again, you shouldn't have to since there was nothing new added to
your ingestion.

>I have not changed my cron file, and up to this morning,  disk was
>running at about  70 %

Right, I noted that when I did the LDM upgrade and remember this from
our previous discussions.

>Regarding the build of McIDAS 2002 on the other terminal (name cyclone)
>have just redone the env  changes and restarted the build  (after make
>clobber).  Will keep that issue separate.

OK, thanks.

>Thanks Tom

I will log on to waldo and see if anything looks amiss.  The only thing
I can think of is that your cron-initiated scouring did not get run
for some reason.  This would explain your having more than one .XCD file
on disk.
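
One way to look for what is filling a disk is to list the largest files
under a directory tree.  The Python sketch below is a generic
illustration, not a McIDAS or LDM tool; the largest_files name and its
arguments are assumptions.

```python
import os


def largest_files(root, top=5):
    """Return up to `top` (size_in_bytes, path) pairs, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:top]
```

Run against a data directory such as /var/data/mcidas, several large
.XCD files near the top of the list would suggest that scouring has not
been running.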

Oops, I just tried to log on (telnet) and was refused:

% telnet waldo.stcloudstate.edu
Trying...
telnet: connect: Connection refused

Can you turn telnet access back on so I can troubleshoot?

Tom
