
19990302: Scouring McIDAS XCD data files



>From: "Gen. McIDAS" <address@hidden>
>Organization: UVa
>Keywords: 199903022316.QAA06783 McIDAS LDM scour

Jennie,

>I am having trouble that I attribute to overzealous scouring.  I am 
>trying to sort through  what the difficulty is and am getting confused
>by the two systems available for scouring incoming data
>(the mcscour.sh shell versus the ldmadmin scour utility).

I recommend using mcscour.sh for scouring McIDAS data files since it
gets around a problem which you refer to later in this email.

>We
>had been having problems with running out of disk space, and
>so I knew I needed to change something....there did not appear to
>be any scouring working, and I didn't know why, but I was removing
>files by hand for a few days.

The best way to run mcscour.sh is from a cron entry from the 'mcidas'
account.
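
As a rough illustration, a crontab entry for the 'mcidas' account might look like the following. The script path and the 20:55 run time are the ones mentioned later in this exchange; both are site-specific and should be treated as assumptions.

```
# crontab for user 'mcidas' (edit with: crontab -e)
# min hour day-of-month month day-of-week  command
55 20 * * * /home/mcidas/bin/mcscour.sh
```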

>Finally, this week I went in and 
>changed the scour utility.  The problem is that now
>I don't have *any* MD files present in the xcd directory.

I would guess that this is a result of the LDM's scour utility deleting
SCHEMA from the XCD decoder output directory.  Without SCHEMA, the
decoders won't know the "shape" of the output files that are to
be created, so they will fail.

>The ldm seems to be running fine, and the decoders are running
>so I wonder if I might have clobbered some things with the
>scour.

Yes.

>The reason I think this is that one of the lines was to 
>remove any files older than two days from the data directory 
>(/incoming/data/xcd), but there was no extension, and so it was
>like a generic rm * (sigh, here we go again....with problems of
>our own making).  I guess I hadn't thought about what files might 
>have been in there that I *didn't* want removed. 

Right.  The LDM's concept of scouring is strictly by age of the files.
McIDAS' idea of scouring is by age, but only for data files.
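
To make the difference concrete, here is a minimal sketch of an age-based listing that skips the support files named in this thread. It is not how mcscour.sh or the LDM's scour is implemented; the function name list_scourable and the exclusion list are assumptions for illustration only. It only prints candidates and deletes nothing.

```shell
#!/bin/sh
# Sketch only: list (do not delete) files in DIR that find's "-mtime +DAYS"
# considers more than DAYS days old, skipping the non-data support files
# mentioned in this thread. The exclusion list is an assumption; adjust it
# for your own XCD output directory.
list_scourable() {
    dir=$1
    days=$2
    find "$dir" -type f -mtime +"$days" \
        ! -name 'SCHEMA' \
        ! -name 'IDXALIAS.DAT' \
        ! -name '*.RAT' \
        ! -name '*.RAP' \
        -print
}
```

Running something like `list_scourable /incoming/data/xcd 2` before turning on any purely age-based scour shows what would be removed, and dropping the `! -name` lines shows what an unrestricted scour would have taken as well.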

>The scour is run from ldma's cron, and it had been commented
>out.  I edited the scour.conf file and then uncommented the
>line in the cron, and since then (at least that is when I
>think I screwed things up) the MD files have not been pulled
>in.  I wish I could just figure out what I clobbered, but I guess 
>I need help.

The files that you will need to "reinstall" in the XCD output directory are:

SCHEMA
IDXALIAS.DAT

You may have to recreate the *.RAT and *.RAP files IF they were deleted.

The first file to address is SCHEMA.  This should be as simple as copying
the version of SCHEMA in the distribution (~mcidas/data/SCHEMA) to the
XCD output directory.
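
A minimal sketch of that copy, assuming the distribution copy lives in ~mcidas/data and the XCD output directory is /incoming/data/xcd as on your system. The function name restore_schema is made up for this example; it only copies when the existing SCHEMA is missing or zero length.

```shell
#!/bin/sh
# Sketch only: copy SCHEMA from the distribution into the XCD output
# directory when the copy there is missing or zero length.
restore_schema() {
    src_dir=$1   # e.g. ~mcidas/data
    xcd_dir=$2   # e.g. /incoming/data/xcd
    if [ -s "$xcd_dir/SCHEMA" ]; then
        # -s is true only for a file that exists and is not empty
        echo "SCHEMA present and non-empty in $xcd_dir; leaving it alone"
    else
        cp "$src_dir/SCHEMA" "$xcd_dir/SCHEMA" &&
            echo "SCHEMA restored in $xcd_dir"
    fi
}
```

For example: `restore_schema ~mcidas/data /incoming/data/xcd`.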

The second is IDXALIAS.DAT.  This is created by WMORTE if it doesn't already
exist.  From a session being run as 'mcidas', run:

WMORTE  LIST DDS

Since you should already have a REDIRECtion for IDXALIAS.DAT in the 'mcidas'
session pointing to the XCD output directory, it should be created
with no problems.

>Before I made this change, the only scouring we had been doing was 
>from the mcscour.sh.  Initially this didn't make sense to me,
>I didn't understand how it was looking at both the xcd and mcidasd
>data directories.  Now (too late of course) I realize that it was looking 
>at the xcd directory through the default redirections, and then seems to
>run a redirection on the fly from a BROADCAST.BAT batch file.

Right.

>So it should have been scouring just fine using mcscour.sh.

Right again, but we have seen instances where a cron-initiated mcscour.sh
in the 'ldm' account fails for an unknown reason.  This is why I noted
above that it is best to run mcscour.sh from the 'mcidas' account.

>I still don't know 
>*why* it had  stopped running before (by "stopped running" I mean we 
>definitely had no scouring occurring for several days....).  But, I
>should have pursued that problem, instead I guess I created a
>new one, since I am  pretty sure that I "scoured" some files I seem 
>to need. 

I am pretty sure that we had run into this scouring problem before on
your system and solved it.  I guess the questions to ask are:

o which user's cron is running mcscour.sh
o do you have scour logging enabled in mcscour.sh

In order for logging to take place, you will need to have defined MCLOG
in mcscour.sh (typically set it to point to /home/mcidas/workdata/scour.log)
and have lines that look like:

# Send all textual output to the log file
exec 2>$MCLOG 1>&2
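
For reference, here is a small sketch of what that redirection does, wrapped in a hypothetical helper (enable_scour_log is not part of mcscour.sh). Note that `2>` truncates the log on every run, so only the most recent run is kept; using `2>>` instead would append, which is a choice you may or may not want.

```shell
#!/bin/sh
# Sketch only: send everything the calling script prints to one log file.
# The default path is the typical MCLOG location mentioned above.
enable_scour_log() {
    MCLOG=${1:-/home/mcidas/workdata/scour.log}
    # Point stderr at the log (truncating it), then make stdout follow
    # stderr. 'exec' with only redirections changes the current shell,
    # so every later command in the script inherits them.
    exec 2>"$MCLOG" 1>&2
}
```

A script would call `enable_scour_log` once near the top, after which both normal output and error messages from the scour commands end up in the log.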

To get McIDAS scouring working again, do the following (this is
deliberately verbose so that the information will be in the tracking
system for future reference by others):

o set up a cron entry for 'mcidas' (you may have one for 'ldm'; we have
  run into instances where this doesn't work properly on some people's
  systems, for reasons we have not yet determined)

o make sure that you have logging enabled (see above)

o copy SCHEMA from ~mcidas/data to the XCD output directory

o stop the LDM

o remove _any_ MD files in the XCD output directory (they may be hosed up)

o restart the LDM

o verify that the MD files are once again being created
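
The file-related steps in the list above can be sketched as a short plan. This hypothetical helper (print_recovery_plan is a made-up name) only prints the commands so they can be reviewed before anyone runs them; it assumes the MD files are named MDXX* and that the LDM is controlled with `ldmadmin stop` and `ldmadmin start`.

```shell
#!/bin/sh
# Sketch only: print, but do not run, the recovery commands in order.
print_recovery_plan() {
    xcd_dir=$1    # XCD output directory, e.g. /incoming/data/xcd
    dist_dir=$2   # McIDAS distribution data directory, e.g. ~mcidas/data
    echo "cp $dist_dir/SCHEMA $xcd_dir/SCHEMA"   # restore SCHEMA
    echo "ldmadmin stop"                         # stop the LDM
    echo "rm -f $xcd_dir/MDXX*"                  # remove possibly damaged MD files
    echo "ldmadmin start"                        # restart the LDM
    echo "ls -l $xcd_dir/MDXX*"                  # verify MD files reappear
}
```

For example, `print_recovery_plan /incoming/data/xcd /home/mcidas/data` prints five lines that can be checked and then typed by hand from the appropriate account.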

>Now I'm stuck and feeling dumb and hoping help is available.  

This shouldn't take a real long time to fix up, but if you have problems
let me know.

<later>
Chiz and I started chatting about why the mcscouring is not working out
of ldma's cron.  He questioned whether or not ldma had 'rm' aliased
to 'rm -i'.  I logged onto windfall and noted that you _do_ have this
alias in place.  This may be the cause of mcscouring failure for the
*.XCD and *.IDX files in /incoming/data/xcd.  Since the same alias is
set up for 'mcidas', mcscour.sh might fail for it as well.  This should
not, however, have affected scouring of MD and GRID files.
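
Whether an interactive alias actually reaches a cron-run script depends on the shell and on which startup files it reads, so the following is only a sketch of how to check for the alias and how to remove files in a way that does not depend on it. The helper names rm_is_aliased and safe_rm are made up for this example.

```shell
#!/bin/sh
# Succeeds (exit status 0) if 'rm' is aliased in the current shell.
rm_is_aliased() {
    alias rm >/dev/null 2>&1
}

# Remove files without an alias or a prompt getting in the way.
# Because 'rm' is no longer the first word of the command, the alias is
# not expanded; 'command' also skips any shell function named rm, and
# -f suppresses confirmation prompts.
safe_rm() {
    command rm -f -- "$@"
}
```

Calling rm by its full path (for example /usr/bin/rm) sidesteps an alias in the same way.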

While on windfall, I noted that the copy of SCHEMA in /incoming/data/xcd is
zero length, so it was scoured by LDM's scour.  I took the liberty
of copying the one from the McIDAS distribution to /incoming/data/xcd.
Right after I did this, MD files started being created again:

windfall: /incoming/data/xcd % ls -l MDXX*
-rw-rw-r--   1 ldma     mcidas     45016 Mar  2 20:00 MDXX0012
-rw-rw-r--   1 ldma     mcidas     16640 Mar  2 20:00 MDXX0022

This left the problem of the missing IDXALIAS.DAT, so I ran WMORTE
from the command line:

cd ~mcidas/workdata
wmorte.k LIST DDS

and verified that it was recreated:

windfall: /incoming/data/xcd % dmap.k IDXALIAS
PERM      SIZE LAST CHANGED FILENAME     DIRECTORY
---- --------- ------------ ------------ ---------
-rw-     24804 Mar 02 20:03 IDXALIAS.DAT /incoming/data/xcd
24804 bytes in 1 files

I noted that no surface MD file had been created even though it was already
5 minutes past the hour, so I knew I needed to stop and restart the LDM.  Right
after I did this, I saw that surface data is once again being decoded:

windfall: /incoming/data/xcd % mdu.k LIST 1 100
  MD#  CREATED SCHM PROJ  NR   NC     ID   DESCRIPTION
 ----- ------- ---- ---- ---- ---- ------- -----------
     2   99062 ISFC    0   72 4500   99062 SAO/METAR data for   03 MAR 1999
    12   99062 IRAB    0    8 1300   99062 Mand. Level RAOB for 03 MAR 1999
    22   99062 IRSG    0   16 6000   99062 Sig.  Level RAOB for 03 MAR 1999
 -- END OF LISTING

While I was at it, I checked ~mcidas/bin/mcscour.sh to make sure that
scour logging was set up; it was, so I left the file alone.  We need to
check the scouring log file, /home/ldma/logs/mcscour.log, to see
why scouring fails.
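
A quick first check is whether the log has been written recently at all, since a stale log means the cron job is not running. A minimal sketch, with a made-up helper name (log_is_fresh):

```shell
#!/bin/sh
# Succeeds if LOG exists and was modified within the last DAYS*24 hours
# (DAYS defaults to 1). A missing file counts as not fresh.
log_is_fresh() {
    log=$1
    days=${2:-1}
    [ -n "$(find "$log" -type f -mtime -"$days" 2>/dev/null)" ]
}
```

For example, pass whatever path MCLOG is set to in mcscour.sh: `log_is_fresh "$MCLOG" 1 || echo "mcscour.sh has not logged anything in the last day"`.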

Talk to you later...

Tom

>From address@hidden  Wed Mar  3 09:15:26 1999
>First, many thanks for fixing things up!

re: recommend running mcscour.sh from 'mcidas' cron
>Right, I do know this, but, in my defense, mcscour.sh wasn't working
>at the time I ran the scour utility.

re: where mcscour.sh is running now
>Actually, at the moment, it is running from a cron entry in ldma.

re: SCHEMA was deleted
>Okay, I knew it was something like this....

re: LDM's scour
>Actually, I want to correct something I said above...I did not
>uncomment the line in cron, it was theoretically running, however,
>when the scour.conf file had been set up, the directory for the
>incoming data was prepended with a ~, which made it point to
>a directory that doesn't exist.  But, I edited the scour.conf
>file to make it look at the right directories, and *then* I ran
>it by hand (ldmadmin scour).  I realized after reading your
>message last night that the scour entry was still in cron (a command to
>run ldmadmin scour), but....for reasons I still don't understand,
>this cron entry was not actually doing anything ?

>I edited the scour.conf file to comment out the scouring and I will
>remove the ldmadmin scour  entry from cron, but I don't understand
>why it didn't run anyway?
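
One way to catch a scour.conf entry that points at a nonexistent directory is to check each listed directory before running the scour. The sketch below assumes each non-comment line of scour.conf starts with a directory followed by an age in days and an optional filename pattern; that layout and the helper name check_scour_dirs are assumptions for illustration. Entries beginning with ~ are flagged for manual review rather than expanded.

```shell
#!/bin/sh
# Sketch only: report whether each directory named in a scour.conf-style
# file exists. Returns non-zero if any entry is missing or needs review.
check_scour_dirs() {
    conf=$1
    status=0
    while read -r dir days pattern; do
        case $dir in
            ''|'#'*) continue ;;          # skip blank lines and comments
        esac
        case $dir in
            '~'*)
                echo "REVIEW  $dir"       # tilde entry: not expanded here
                status=1
                ;;
            *)
                if [ -d "$dir" ]; then
                    echo "OK      $dir"
                else
                    echo "MISSING $dir"
                    status=1
                fi
                ;;
        esac
    done < "$conf"
    return $status
}
```

Running it against the scour.conf in use (wherever it lives on your system) before `ldmadmin scour` gives a quick list of entries that will not match anything.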

re: run mcscour.sh from 'mcidas's cron
>So I should edit both cron jobs today and move the mcscour.sh  command
>(/home/mcidas/bin/mcscour.sh, which currently runs at 20:55) to the mcidas 
>account.

re: scouring history
>Hmmm, glad someone remembers what's up with our system :-(.  Look, I do recall
>we have had difficulties before, but the scouring had been working, and then
>it stopped.  I am pretty sure that I looked at the scour log file and it was
>old, this only told me that mcscour.sh wasn't running (which was pretty
>obvious)....and I didn't know why, but I needed to make space because the 
>disk was full....

re: which user's cron is running mcscour.sh
>ldma

re: do you have scour logging enabled in mcscour.sh
>yes

re: define MCLOG in mcscour.sh
>Right, it's set to MCLOG=/home/ldma/logs/mcscour.log, since it's running
>under ldma at the moment. 

re: aliased rm
>Well, I know that for some things in cron we use the /usr/bin/rm rather
>than just rm, and this seems to override our aliasing of the command rm
>(does this make sense?)  But, I don't see where rm commands are being
>invoked anyway?  We do have a few errors in the mcscour log, for having
>set up a couple of commands with bad syntax (using lwu.k DELETE rather
>than lwu.k DEL....but the error was for old VIRT files that we don't 
>even use anymore).

re: need to check scour log
>Right. 
>Thanks Tom! I appreciate your willingness to just fix it since I was home and 
>probably wouldn't have done it from home last night, our University modem links 
>are pretty unreliable and you can get thrown off in the middle of important
>things.