
19990319: ROUTE PP BATCH failures at UVa (cont.)



>From: "Jennie L. Moody" <address@hidden>
>Organization: UVa
>Keywords: 199903091926.MAA12825 McIDAS ROUTE SYSIMAGE.SAV

Jennie,

re: unsetting MCPATH statement not appearing when things are working
>I don't know if that's true.   Now, with the processing working
>(so far so good), the message of "unsetting MCPATH" is gone????

This has got to be a big clue somehow!

re: turn off verbose logging
>Okay. later.

re: changing of permissions on /home/mcidas/workdata
>Right.  When I reread that last message, I noticed you didn't have
>to do this, you thought ldma couldn't write to /home/mcidas/workdata
>but I saw you actually had tried to touch a file in /home/mcidas/data,

Actually, the 'touch' I did try was for /home/mcidas/workdata.  The
mistake was in the email I sent back to you.

>I don't think this was actually ever the problem (nice to know I'm 
>not the only one who makes mistakes)

Mistake is my middle name :-(

re: mcscour.sh deletes ROUTEPP.LOG
>Okay, forgot to notice that.

re: add writing message in batch.k before 'mcenv' invocation
>Okay, I'll try adding this.

This will prove/disprove the theory that GU core dumping is causing mcenv
to die.

re: another user using /home/mcidas/workdata
>I checked for this  and didn't find any problem.  Could that have
>happened if we had something like a power outage and a remote user
>was thrown off (windfall has an uninterruptible power supply, but
>aerial doesn't, and I know one day I was knocked off in such a manner...

Could be, but I am not sure.

>I do not recollect whether I had actually been running mcidas, but
>I am sure I had been logged into windfall....so it's possible.....)
>This was back in February (which, if I recall, was when those lingering
>segments were time-stamped).

re: make sure to check all directories in 'mcidas' MCPATH using dmap.k
>OK.

re: GU dumping causing mcenv to die
>bummer.

The correct response is "bummer, dude!"

re: others' sessions
>I looked at everyone's path...they all seem fine. Unless they are running
>some process that resets their path?  Don't know what that might be...

OK.

re: think of anything else

The setup I have been recommending is a little different than the one
we use here at the UPC, but the difference should really only affect
XCD decoding.  Here is what I did:

o create a /home/mcidas/upcworkdata directory and _copy_ (not link)
  the files from /home/mcidas/workdata to it

o edit xcd_run and change MCDATA to point to /home/mcidas/upcworkdata
  instead of /home/mcidas/data

I did this to obviate the possible impact of my building and reinstalling
McIDAS-X,-XCD frequently (testing, betas from SSEC, multiple platforms,
etc.).

My ROUTE PP BATCH processing, on the other hand, still uses 
/home/mcidas/workdata as MCDATA.  So, the upshot of the difference would
be that ROUTE PP BATCHing would be separated from XCD processing.  While
I don't think that this should have any salutary effects, it may.  Since
it is spring break for you, you might want to give this a shot.  If you
decide to do this, make sure to stop the LDM before making the changes
(making the new directory; copying the files to the new directory; and
editing xcd_run that is being used by the LDM) and start it afterwards.
Of course, you will have to ensure that the read/write permissions on
the new directory and files are such that 'ldma' can read/write.
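
The copy, the xcd_run edit, and the permission check above could be
sketched roughly as follows.  This is only an illustration, not the
procedure we use at the UPC: the function names are made up, it assumes
xcd_run sets the variable with a plain `MCDATA=<dir>` shell assignment
at the start of a line, and it assumes 'ldma' gets its access through
group permissions.  Stopping and restarting the LDM is not shown.

```python
import re
import shutil
import stat
from pathlib import Path


def copy_workdata(src: Path, dst: Path) -> None:
    """Copy (not link) the files under src into a new directory dst."""
    # symlinks=False copies the targets of any links, so dst holds real files.
    shutil.copytree(src, dst, symlinks=False)


def repoint_mcdata(script: Path, new_dir: str) -> int:
    """Point every 'MCDATA=<dir>' assignment in script at new_dir.

    Assumes a plain shell assignment at the start of a line (hypothetical
    layout).  Returns the number of assignments that were rewritten.
    """
    text = script.read_text()
    pattern = re.compile(r"^(\s*MCDATA=)\S+", re.MULTILINE)
    new_text, count = pattern.subn(lambda m: m.group(1) + new_dir, text)
    if count:
        script.write_text(new_text)
    return count


def is_group_rw(path: Path) -> bool:
    """True if the group permission bits on path allow both read and write."""
    mode = path.stat().st_mode
    return bool(mode & stat.S_IRGRP) and bool(mode & stat.S_IWGRP)
```

With the LDM stopped, one would call copy_workdata() on the two
directories, then repoint_mcdata() on the copy of xcd_run that the LDM
actually runs, and finally check is_group_rw() on the new directory and
its files before starting the LDM again.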

>Sorry to have become a pain in the ass.  That's how it feels on this end.

No problem.  This would be more interesting to me if I weren't working
so hard to get a new distribution out the door.

Tom

>From address@hidden  Sat Mar 20 14:41:10 1999

>re: mcidas.log message must be a clue
Struck me that way.  So when post-processing dies, the mcidas
decoders (not the xcd-decoders, mind you) give this unique message
in the mcidas log?

> re: turn off verbose logging
I haven't done this yet, maybe I'll wait to see if it
happens next time (arghhh)

>re: I actually checked /home/mcidas/workdata for write
Are you sure then, because it seemed really strange to me that ldma
couldn't write to that directory (recall it *had* been writing to
that directory previously) and ldma and mcidas are in the same group,
etc....it really made no sense.

> re: another user using /home/mcidas/workdata
actually, what I wrote below didn't have anything to do
with another user using /home/mcidas/workdata, it was
my response to your wondering if we need some kind of
shared memory patch.  I was wondering if that kind of crash
could cause a process to die without releasing shared
memory (I am not sure if this is the concept?).
 
>re: The correct response is "bummer, dude!"
I *hate* the phrase dude, I strongly discourage my sons from 
saying it.