
19990111: crontab...



>From: Maureen Ballard <address@hidden>
>Organization: UK Ag Weather Center
>Keywords: 199901112024.NAA08083

>Steve,
>
>I know this isn't exactly a gempak/ldm question but I am hoping you can
>answer it. We occasionally get the cannot allocate colors error while
>things are running. I know it has to do with too many things trying to
>access the color map. Just about everything we run is in a crontab. Do
>you know of any limits or suggested limits as to how many things should
>run in a crontab?
>
>We are more apt to get the error if something (say the models) did not
>run completely from the cron. Then we would manually run it from the
>command line. Are you aware of a way in which we can see what other
>programs or scripts are accessing the color map? Right now, we end up
>rebooting the machine to get rid of the error. Any help would be greatly
>appreciated.
>
>The machine we run gempak on is used solely for gempak, along with
>serving as an ftp site for some additional data to be dumped to it.
>
>Thanks!
>
>Maureen
>

Maureen,
If you have a script that fails to exit properly, it can leave the
gplt and gf processes running. Typically, scripts exit with "gpend"
so the gf driver closes down and frees the colormap, but if the script
dies or the program crashes for some reason, then gpend won't get called.
Since the gplt session is unique to the crontab process that creates it,
subsequent crontab runs can't get to the message queue to kill it.

When you hit the color allocation failure, check whether any "xw",
"gf", or "gplt" processes are still in the process table.
You should be able to kill them, and remove their message queues with
the "ipcrm" command (use "ipcs" to see the message queues in use).
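
A minimal sketch of that cleanup, assuming a POSIX shell. The function
names are made up for illustration, and the "ipcs" column layout differs
between systems, so check the output on your machine before trusting the
parser. Note that it selects every message queue owned by the given user,
not only the ones gplt created.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not part of GEMPAK.

# Read "PID COMMAND" lines on stdin and print the PIDs whose command
# name (with any leading path stripped) is gplt, gf, or xw.
stale_gempak_pids() {
    awk '{
        n = split($2, parts, "/")
        name = parts[n]
        if (name == "gplt" || name == "gf" || name == "xw") print $1
    }'
}

# Read "ipcs -q" output on stdin and print the queue ids owned by user $1.
# Two common layouts are handled (an assumption about your system):
#   System V style: q  ID  KEY  MODE  OWNER  GROUP
#   Linux style:    KEY  MSQID  OWNER  PERMS  ...
queue_ids_for_user() {
    awk -v user="$1" '
        $1 == "q"  && $5 == user { print $2 }
        $1 ~ /^0x/ && $3 == user { print $2 }
    '
}

# Kill leftover processes and remove the queues for user $1.
# Defined here but not called; run it by hand once you trust the lists.
cleanup_gempak() {
    ps -u "$1" -o pid= -o comm= | stale_gempak_pids |
        while read -r pid; do kill "$pid"; done
    ipcs -q | queue_ids_for_user "$1" |
        while read -r id; do ipcrm -q "$id"; done
}
```

Running the two listing functions on their own first (without the kill
and ipcrm steps) shows what would be removed.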

One way around the need for an open X server is to use the X11R6
Xvfb (virtual frame buffer), which creates an X session in memory
for your program to draw to, rather than the console, which may
have other programs such as Netscape and the window manager competing
with it for the color map.
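
As a rough illustration of the Xvfb approach, the wrapper below starts a
virtual frame buffer, points DISPLAY at it for one command, and shuts the
server down afterwards. The function name, the display number, the screen
geometry, and the 8-bit depth are all assumptions; pick values that match
what your plotting programs expect.

```shell
#!/bin/sh
# Sketch only: with_xvfb and its arguments are illustrative assumptions.

# Usage: with_xvfb DISPLAY_NUMBER command [args...]
# Starts Xvfb on the given display, runs the command against it,
# then stops the server and returns the command's exit status.
with_xvfb() {
    num=$1
    shift
    # Screen 0 at 1024x768 with 8-bit depth (assumed, adjust as needed).
    Xvfb ":$num" -screen 0 1024x768x8 >/dev/null 2>&1 &
    xvfb_pid=$!
    sleep 2                          # crude wait for the server to start
    DISPLAY=":$num" "$@"
    status=$?
    kill "$xvfb_pid" 2>/dev/null     # shut the virtual server down
    return "$status"
}

# Example (hypothetical script name):
#   with_xvfb 1 ./make_model_plots.csh
```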

If you find your server in the state you describe above, it may be
helpful if you would send me the output from your "ps" process
status for all processes, as well as the output from "ipcs".
If one script in particular seems to be hanging things up
for the other scripts, then you might look at that script for a bug.

When I have written web generation crons, I usually have my launching
script first run "uptime" to check the load on the system, and if
it is too large, I either exit or wait for the system load to come
down. Then, if I have processes that require having the screen to
themselves, I usually use the "touch" command to create a
.lock.processid file, wait in a loop until that lock file is the
first in line, and remove the lock file when exiting. Any waiting
scripts that periodically check whether their lock file is first
will then recheck to see if they can go.
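
The load check and lock-file queue described above could look something
like the following. This is a sketch, not the actual scripts: LOCKDIR,
MAX_LOAD, and the function names are invented here, and "first in line"
is taken to mean the lock file with the oldest modification time.

```shell
#!/bin/sh
# Sketch only: LOCKDIR, MAX_LOAD and the helper names are assumptions.
LOCKDIR=${LOCKDIR:-/tmp}
MAX_LOAD=${MAX_LOAD:-4.0}

# Extract the 1-minute load average from "uptime" output read on stdin.
one_minute_load() {
    sed -n 's/.*load averages*: *\([0-9.]*\).*/\1/p'
}

# Succeed only if the current 1-minute load is below MAX_LOAD.
load_ok() {
    load=$(uptime | one_minute_load)
    awk -v l="$load" -v m="$MAX_LOAD" 'BEGIN { exit !(l + 0 < m + 0) }'
}

# Print the oldest lock file in LOCKDIR, i.e. the one first in line.
first_lock() {
    ls -tr "$LOCKDIR"/.lock.* 2>/dev/null | head -n 1
}

# Create our lock, wait until it is first in line, run the command,
# then remove the lock so the next waiting script can go.
run_when_first() {
    mylock="$LOCKDIR/.lock.$$"
    touch "$mylock"
    trap 'rm -f "$mylock"' EXIT      # also clean up if the shell exits early
    while [ "$(first_lock)" != "$mylock" ]; do
        sleep 5
    done
    "$@"
    status=$?
    rm -f "$mylock"
    trap - EXIT
    return "$status"
}

# Example use at the top of a cron script (hypothetical script name):
#   load_ok || exit 0
#   run_when_first ./make_web_plots.csh
```

This loop waits forever if an earlier script never removes its lock, so
it needs an upper bound on the wait in practice.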

I usually put timeouts in, so that scripts that wait longer than a
certain period will eventually die and I don't end up with too many
processes waiting around if one of the scripts hangs or doesn't remove
its lock. When a cron process aborts like this, I generally have it
email me with a warning message.
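
One way to sketch the timeout and the warning email, assuming "mailx" is
set up on the machine. ADMIN, MAX_WAIT, POLL, and the function names are
placeholders chosen for this example.

```shell
#!/bin/sh
# Sketch only: ADMIN, MAX_WAIT, POLL and the function names are assumptions.
ADMIN=${ADMIN:-root}          # who receives the warning
MAX_WAIT=${MAX_WAIT:-600}     # give up after this many seconds
POLL=${POLL:-10}              # seconds between checks

# Send a one-line warning by mail (assumes a working mailx).
notify() {
    echo "$1" | mailx -s "cron job timed out on $(hostname)" "$ADMIN"
}

# Usage: wait_or_give_up condition-command [args...]
# Re-runs the condition every POLL seconds until it succeeds.
# Returns 0 on success; after MAX_WAIT seconds it sends a warning
# and returns 1 so the caller can exit.
wait_or_give_up() {
    waited=0
    until "$@"; do
        if [ "$waited" -ge "$MAX_WAIT" ]; then
            notify "PID $$ gave up after ${waited}s waiting for: $*"
            return 1
        fi
        sleep "$POLL"
        waited=$((waited + POLL))
    done
    return 0
}

# Example (hypothetical condition: wait for a lock file to disappear):
#   wait_or_give_up test ! -e /tmp/.lock.other || exit 1
```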

Steve Chiswell