On Wed, 11 May 2005, Gabe Langbauer wrote:
> During these periods ssh is completely unable to connect to the
> machine...I do log ps -eaf and free, although I think those have been
> written over since this most recent crash. The free command shows that
> most of the memory is used, however according to some google searches this
> is because at boot the kernel "takes" the memory and allocates it as
> necessary...ps didn't show me anything...I've checked all /var/log
> files and nothing jumps out there either.

What recent changes have been made to the system? Am I correct in assuming
that the system worked fine recently until you made some change? Was it just
starting up the gempak scripts that caused this? Have there been any system
upgrades, or any new packages or hardware added? Is it just the one gempak script that
causes a problem, and if so, if you run only portions of that script, does the
problem go away? I.e., what portion of the script is causing the problem?

It may be unrelated, but I have a system with a memory leak in some I/O
driver that jams the system over time. In such a case, you will see the "used"
figure on the "-/+ buffers/cache" line of "free" increase steadily with time
until the system hangs. At least, that's what mine does.
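
If you want to watch for that trend without losing the history at the next
crash, something like the following could be left running. It is only a rough
sketch, assuming a Linux /proc/meminfo with MemTotal, MemFree, Buffers and
Cached fields; the function names, the log file name (memwatch.log) and the
60-second interval are my own arbitrary choices, not anything from your setup.
It appends rather than overwrites, so the numbers leading up to a hang should
survive.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value (kB)."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def used_minus_cache_kb(fields):
    """Memory in use excluding buffers and page cache, in kB.

    This is the same arithmetic as the "used" figure on the
    "-/+ buffers/cache" line of older versions of free.
    """
    return (fields["MemTotal"] - fields["MemFree"]
            - fields["Buffers"] - fields["Cached"])


def log_once(path="/proc/meminfo", logfile="memwatch.log"):
    """Append one timestamped sample to the log and return the value."""
    with open(path) as f:
        used = used_minus_cache_kb(parse_meminfo(f.read()))
    with open(logfile, "a") as out:
        out.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {used}\n")
    return used


if __name__ == "__main__":
    # Sample once a minute until interrupted (interval is an assumption).
    while True:
        log_once()
        time.sleep(60)
```

If the second column of the log climbs steadily and never comes back down,
that would point toward a leak rather than ordinary caching.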