[IDD #QOR-886108]: SOS (linux machine)



Hi Luis,

re: extreme slowness on yogui

> This is the first time I am experiencing this sort of problem.
> I did some "research" and the data disk (/mnt/disk2/data) may
> not be the problem. I believe that this first line in the
> following list is a key issue:
> Filesystem           1K-blocks      Used Available Use% Mounted on
> /dev/mapper/VolGroup00-LogVol00
> 226129108 210744616   3712532  99% /
> /dev/sda3               194449     14279    170130   8% /boot
> tmpfs                  1891904         0   1891904   0% /dev/shm
> /dev/sdb1            473086160 163854572 284812388  37% /mnt/disk2
> /dev/scd0               389134    389134         0 100% /media/060311_1049
> [farfan@yogui /]$
> 
> Since last Friday, it has been reaching full capacity (100%) and
> this tends to be after 10AM. Then the machine is slow and not
> able to perform several scripts.

I logged onto yogui and saw that / was 100% full and the load average was
right around 19!  I did some investigating and found that the
/var/spool/clientmkqueue directory was filled with files dating back to 2008.
I deleted most of the files in the directory but left a few for you to take
a look at.
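
For reference, here is a minimal Python sketch of one way to list the oldest
files in a spool directory like this one before deciding what to delete.  The
function name old_files and the 365-day cutoff are illustrative assumptions;
this is not the procedure that was actually run on yogui.

```python
import os
import time


def old_files(directory, min_age_days):
    """Return (path, age_in_days) pairs for regular files in `directory`
    whose modification time is at least `min_age_days` old, oldest first.
    Subdirectories are not searched."""
    now = time.time()
    found = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip directories, symlinks, and other non-regular entries.
            if not entry.is_file(follow_symlinks=False):
                continue
            mtime = entry.stat(follow_symlinks=False).st_mtime
            age_days = (now - mtime) / 86400
            if age_days >= min_age_days:
                found.append((entry.path, age_days))
    # Largest age first, so the oldest files head the list.
    return sorted(found, key=lambda item: item[1], reverse=True)


# Example use (hypothetical cutoff of one year):
#   for path, age in old_files("/var/spool/clientmkqueue", 365)[:20]:
#       print(f"{age:8.0f} days  {path}")
```

The sketch only reports files; it deletes nothing, so the list can be reviewed
first.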

Deleting these files freed up a good bit of disk space, but not enough for
the system to run smoothly.  Disk usage in / is still at 100%, although
things are now running again.

I did some more looking and found that the /home/gempak and /home/farfan
directories are using the majority of the disk space in /.  Since I did not
know which files could safely be deleted in these directories, I left them
alone.  It is most important that you look at the disk space being used in
each of these directories and delete as much as possible as soon as
possible.  I will talk with our system administrator (Mike Schmidt) tomorrow
to learn more about the files being saved in the /var/spool/clientmkqueue
directory.
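
To help with that review, the following Python sketch shows one possible way
to rank the immediate contents of a directory by size.  The names tree_size
and largest_children are hypothetical helpers, and the sizes are apparent
file sizes in bytes, so they may differ somewhat from the block counts that
'du' reports.

```python
import os


def tree_size(path):
    """Sum the apparent sizes (bytes) of all files under `path`.
    Symbolic links are counted as links and are not followed."""
    total = 0
    for root, dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def largest_children(parent, top=10):
    """Return up to `top` (size_bytes, path) pairs for the immediate
    entries of `parent`, largest first."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((tree_size(entry.path), entry.path))
            elif entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                sizes.append((size, entry.path))
    return sorted(sizes, reverse=True)[:top]


# Example use:
#   for size, path in largest_children("/home/gempak"):
#       print(f"{size / 1e9:8.2f} GB  {path}")
```

Running it on /home/gempak and then on /home/farfan should show which
subdirectories are the best candidates for cleanup or for a move to
/mnt/disk2.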

Comment:

- While looking around, I stopped the LDM and killed a 'ps' process that had
  been running for more than 2700 hours.  After deleting a lot of the files
  in the /var/spool/clientmkqueue directory, the load average dropped from 19
  down to less than one.  I took this to mean that the system was no longer
  thrashing while trying to get enough disk space in / to do system related
  things like logging.

  As I write this note, I have turned the LDM back on, and data is once again
  flowing to yogui and being processed.  I take the fact that the load average
  is staying low as a good sign, but I can't be 100% sure because I don't know
  all of the processing that is supposed to be happening on the machine.
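
  The two symptoms described above (a nearly full / and a high load average)
  could be watched with a small Python sketch like the one below.  The
  function names and the thresholds of 95% and 4.0 are assumptions chosen for
  illustration, not values tuned for yogui, and os.getloadavg() is only
  available on Unix-like systems.

```python
import os
import shutil


def percent_used(path):
    """Percent of the filesystem holding `path` that is in use, computed as
    used / (used + available to ordinary users), similar to df's Use%."""
    usage = shutil.disk_usage(path)
    denominator = usage.used + usage.free
    if denominator == 0:
        return 0.0
    return 100.0 * usage.used / denominator


def check(path="/", max_percent=95.0, max_load=4.0):
    """Return (percent_used, one_minute_load, warnings) for `path`.
    Both thresholds are hypothetical defaults."""
    pct = percent_used(path)
    load1 = os.getloadavg()[0]  # 1-minute load average
    warnings = []
    if pct >= max_percent:
        warnings.append(f"{path} is {pct:.1f}% full")
    if load1 >= max_load:
        warnings.append(f"1-minute load average is {load1:.2f}")
    return pct, load1, warnings


# Example use:
#   pct, load1, warnings = check("/")
#   for message in warnings:
#       print("WARNING:", message)
```

  Run periodically (for example from cron), something along these lines would
  flag the problem before / fills up completely.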

Please take a hard look at the disk space being used by gempak in /home/gempak
and by yourself in /home/farfan.  If you can clean up a significant fraction of
the disk space being used (don't forget you can move some of the stuff in
these directories to /mnt/disk2), yogui should continue to run smoothly.

Cheers,

Tom
--
****************************************************************************
Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
----------------------------------------------------------------------------
Unidata HomePage                       http://www.unidata.ucar.edu
****************************************************************************


Ticket Details
===================
Ticket ID: QOR-886108
Department: Support IDD
Priority: Normal
Status: Closed