
[LDM #JGZ-326819]: LDM - LDM is killing the system

Hi Angel,

Long time no hear!

> Institution: University of Miami
> Package Version: ldm 6.4.1
> Operating System: SuSE Linux 9.3 (x86-64)
> Hardware Information: 4 processor dual-core
> Inquiry: Problem #1: scour never completes
> Problem #2: when the LDM runs the machine is very unresponsive. I know this
> is kinda vague but that's as much as I know. They are saving a pretty large
> subset of available data and writing to a RAIDed disk on a 3ware card. Any
> hints where to look first?

Our experience with "home built" RAID systems (meaning a RAID built by adding a
card and attaching hard disks) on Linux is NOT positive!  We have tried
virtually every file system available on the RAID (except GFS), and have been
disappointed.  We have been told that RAID performance when using 3Ware cards is
better, but my experience working with Gerry Creager (address@hidden) on his
3Ware-based RAID setup is not stellar.  Sources in NCAR claim that they get very
good RAID performance with external boxes that appear like SCSI devices to the
system.

The biggest performance problem occurs when one puts the LDM queue on the RAID
AND then writes LOTs of files to it.  In a test on a Fedora Core 1 machine with
a Promise TX2000 RAID card, I found that putting a 2 GB LDM queue on the RAID
would result in receipt time latencies that rapidly ramped up to 1 hour.  When
the queue was moved to a "local" ext3 file system, the latencies dropped to
fractions of a second.  Gerry and I also noticed that the scouring on his RAID
was very sluggish, so much so that I investigated writing new scour scripts in
other scripting languages to see if I could minimize the problems.  I was
successful in implementing scouring in Tcl, but not so much so that I can say
that this is a "solution".  By the way, at the time of our collaborative testing
Gerry's machine was running Fedora Core 2 and is now running CentOS Linux.  It
is a dual, hyperthreaded Xeon (32-bit) machine with 4 GB of RAM.  The 2 TB RAID
is built from multiple 300 GB Maxtor IDE drives.
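
For readers unfamiliar with what "scouring" involves, here is a minimal
sketch of age-based scouring in Python.  It is NOT the scour script shipped
with the LDM, nor the Tcl version mentioned above; the function name `scour`,
its parameters, and the `dry_run` option are all made up for illustration.
It assumes that "old" simply means a modification time earlier than a cutoff.

```python
import os
import time


def scour(root, max_age_days, now=None, dry_run=False):
    """Remove regular files under root whose mtime is older than max_age_days.

    Returns the list of paths removed (or that would be removed if dry_run).
    Hypothetical helper for illustration; not part of the LDM distribution.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    removed = []
    stack = [root]  # directories still to visit (iterative, no recursion)
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # entry.stat() avoids a separate os.stat() call per file
                    if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                        if not dry_run:
                            os.remove(entry.path)
                        removed.append(entry.path)
    return removed
```

Running it first with `dry_run=True` shows how many files a pass would touch,
which gives a rough feel for how much directory traffic scouring puts on the
RAID.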

As a starting point, I recommend immediately moving your LDM queue off of the
RAID _if_ it is currently on it, and seeing if there is a noticeable improvement.
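
One way to check whether the queue currently shares a file system with the
data being written is sketched below in Python.  The helper names
`mount_point` and `same_filesystem` are hypothetical, and the paths in the
usage comment are placeholders; substitute the actual queue and data
directory locations for your installation.

```python
import os


def mount_point(path):
    """Return the mount point of the file system that holds path."""
    path = os.path.realpath(path)  # resolve symlinks first
    while not os.path.ismount(path):
        path = os.path.dirname(path)  # walk up until a mount point is hit
    return path


def same_filesystem(path_a, path_b):
    """True if both paths live on the same device (same st_dev)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


# Example usage (placeholder paths, adjust to your setup):
#   print(mount_point("/path/to/ldm.pq"))
#   print(same_filesystem("/path/to/ldm.pq", "/path/to/raid/data"))
```

If `same_filesystem` reports True for the queue and the RAID data directory,
the queue is on the RAID and is a candidate for the move described above.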

By the way, Steve says hi and asks how things are going in Miami!


Unidata User Support                                    UCAR Unidata Program
(303) 497-8642                                                 P.O. Box 3000
address@hidden                                   Boulder, CO 80307
Unidata HomePage                       http://www.unidata.ucar.edu

Ticket Details
Ticket ID: JGZ-326819
Department: Support LDM
Priority: Normal
Status: Closed

NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.