Re: [ldm-users] bigbird

I recently had an interesting discussion with Mike Schmidt about large numbers of cores and LDM. It sounds like the real key isn't core count, but having enough memory for the queue and not (ever) having to swap it off to disk. I'm wondering if your large queues are getting paged out to disk as virtual memory, and hanging, because of the core count? Just thinking out loud.

I'll keep you posted!

gerry

Pete Pokrandt wrote:
Gerry,

Let me know how it goes for you running Scientific Linux 6.0. I recently tried it on the replacement for idd.aos.wisc.edu, and found that with a large queue, the ldm would stop responding after a certain amount of time. It seemed to work with smaller queue sizes (less than half the size of my physical RAM) but if I went bigger than that, it would always hang. One of the ldmd processes would peg at 100% of one CPU and no data would flow after that.

There's a back and forth between me and Unidata support over at http://www.unidata.ucar.edu/support/help/MailArchives/ldm/maillist.html (the subject is "New IDD relay - ldm is hanging after some time"). It looked like it was hanging somehow in the glibc library while doing I/O.

I backed it out to CentOS 5.6 and for now it seems to be stable.

The new server has dual Opteron 6128 processors (2x8 cores), 32 GB of RAM and dual 300 GB SAS drives. Any queue 16 GB or bigger and I ran into trouble.
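
A minimal sketch of that rule of thumb in Python, for anyone who wants to sanity-check a planned queue size against physical RAM. The function names and the 0.5 threshold are illustrative assumptions drawn from the observation above (queues under half of RAM worked, 16 GB on a 32 GB box did not); none of this comes from LDM itself, and it only reads total RAM, not what is actually free.

```python
import os

GIB = 1024 ** 3  # bytes per GiB


def physical_ram_bytes():
    """Total physical RAM in bytes, from POSIX sysconf values (Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def queue_within_limit(queue_bytes, ram_bytes, max_fraction=0.5):
    """Return True if the queue is strictly smaller than max_fraction of RAM.

    max_fraction=0.5 is an assumed threshold based on one reported case,
    not a documented LDM limit.
    """
    return queue_bytes < ram_bytes * max_fraction
```

On the machine described above, queue_within_limit(16 * GIB, 32 * GIB) comes back False, while an 8 GiB queue passes. To check the local box instead, pass physical_ram_bytes() as the second argument.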

Pete



On 6/25/2011 8:08 AM, Gerry Creager wrote:
bigbird is back and should be working. I had one config error yesterday (firewall) which is corrected. Apologies for the inconvenience.

If you see problems or anomalies with bigbird, please let me know.

Background:
bigbird has been acting a bit unstable of late. It was also running CentOS 4.8 and I was having problems getting security patches on it. When I had problems Thursday evening and again Friday morning, it was time to update the OS.

I made the switch from CentOS to Scientific Linux. SciLinux is supported by Fermilab, with a full-time staff, and is on the same lines as CentOS: a free (as in beer) version of RHEL. Installation from DVD went very well, and a base server install took about half the time of a CentOS install, based on what I recall. Like so much else, it does start SELinux in enforcing mode by default, but this is an easy first-boot fix.

CentOS appears to be suffering some community fragmentation and strife. RHEL 6.1 is out but CentOS hasn't released a v6.0 yet. SciLinux was fast off the mark to get their 6.0 out. I'm a little concerned about the future of CentOS.

gerry


--
Gerry Creager -- gerry.creager@xxxxxxxx
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.458.4020 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843