20021110: dmraob.k hanging on weather2



>From: Gilbert Sebenste <address@hidden>
>Organization: NIU
>Keywords: 200210050242.g952g0127088 McIDAS-XCD DMRAOB DMSYN

Gilbert,


>From address@hidden Wed Nov  6 12:35:23 2002

>FYI...

>On all 3 of my machines, I installed a "rawhide" version of GCC from 
>Redhat. That is, it's the latest and greatest, dated October 28, but not 
>necessarily ready for the world. But I hope it is better than what I have 
>now!

>From address@hidden Sat Nov  9 11:02:56 2002
>Subject: McIDAS/glibc update

>I saw you recompiling McIDAS on weather2.admin.niu.edu. Just an 
>FYI...glibc 3.2.1-6 came out yesterday on RedHat's "rawhide" site. I 
>thought you should know a couple of things.

>1. I may have had a mixture of the latest glibc and older versions of it, 
>especially on weather and weather3. Regardless, on all of them, I stomped 
>on gcc and glibc this morning to include the very latest RedHat beta 
>versions. When I tried to install the new gcc and glibc, it wouldn't 
>go; it asked me for files it depended on, and I noted that some were 
>quite old. So, I told it to stomp on everything and install it brand new. 
>The system performance is somewhat better, although memory still runs out 
>within 6 hours. Is there a way the LDM and McIDAS can be optimized on 
>compilation to not hog all of the memory like that, or is this just a 
>Linux/gcc/glibc bug or poor way to handle memory? I know you've said the 
>latter, so is there a way around this problem?

>2. Weather3 had been crashing almost every day at 7:45 AM sharp. Since I 
>blew away the old gcc and glibc last week on that machine, it has stopped 
>doing that...for now. I installed a gcc/glibc patch for it this morning.

>3. The load average goes down when data is not coming in. This hasn't 
>happened before, and it's obviously much less stressful on the hard 
>drives.

>That's all for now. Hope this helps.

Gilbert

>dmraob.k is hanging on weather2 at 18Z on Sunday.

As soon as I saw this message (Sun Nov 10 20:19:11 GMT 2002), I logged
onto weather2 and stopped DMRAOB.  I set up the environment I mentioned
before so I could run it by hand and create a core dump.  The core
file shows that DMRAOB is hanging in the same C library call that DMSYN
is:

#0  0x4207299c in _int_malloc () from /lib/i686/libc.so.6
(gdb) where
#0  0x4207299c in _int_malloc () from /lib/i686/libc.so.6
#1  0x42071b75 in malloc () from /lib/i686/libc.so.6

For reference, this was also the same problem that proftomd was experiencing.
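
For anyone who wants to capture a similar core file from a decoder run by
hand, here is a minimal sketch.  It is not the environment actually used on
weather2, which is not described in this message.  It assumes a Linux system
where the soft core-file limit may be set to 0 by default, and the helper
names allow_core_dumps and run_with_core_dumps are made up for illustration.

```python
import resource
import subprocess


def allow_core_dumps():
    """Raise the soft core-file size limit to the hard limit.

    Intended to run in the child process just before exec, so only the
    program being debugged is affected.
    """
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))


def run_with_core_dumps(argv, **kwargs):
    """Run argv with core dumps permitted and return the CompletedProcess.

    argv is whatever command line starts the decoder by hand; the real
    DMRAOB invocation is not given in this message, so none is assumed.
    """
    return subprocess.run(argv, preexec_fn=allow_core_dumps, **kwargs)
```

If the process then hangs, sending it SIGABRT (kill -ABRT <pid>) normally
makes the kernel write a core file, subject to the hard limit and the
system's core file settings.  Loading the executable and core file into gdb
and typing "where" gives a backtrace like the one above.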

There is no way that this should be happening, so I am at a loss to explain
what is going wrong under RedHat 8.0 Linux with gcc 3.2.

I want to upgrade your McIDAS version on weather2 to v2002 so I can find
out if there was any code modification rolled into it that would help
prevent this situation.  -XCD v2002 is running on a different user's
Slackware 8.1 Linux system (same kernel rev as yours) and has been
humming along with no problems for about a week now.  The big
differences I see between your systems are:

o the Slackware dist is using gcc/g77 2.95.3 
o his system is not memory bound

So, if you are game, I will upgrade McIDAS on weather2 to v2002 and see
if that helps with the XCD decoding problems.  I am not hopeful about
this, but it is at least something to test.

Tom

>From address@hidden Sun Nov 10 17:19:26 2002

Tom,

Go for it. Let's see what happens.

*******************************************************************************
Gilbert Sebenste                                                     ********
Internet: address@hidden    (My opinions only!)                     ******
Staff Meteorologist, Northern Illinois University                      ****
E-mail: address@hidden                                 ***
web: http://weather.admin.niu.edu                                      **
Work phone: 815-753-5492                                                *
*******************************************************************************