
19990208: sflist core dump



Maureen,

I have been looking at your data file, as well as all of the data
being decoded for the past three days. With your data file, I
can reproduce the core dump, though I can't produce a file here
from the past 3 days of data that causes a core dump.

The problem with your file is a corrupt record that behaves
as if the write to the file was interrupted.

Would it be possible that the disk partition you are decoding
the data to could have filled up for some period? Log messages
about disk full would likely be in your /var/adm/messages
file on the system that is doing the decoding. Since the same 
thing has happened for 3 days in a row, it could be that recently 
a disk space problem has arisen.
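
As a rough sketch of that check, the Python below scans a system log for
lines that look like disk-full reports and prints how full a partition is.
The two search phrases and the `data_dir` value are assumptions for
illustration; the exact wording of the messages depends on the operating
system, and the directory should be the one the decoder writes to.

```python
import re
import shutil

# Assumed phrases: the exact wording of disk-full messages varies by OS.
FULL_PATTERNS = re.compile(r"file system full|no space left on device",
                           re.IGNORECASE)


def find_disk_full_messages(lines):
    """Return the log lines that look like disk-full reports."""
    return [line.rstrip("\n") for line in lines if FULL_PATTERNS.search(line)]


def percent_used(path):
    """Return the percentage of the partition holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    log_path = "/var/adm/messages"  # log file named in the message above
    data_dir = "."                  # hypothetical: use the decoder output directory
    try:
        with open(log_path, errors="replace") as log:
            for hit in find_disk_full_messages(log):
                print(hit)
    except OSError as err:
        print(f"could not read {log_path}: {err}")
    print(f"{data_dir}: {percent_used(data_dir):.1f}% used")
```

A partition that is nearly full now, or log hits whose timestamps line up
with the hours of the corrupt records, would both point toward interrupted
writes.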

One other thing would be the possibility of a second dchrly decoder
running which would create conflicting writes to the file. This
should not happen if there is only one dchrly entry in your pqact.conf
file, and it is properly using the file naming templates. If
you have any doubts, send me your pqact.conf file and I'll look.
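
A quick way to count the entries yourself is sketched below in Python. It
simply lists the non-comment lines of a pqact.conf that mention the decoder
name; it does not parse the file format. The sample text in the usage note
is made up for illustration and is not a recommended entry.

```python
def find_decoder_entries(conf_text, decoder="dchrly"):
    """Return (line_number, line) pairs for active lines that mention `decoder`."""
    hits = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and commented-out entries
        if decoder in stripped:
            hits.append((number, stripped))
    return hits
```

For example, with a hypothetical three-line file in which the first entry is
commented out:

```python
sample = (
    "# DDS|IDS\t^S[AP]\tPIPE\tdecoders/dchrly old.gem\n"
    "DDS|IDS\t^S[AP]\n"
    "\tPIPE\tdecoders/dchrly YYMMDD_sao.gem\n"
)
for number, line in find_decoder_entries(sample):
    print(number, line)
```

More than one active line in the real file would suggest two decoders could
be writing to the same output file.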

If dchrly has crashed at all under the ldm during this time, it would
be good to know that. I haven't seen any crashes here.

I ran sfdelt with dattim=/0200 and area=dset and was able to remove 
all the 2Z observations but found another corrupt record at 5Z.
After clearing that, I was able to list the rest of the contents of the
file.


Steve Chiswell





>From: Maureen Ballard <address@hidden>
>Organization: UK Ag Weather Center
>Keywords: 199902081828.LAA17365

>Steve,
>
>The file is there as 990208_sao_2.gem
>
>In case you are wondering, I wasn't paying attention at first and started to
>send the file as an ASCII file - that's the file called 990208_sao.gem
>
>Thanks!
>Maureen
>
>Unidata Support wrote:
>
>> >From: Maureen Ballard <address@hidden>
>> >Organization: UK Ag Weather Center
>> >Keywords: 199902081729.KAA15779
>>
>> >Steve,
>> >
>> >Our meteograms stopped running last night and while I was diagnosing the
>> >problem this morning, I kept getting core dumps. What I do is list the
>> >data from yesterday and today using sflist and place the output into a
>> >file. Then I combine the file and create a gempak management file from
>> >that. That way, every time the program runs, a 24 hour meteogram is
>> >created.
>> >
>> >The core dump occurs while I do an sflist on today's data. Since it is
>> >today's data, it would likely occur even if I wasn't running the sflist
>> >program. When I list data from /0200, sflist core dumps. I am guessing
>> >that there is some "garbage" in the data file sometime between 0200 and
>> >0300 (I can list 0300 fine).
>> >
>> >I tried using sfdelt to delete all of 0200 from the file (so at least
>> >the program could get through everything without core dumping) but that
>> >doesn't seem to help. There are still some 0200 times in the file. Any
>> >ideas? Even better would be ideas as to why this happened and how it can
>> >be prevented.
>> >
>> >Thanks
>> >
>> >Maureen
>> >
>> >--
>> >========================================================================
>> >
>> >Maureen Moore Ballard                   address@hidden
>> >Staff Meteorologist                           ph:  606-257-3000ext244
>> >Ag. Weather Center                         fax: 606-257-5671
>> >243 Ag. Engineering Bldg
>> >Dept. of Biosystems and Ag. Engr.
>> >University of Kentucky
>> >Lexington, KY 40546-0276
>> >HOMEPAGE   http://wwwagwx.ca.uky.edu
>> >
>> >You don't stop laughing because you grow old;
>> >you grow old because you stop laughing.
>> >=========================================================================
>> >
>> >
>> >
>>
>> Maureen,
>>
>> I don't have any core dumps with today's surface file. Can you ftp me your
>> file so I can see if I can track the problem down? Stick it in
>> ~gbuddy/incoming.
>>
>> Steve Chiswell
>> ****************************************************************************
>> Unidata User Support                                    UCAR Unidata Program
>> (303)497-8644                                                  P.O. Box 3000
>> address@hidden                                   Boulder, CO 80307
>> ----------------------------------------------------------------------------
>> Unidata WWW Service                        http://www.unidata.ucar.edu/     
>> ****************************************************************************
>
>
>
>
>
>From address@hidden  Tue Feb  9 07:30:00 1999
>Received: from smtp.uky.edu (smtp.uky.edu [128.163.2.17])
>       by unidata.ucar.edu (8.8.8/8.8.8) with ESMTP id HAA14608
>       for <address@hidden>; Tue, 9 Feb 1999 07:29:59 -0700 (MST)
>Keywords: 199902091429.HAA14608
>Received: from pop.uky.edu (pop.uky.edu [128.163.2.16])
>       by smtp.uky.edu (8.8.8/8.8.8) with ESMTP id JAA18149
>       for <address@hidden>; Tue, 9 Feb 1999 09:29:58 -0500 (EST)
>Received: from byron.ca.uky.edu (byron.ca.uky.edu [128.163.192.2])
>       by pop.uky.edu (8.8.8/8.8.8) with ESMTP id JAA16409
>       for <address@hidden>; Tue, 9 Feb 1999 09:29:58 -0500 (EST)
>Received: from ca.uky.edu ([128.163.193.88]) by byron.ca.uky.edu (8.8.0/8.8.0)
>  with ESMTP id JAA08054 for <address@hidden>;
>  Tue, 9 Feb 1999 09:28:11 -0500 (EST)
>Message-ID: <address@hidden>
>Date: Tue, 09 Feb 1999 09:29:56 -0500
>From: Maureen Ballard <address@hidden>
>Organization: UK Ag Weather Center
>X-Mailer: Mozilla 4.05 [en] (WinNT; I)
>MIME-Version: 1.0
>To: Unidata Support <address@hidden>
>Subject: core dumps again with sflist
>Content-Type: text/plain; charset=us-ascii
>Content-Transfer-Encoding: 7bit
>
>Steve,
>
>Wanted to let you know that we are having a similar problem again today.
>Right now we get a core dump sometime in the /0000 Z hour. Let me know
>if you want a copy of the surface file. This is really strange to happen
>2 days in a row.
>
>Thanks for the help
>
>Maureen
>
>
>
>
>From address@hidden  Tue Feb  9 11:15:19 1999
>Received: from smtp.uky.edu (smtp.uky.edu [128.163.2.17])
>       by unidata.ucar.edu (8.8.8/8.8.8) with ESMTP id LAA21792
>       for <address@hidden>; Tue, 9 Feb 1999 11:15:18 -0700 (MST)
>Keywords: 199902091815.LAA21792
>Received: from pop.uky.edu (pop.uky.edu [128.163.2.16])
>       by smtp.uky.edu (8.8.8/8.8.8) with ESMTP id NAA22867
>       for <address@hidden>; Tue, 9 Feb 1999 13:15:12 -0500 (EST)
>Received: from byron.ca.uky.edu (byron.ca.uky.edu [128.163.192.2])
>       by pop.uky.edu (8.8.8/8.8.8) with ESMTP id NAA00154
>       for <address@hidden>; Tue, 9 Feb 1999 13:15:11 -0500 (EST)
>Received: from ca.uky.edu ([128.163.193.88]) by byron.ca.uky.edu (8.8.0/8.8.0)
>  with ESMTP id NAA02228 for <address@hidden>;
>  Tue, 9 Feb 1999 13:13:22 -0500 (EST)
>Message-ID: <address@hidden>
>Date: Tue, 09 Feb 1999 13:15:09 -0500
>From: Maureen Ballard <address@hidden>
>Organization: UK Ag Weather Center
>X-Mailer: Mozilla 4.05 [en] (WinNT; I)
>MIME-Version: 1.0
>To: Unidata Support <address@hidden>
>Subject: problems continue...
>Content-Type: text/plain; charset=us-ascii
>Content-Transfer-Encoding: 7bit
>
>Steve,
>
>In case this helps with the diagnosing of this problem....
>
>Yesterday, sflist was the program that was bombing out - it would get
>hung and when I would do a ps on the process, the computer time would
>keep increasing. I would eventually kill the process since it wasn't
>killing itself. Today it seems that whatever the problem is, it is
>affecting oabsfc. This is preventing many of our maps from finishing. I
>hope this helps somehow. Again, if you would like the surface file for
>today, let me know and I will ftp it over.
>
>Thanks
>
>Maureen
>
>
>
>