
20040726: Please Help with gempak decoders



Adam,

I have found that the data stream now contains ocean grids identified
as issued from center 161 (which is likely a mistake). Because there is
no entry in $GEMTBL/grid/cntrgrib1.tbl for 161, the decoder is having
trouble opening the correct $GEMTBL/grid/ncepgrib3.tbl file.

You may want to copy the center 7 entry in cntrgrib1.tbl to
161 for the time being.
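
As a rough sketch of that workaround (not an official GEMPAK tool), the
copy could be scripted as below. It assumes each table entry is one line
whose first whitespace-separated field is the center number and that
comment lines start with "!"; the function name and the sample rows are
hypothetical, so check them against your own cntrgrib1.tbl and back the
file up before changing it.

```python
def copy_center_entry(table_text, src_id=7, dst_id=161):
    """Return table_text with the src_id entry duplicated under dst_id.

    Assumed layout (not verified against a real cntrgrib1.tbl): one entry
    per line, center number in the first field, '!' starting a comment.
    """
    out = []
    src_line = None
    for line in table_text.splitlines():
        out.append(line)
        fields = line.split()
        # Skip blank lines, comment lines, and anything without a numeric ID.
        if not fields or line.lstrip().startswith("!"):
            continue
        if not fields[0].isdigit():
            continue
        center = int(fields[0])
        if center == dst_id:
            # The destination entry already exists; leave the table alone.
            return table_text
        if center == src_id:
            src_line = line
    if src_line is None:
        raise ValueError("no entry found for center %d" % src_id)
    # Rewrite only the leading ID field, keeping its zero-padded width.
    old_id = src_line.split()[0]
    new_id = str(dst_id).zfill(len(old_id))
    out.append(src_line.replace(old_id, new_id, 1))
    return "\n".join(out) + "\n"


# Hypothetical sample rows, for illustration only.
SAMPLE_TABLE = (
    "! ID  NAME        ABBREV\n"
    "007 US NWS NCEP\n"
    "098 ECMWF ECMF\n"
)
```

To use it, read $GEMTBL/grid/cntrgrib1.tbl into a string, pass it to
copy_center_entry(), and write the result back. The new line is appended
at the end of the table; whether the decoder cares about entry order is
not something this sketch checks.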

This problem could be causing the ocean and/or other instances
of your invocation to core dump. I'm still looking into your RUC instance
noted below.

Steve Chiswell
Unidata User Support


>From: "Adam Taylor" <address@hidden>
>Organization: UCAR/Unidata
>Keywords: 200407261451.i6QEpoaW010324

>I have a problem with the gempak decoders.
>
> 
>
>On using the advice given to me by Steve, I have split up my pqact
>processing into multiple instances.  This cured a lot of my core dumping
>problems (I only had 1 pqact doing everything before)
>
> 
>
>Now, onto the problem that no matter what I do it core dumps on me.
>
> 
>
>dcgrib2 won't stop dumping core on three types of data sets.  I first split
>the dcgrib2 decoder sections into 2 parts to try and balance the load.  This
>seemed to do nothing as I was still getting core dumps from dcgrib2.  I then
>decided to test something.  I split the ETA, NGM, AVN, and RUC into
>completely separate pqact calls.  Basically one pqact for one model.
>
> 
>
>When it was only split into two parts I was ALWAYS getting core dumps on
>three data sets.  These were logged as:
>
> 
>
>Dcgrib2_NWSother
>
>Dcgrib2_ocean
>
>Dcgrib2_RUC
>
> 
>
>These are the ONLY three data sets that seem to ever core dump.  For the
>past week now I have been checking every core and these are the only ones
>that have shown up.  Steve told me that this could be caused by two decoders
>trying to write to the same file.  He said if the delay was too high then it
>would spawn another decoder process and the Sig-11 would probably happen.
>This is when I decided to try splitting the main bulk of the models into
>their own processing pqact.
>
> 
>
>Again the same problem occurred.  The NWSother and ocean are in the same
>pqact file but the RUC has its own pqact dedicated to it.  The RUC by
>itself still core dumped on me.  According to the RUC log it has processed
>lots of bulletins so I know it's doing something before it crashes.
>
> 
>
>Now correct me if I am wrong, but when the ETA, AVN, RUC, and NGM models
>were in one processing group, if the delay was high for the pqact then I
>should have gotten some error for at least one other model in the
>last week, correct?  That is what puzzles me.  Even when clumped together
>on one pqact, only the RUC seems to core dump on me.  I have never gotten a
>core dump from ETA, NGM, or AVN though.  The EXACT same holds true for the
>NWSother and ocean ones.
>
> 
>
> 
>
>Now, about the system (This may be the problem but I'm not sure how)
>
> 
>
>Dual 933Mhz P3 w/1Gig Rambus 800Mhz RAM
>
>2 18Gig UltraWIDE SCSI 160 10K RPM drives
>
>1 72Gig UltraWIDE SCSI 160 10K RPM drive (main data drive)
>
>OS:  Fedora Core 2
>
> 
>
>My I/O wait times when looking at top never seem to get much past 3-4% and
>when they do get that high they immediately drop back down to around 0.2% or
>so on the next screen refresh.  As far as I know that should be acceptable.
>
>
> 
>
>I have been trying to solve this problem for the last couple of weeks and I
>have looked at all I know to look at and still can't figure out why these
>three data sets seem to crash out dcgrib2.
>
> 
>
>Is there any way that someone could please look at the system
>(cyclone.geos.ulm.edu) and tell me what in the world is going on?  I have
>just plain run out of things that I know to look at to see what might be the
>problem.
>
> 
>
>Thanks
>
> 
>
>Adam Taylor
>
>Computing Center
>
>University of Louisiana at Monroe
>
> 
>
>P.S.  You guys should hopefully still have the password to that machine.  If
>not I will email that directly to someone.
>
--
NOTE: All email exchanges with Unidata User Support are recorded in the
Unidata inquiry tracking system and then made publicly available
through the web.  If you do not want to have your interactions made
available in this way, you must let us know in each email you send to us.