Re: 951211: gribtonc problems at FSL



>From: Bear Giles <address@hidden>
>Organization: NOAA/FSL
>Keywords: 199512111654.AA21481 gribtonc

Bear,

> >I can't tell what the problem is here.  If you just "FILE" the products
> >using the pqact FILE command and then run gribtonc on the result, does that
> >work OK?
> 
> Yes; in fact I have two rules; one feeds directly into a script which
> starts with gribtonc, the other does a FILE to a different directory so
> I can resend the data later for testing.  If I run gribtonc on the files
> in the cache, it seems to work fine.
> 
> Also, I might have had a problem in my diagnostic software.  I'm now only
> seeing the very first item received being marked as missing, instead of
> all data in the first column being marked as missing.
> 
> >pqact is writing the products to a pipe that is gribtonc's stdin, and a
> >"Broken pipe" would seem to indicate that the gribtonc process pqact
> >started up died before it read any data from its stdin.  But if the
> >gribtonc log indicates the first product was decoded and written, that can't
> >be what's causing the "Broken pipe".
> 
> I'm beginning to suspect the problem is with the time required to 
> initialize the data in the new NetCDF file; over 3 MB is initialized as
> the first product is received.
> 
> My current model of the problem is:
> 
>     pqact                           gribtonc
>     -------------                   ---------------
>     receive first grib packet
>     fork gribtonc                   started
>     receive next packet             fork ncgen
>     receive next packet             decode first header
>     receive next packet             write first variable with unlimited dim.
>     receive next packet             initialize netCDF variables...
>     receive next packet             initialize netCDF variables...
>     receive next packet             initialize netCDF variables...
>     pipe full                       initialize netCDF variables...
>                                     initialize netCDF variables...
>                                     write data
>     give up; "broken pipe"
>     fork gribtonc                   started
>     receive next packet             decode second packet
>     receive next packet             decode next packet
>     ....                            ....
> 
> Alternately, instead of the "broken pipe" being due to the pipe becoming
> full, perhaps enough other actions were triggered that pqact closed the
> pipe to gribtonc under some sort of LRU scheme.  Once again, the key would
> be the long time required to initialize the new data.

pqact is supposed to block when it tries to write to a pipe that is "full".
Maybe there is some sort of timeout or alarm on the write that I don't know
about.  If so, you could possibly set the timeout to be large enough to
accommodate the creation of the first record.
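
To illustrate the distinction, here is a minimal sketch in plain POSIX C.
It is not pqact's code, and the function names are made up for this
example.  It shows that a write to a pipe whose reader has gone away fails
with EPIPE ("Broken pipe"), while a full pipe that still has a reader only
refuses more data (EAGAIN on a non-blocking descriptor; a blocking write
would simply wait).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Write one byte to a pipe whose read end is already closed.
 * Returns the errno reported by write(), or 0 if it succeeded. */
int errno_when_reader_gone(void)
{
    int fds[2];
    char byte = 'x';
    int err = 0;

    signal(SIGPIPE, SIG_IGN);   /* report EPIPE instead of killing us */
    if (pipe(fds) != 0)
        return errno;
    close(fds[0]);              /* simulate a reader that has exited */
    if (write(fds[1], &byte, 1) < 0)
        err = errno;
    close(fds[1]);
    return err;
}

/* Fill a pipe that still has a reader, using a non-blocking write end.
 * Returns the errno that stopped the loop and stores the number of
 * bytes the pipe accepted in *accepted. */
int errno_when_pipe_full(long *accepted)
{
    int fds[2];
    char buf[1024];
    int err = 0;
    ssize_t n;

    assert(accepted != NULL);
    *accepted = 0;
    memset(buf, 0, sizeof buf);
    if (pipe(fds) != 0)
        return errno;
    /* Non-blocking only so the sketch terminates; a blocking writer
     * would sleep here until the reader drained the pipe. */
    fcntl(fds[1], F_SETFL, fcntl(fds[1], F_GETFL) | O_NONBLOCK);
    while ((n = write(fds[1], buf, sizeof buf)) > 0)
        *accepted += n;
    if (n < 0)
        err = errno;
    close(fds[0]);
    close(fds[1]);
    return err;
}
```

If the log really says "Broken pipe", the first case is the one to look
for: something closed the read end, rather than the pipe merely filling up.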

The first write is also writing out fill values for all the variables in the
first record.  If you knew everything would eventually get written, you could
turn off the writing of fill values with 

    ncsetfill(ncid, NC_NOFILL)

but this would mean you couldn't easily detect missing values later.
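
The tradeoff can be sketched without the netCDF library at all.  In the
hypothetical plain C example below, ASSUMED_FILL stands in for the library's
default float fill value (take the real constant from netcdf.h), and
prefill() and count_missing() are made-up helpers.  A record that was
prefilled lets you count the slots no product ever wrote.  Without the
prefill those slots would hold arbitrary data and the test would be
meaningless.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a sentinel standing in for netCDF's default float fill
 * value; use the constant from netcdf.h in real code. */
#define ASSUMED_FILL 9.96921e+36f

/* Mark every slot of a record as "not yet written". */
void prefill(float *rec, size_t n)
{
    size_t i;

    assert(rec != NULL);
    for (i = 0; i < n; i++)
        rec[i] = ASSUMED_FILL;
}

/* Count slots that still hold the fill value, i.e. were never written. */
size_t count_missing(const float *rec, size_t n)
{
    size_t i, missing = 0;

    assert(rec != NULL);
    for (i = 0; i < n; i++)
        if (rec[i] == ASSUMED_FILL)
            missing++;
    return missing;
}
```

For example, prefilling a six-slot record and then storing two decoded
values leaves count_missing() reporting four slots still unwritten.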

Another possible way to speed up the first write is to use the latest
version of ncgen from the netCDF beta5 release.  It's significantly
smaller than the previous versions, so it may load faster.  This is a long shot
though; it may not make enough difference to fix your problem.

--Russ