
19990519: Puzzled by pqact.conf




William,

The main problem I noticed is in your pqact.conf entry:
> NMC2  \.status\..*nggrib/ruc2a.(......)/ruc2.T(..)Z
>       FILE    data/ruc2/status/19\1\2.grib
>       EXEC    bin/run_date.pl 199905XX
>       EXEC    touch data/rucdate


The problem above is that each pattern line relates to exactly one action
line, so you need to repeat the pattern for each action:
NMC2    \.status\..*nggrib/ruc2a.(......)/ruc2.T(..)Z
        FILE    data/ruc2/status/19\1\2.grib
NMC2    \.status\..*nggrib/ruc2a.(......)/ruc2.T(..)Z
        EXEC    bin/run_date.pl 199905XX
NMC2    \.status\..*nggrib/ruc2a.(......)/ruc2.T(..)Z
        EXEC    touch data/rucdate


The 45 minute delay between the last .status touch and the
timestamp on the data file is much larger than I see here.
Today I have a 3 minute gap at 12Z. I'll recheck this tonight at 00Z.

There is a way to force the flushing and closing of the file
from the FILE action with "FILE <tab> -close <tab> filename".
In general, we have so much data always coming in that the
LDM is constantly flushing the buffers. At most, the LDM
should flush its writes to the FILE within 10 minutes of
not seeing any data (the data is buffered so as not to clobber
your disk unnecessarily).
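
As a sketch of what that would look like for your status entry (assuming
the same pattern and file name as above, and with the fields separated
by tabs in the real file):

NMC2    \.status\..*nggrib/ruc2a.(......)/ruc2.T(..)Z
        FILE    -close  data/ruc2/status/19\1\2.grib

With -close, the LDM should close the file after each matching product
is written instead of keeping it open. I have not tested this exact
entry here, so treat it as illustrative.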

Let me log the T00Z filing times tonight to see if I can
come up with a better answer for you.

Steve



On Wed, 19 May 1999, William Smith wrote:

> Steve,
> 
> I'm having trouble understanding what pqact.conf is doing to me.
> I'm getting the AVN 00Z data OK, and I'm getting some RUC2 data
> to check things out.  The behavior I'm seeing baffles me.  
> 
> I've tried ldmadmin stop...start and I've tried kill -HUP <pqacd_id>
> to verify that the latest pqact.conf file is active.
> 
> This seems to work for the AVN data.  The 00Z data comes in and is 
> stored as one big file in data/avn/1999051900.grib.  The touch seems
> to work for the .status. record because the time is updated on the
> file data/currentdate.  
> 
> NMC2  avn/avn.(......)/gblav.T00Z
>       FILE    data/avn/19\100.grib
> NMC2  \.status\..*avn/avn.(......)/gblav.T00Z
>       EXEC    touch data/currentdate
> 
> -rw-rw-r--    1 ldm      wx               0 May 19 04:43 data/currentdate
> 
> data/avn:
> -rw-r--r--    1 ldm      wx               0 May 19 02:00 .scour*
> -rw-rw-r--    1 ldm      wx       311917593 May 19 05:30 1999051900.grib
> 
> The touch seems to have occurred 45 minutes before the last write to
> the avn product.  The last two days the time on the avn data has been
> exactly 05:30.  This makes me think something else is writing to the
> file after the data has been received.  No major problem here, but
> puzzling nonetheless.
> 
> I'd like to change the touch to something else.  That's what I've been
> trying with the RUC2 data.
> 
> 
> # Test with RUC data that comes more frequently
> NMC2  \.status\..*nggrib/ruc2a.(......)/ruc2.T(..)Z
>       FILE    data/ruc2/status/19\1\2.grib
>       EXEC    bin/run_date.pl 199905XX
>       EXEC    touch data/rucdate
> NMC2  nggrib/ruc2a.(......)/ruc2.T(..)Z.*/F003/VVEL/
>       FILE    data/ruc2/19\1\2.grib
> 
> 
> This gets really confusing.
> 
> 1.  The F003/VVEL data is put in files in the data/ruc2 directory. 
> When I did this the first time the ruc2 subdirectory didn't exist and
> pqact made the subdirectory and put the grib file in there.  So far, OK.
> 
> 2.  The .status. files are NOT being put in the data/ruc2/status directory. 
> In fact the subdirectory named status has never been created. The touch
> to data/rucdate may be working, but instead of a touch there are completion
> messages being stored in the file.
> 
> -rw-rw-r--    1 ldm      wx           11734 May 19 17:54 data/rucdate
> 
> They look like this:
> 
> /u/ftp/gateway/ncepe/nggrib/ruc2a.990519/ruc2.T17Z.bgrbf02 complete (9984660 bytes) at Wed May 19 13:44:41 1999
> /u/ftp/gateway/ncepe/nggrib/ruc2a.990519/ruc2.T17Z.bgrbf03 complete (10296052 bytes) at Wed May 19 13:45:19 1999
> /u/ftp/gateway/ncepe/nggrib/ruc2a.990519/ruc2.T18Z.bgrbanl complete (9732998 bytes) at Wed May 19 14:39:38 1999
> 
> I don't think scour is cleaning them out since I didn't put that directory
> in the scour.conf.  There are now 105 entries in rucdate.  This was just
> supposed to be touched, not written to.
> 
> 
> 3.  I wrote a perl script named run_date.pl and put it in the bin
> directory.  It takes the first argument passed to it and puts it in
> the data/currenttest file.  I can run that script from the command
> line while in the ldm home directory and it works as described.  It
> has never succeeded from the EXEC entry in pqact.conf.  The entries do
> use <TAB> as whitespace.
> 
> To test it further I added a `touch data/newtest` command to the perl
> script, and it has never been executed.  I just moved that command to be
> the first one executed in the script in case the script is aborting.  I'll
> check it the next time RUC2 gives me a .status. record.
> 
> I originally passed 19\1\2 to the run_date.pl program, and changed that
> to a constant 199905XX, which still didn't get written to currenttest.
> 
> 
> 
> William
> 
> _____________________________________________________________________________
> William Smith                             address@hidden
> Maui High Performance Computing Center    (808) 879-5077  
> 550 Lipoa Parkway                         (808) 879-5018 (fax)
> Kihei, HI  96753                          WWW: http://www.mhpcc.edu
> _____________________________________________________________________________
> 
>