pqact.conf


Introduction and General Syntax

A pqact configuration-file (typically pqact.conf) tells the pqact process that reads it how to dispose of certain classes of data-products. This file normally resides in the etc/ subdirectory of the LDM installation.

The general syntax of an entry in the configuration-file is:

feedtype TAB prodIdPat TAB action TAB [arg ...]
where:
feedtype
A feedtype (e.g., WMO, IDS|DDPLUS, 3).
prodIdPat
An ERE for matching data-product identifiers or the string ^_ELSE_$. If prodIdPat is ^_ELSE_$, then the specified action is performed if nothing has been done with the data-product yet and the first character of the data-product identifier is not an underscore (_).
action
The action to take with data-products that match feedtype and prodIdPat. Possible actions are
NOOP
Don't do anything with the data-product. This might be useful to prevent data-products from being acted-upon by a subsequent entry whose product-ID pattern is ^_ELSE_$.
FILE
Write the data-product to a file using the write() function.
STDIOFILE
Write the data-product to a file using the (buffered) fwrite() function.
DBFILE
Write the data-product to a database.
EXEC
Execute a program.
PIPE
Write the data-product to a program's standard input.
[arg ...]
Optional arguments for action. See String Substitution in Action-Arguments below.
TAB
Is either a tab character or a newline character followed by a tab character.
Comments have a hash character (#) in column one.
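
The entry syntax above can be illustrated with a small parser. This is only a sketch that follows the description literally (one tab between fields, continuation lines that begin with a tab, comments with # in column one); the names Entry and parse_entries are hypothetical and are not part of the LDM package.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    feedtype: str
    prod_id_pat: str
    action: str
    args: str  # raw argument text; empty if the action has no arguments


def parse_entries(text: str) -> List[Entry]:
    """Split configuration text into entries (illustrative only)."""
    logical = []
    for line in text.splitlines():
        # Skip blank lines and comments (hash character in column one).
        if not line.strip() or line.startswith("#"):
            continue
        if line.startswith("\t") and logical:
            # A newline followed by a tab counts as a TAB separator,
            # so the line continues the previous entry.
            logical[-1] += line
        else:
            logical.append(line)

    entries = []
    for item in logical:
        fields = item.split("\t", 3)
        if len(fields) < 3:
            raise ValueError(f"malformed entry: {item!r}")
        feedtype, pattern, action = fields[:3]
        args = fields[3] if len(fields) > 3 else ""
        entries.append(Entry(feedtype, pattern, action, args))
    return entries
```

The sketch does not validate feedtypes, patterns, or action names; it only shows how the TAB rule groups physical lines into entries.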

String Substitution in Action-Arguments

In constructing arguments for an action, certain character sequences have special meaning. These sequences serve as templates for replacement strings derived from either the combination of the prodIdPat and the data-product identifier, or from the data-product creation-time.

Replacement-Strings Derived from the prodIdPat and the Data-Product Identifier

Simple Subexpression Replacement

The character sequence \x in the argument field is replaced by the substring of the data-product identifier that matches the corresponding subexpression of prodIdPat. x is either a single digit from 1 through 9 (e.g., \3) or two digits surrounded by parentheses (e.g., \(12)). Thus, for example, the entry

DDS	^SAUS.. .... (..)(..)
	FILE	saus_\1\2.wmo
would append all products with a DDS feedtype that have data-product identifiers (in this case WMO headers) beginning with the characters "SAUS" to hourly files named "saus_ddhh.wmo", where dd and hh are the two-digit day and hour from the data-product identifier, respectively.
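
The replacement just described can be sketched in a few lines. This is an illustration, not the pqact implementation: it uses Python's re module, which is not a POSIX ERE engine but behaves the same for simple patterns such as this one, and the function name substitute and the sample identifier are hypothetical.

```python
import re

# Matches \1 .. \9 or \(NN), where NN is two digits.
_REF = re.compile(r"\\(?:([1-9])|\((\d\d)\))")


def substitute(prod_id_pat, prod_id, template):
    """Expand \\x references in template; return None if prod_id does not match."""
    match = re.search(prod_id_pat, prod_id)
    if match is None:
        return None

    def repl(ref):
        index = int(ref.group(1) or ref.group(2))
        # A subexpression that did not participate in the match yields "".
        return match.group(index) or ""

    return _REF.sub(repl, template)
```

For a (made-up) identifier "SAUS44 KWBC 041200", substitute(r"^SAUS.. .... (..)(..)", "SAUS44 KWBC 041200", r"saus_\1\2.wmo") yields "saus_0412.wmo". The sketch handles only simple references and leaves the temporal forms untouched.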

ASIDE: Information on the format of WMO headers can be found at http://www.nws.noaa.gov/tg/table.html

Temporal Subexpression Replacement

If a parenthetical subexpression of prodIdPat delimits the day-of-the-month field in the data-product identifier, then that subexpression can be used to obtain certain temporal strings in the argument field. If \x is the matching subexpression, then the following character sequences in the argument field are replaced with the indicated strings (the parentheses are mandatory):

(\x:yyyy)
is replaced with the 4-digit year.
(\x:yy)
is replaced with the 2-digit year of the century.
(\x:mmm)
is replaced with the 3-character abbreviation for the month.
(\x:mm)
is replaced with the 2-digit index of the month (Jan = 1).
(\x:ddd)
is replaced with the 3-digit day of the year.
(\x:dd)
is replaced with the 2-digit day of the month. This will differ from the original \x string if, for example, the day-of-the-month field is 31 but the data-product arrives on September 30th.
where x is as described under "Simple Subexpression Replacement", above (i.e., either a single digit or two digits surrounded by parentheses).

Because the data-product identifier carries only the day of the month, the current clock-time is used to infer the month and year.

Thus, for example, the following entry

WMO	^...... .... ([0-3][0-9])([0-2][0-9]).*/pAGO
	FILE	data/gempak/nwx/obs/ago/(\1:yyyy)(\1:mm)\1\2.ago
would append matching data-products to files whose pathnames were based on the year, month, day, and hour of the data-product as indicated or implied by the data-product identifier.
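
A rough sketch of temporal replacement follows. The source does not spell out how the clock-time resolves the day of the month, so this sketch ASSUMES that the date nearest to "today" is chosen from the previous, current, and next months. The names resolve_day and expand_temporal and the English month abbreviations are likewise assumptions made for illustration.

```python
import re
from datetime import date

# Assumption: English abbreviations; the source does not give the exact
# spelling or case of the 3-character month string.
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Matches (\x:code), where x is a digit 1-9 or (NN).
_TEMPORAL = re.compile(r"\(\\(?:([1-9])|\((\d\d)\)):(yyyy|yy|mmm|mm|ddd|dd)\)")


def resolve_day(day, today):
    """Assumed rule: pick the date with this day-of-month closest to today."""
    candidates = []
    for offset in (-1, 0, 1):
        year, month0 = divmod(today.year * 12 + (today.month - 1) + offset, 12)
        try:
            candidates.append(date(year, month0 + 1, day))
        except ValueError:
            continue  # that month has no such day (e.g., day 31 in September)
    if not candidates:
        raise ValueError(f"invalid day of month: {day}")
    return min(candidates, key=lambda d: abs(d - today))


def expand_temporal(template, groups, today):
    """Expand (\\x:code) sequences; groups holds the matched subexpressions."""
    def repl(ref):
        index = int(ref.group(1) or ref.group(2))
        d = resolve_day(int(groups[index - 1]), today)
        code = ref.group(3)
        if code == "yyyy":
            return f"{d.year:04d}"
        if code == "yy":
            return f"{d.year % 100:02d}"
        if code == "mmm":
            return _MONTHS[d.month - 1]
        if code == "mm":
            return f"{d.month:02d}"
        if code == "ddd":
            return f"{d.timetuple().tm_yday:03d}"
        return f"{d.day:02d}"  # code == "dd"

    return _TEMPORAL.sub(repl, template)
```

Under the stated assumption, a day-of-month field of 30 seen on 1 October resolves to 30 September, so the year and month in the pathname come from the previous month rather than from the clock.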

Replacement-Strings Derived from the Data-Product Creation-Time

The following character sequences in the argument field are replaced with the indicated strings based on the data-product creation-time:

%a
is replaced by the locale's abbreviated weekday name.
%A
is replaced by the locale's full weekday name.
%b
is replaced by the locale's abbreviated month name.
%B
is replaced by the locale's full month name.
%c
is replaced by the locale's appropriate date and time representation.
%C
is replaced by the century number (the year divided by 100 and truncated to an integer) as a decimal number [00,99].
%d
is replaced by the day of the month as a decimal number [01,31].
%D
same as %m/%d/%y.
%e
is replaced by the day of the month as a decimal number [1,31]; a single digit is preceded by a space.
%h
same as %b.
%H
is replaced by the hour (24-hour clock) as a decimal number [00,23].
%I
is replaced by the hour (12-hour clock) as a decimal number [01,12].
%j
is replaced by the day of the year as a decimal number [001,366].
%m
is replaced by the month as a decimal number [01,12].
%M
is replaced by the minute as a decimal number [00,59].
%n
is replaced by a newline character.
%p
is replaced by the locale's equivalent of either a.m. or p.m.
%r
is replaced by the time in a.m. and p.m. notation; in the POSIX locale this is equivalent to %I:%M:%S %p.
%R
is replaced by the time in 24 hour notation (%H:%M).
%S
is replaced by the second as a decimal number [00,61].
%t
is replaced by a tab character.
%T
is replaced by the time (%H:%M:%S).
%u
is replaced by the weekday as a decimal number [1,7], with 1 representing Monday.
%U
is replaced by the week number of the year (Sunday as the first day of the week) as a decimal number [00,53].
%V
is replaced by the week number of the year (Monday as the first day of the week) as a decimal number [01,53]. If the week containing 1 January has four or more days in the new year, then it is considered week 1. Otherwise, it is the last week of the previous year, and the next week is week 1.
%w
is replaced by the weekday as a decimal number [0,6], with 0 representing Sunday.
%W
is replaced by the week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0.
%x
is replaced by the locale's appropriate date representation.
%X
is replaced by the locale's appropriate time representation.
%y
is replaced by the year without century as a decimal number [00,99].
%Y
is replaced by the year with century as a decimal number.
%Z
is replaced by the timezone name or abbreviation, or by no bytes if no timezone information exists.
%%
is replaced by %.
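
These conversion specifications are the familiar strftime() ones, so their effect on a pathname can be previewed with Python's time.strftime, which delegates to the platform C library. The sketch below ASSUMES that the creation-time is interpreted in UTC, which the text above does not state; the function name expand_creation_time is hypothetical.

```python
import time


def expand_creation_time(template, creation_time):
    """Expand %-sequences in template using a creation-time in epoch seconds.

    Assumption: the time is broken down in UTC (gmtime), not local time.
    """
    return time.strftime(template, time.gmtime(creation_time))
```

For example, expand_creation_time("data/%Y%m%d/%H%M.dat", 0) gives "data/19700101/0000.dat".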

NOOP Action

The NOOP action tells the pqact process to do nothing to the data-product. This might be useful to prevent data-products from being acted-upon by a subsequent entry whose product-ID pattern is ^_ELSE_$.


FILE Action

The FILE action tells the pqact process to write the data-product to a file using the (unbuffered) write() function.

The syntax of a FILE action is

FILE TAB [-overwrite] [-flush|-close] [-strip] [-log] [-metadata] pathname
where:
-overwrite
Causes the file to be completely rewritten every time it is opened; consequently, you should probably always use the -close option in conjunction with this option.
-flush
Causes the fsync() function to be called after a data-product is written.
-close
Causes the file to be closed after a data-product is written. The default is to keep the file open.
-strip
Causes control characters other than newline (see iscntrl()) to be removed from the data-product before it is written to the file.
-log
Causes the pqact process to log the fact that it filed the data-product.
-metadata
Causes the metadata of the data-product to be written to the file before any data. The metadata is written in the following order using the indicated binary data-types of the C language:
  • Metadata-length in bytes (uint32_t)
  • Data-product signature (MD5 checksum) (uchar[16])
  • Data-product size in bytes (uint32_t)
  • Product creation-time in seconds since the epoch:
    • Integer portion (uint64_t)
    • Microseconds portion (int32_t)
  • Data-product feedtype (uint32_t)
  • Data-product sequence number (uint32_t)
  • Product-identifier:
    • Length in bytes (excluding NUL) (uint32_t)
    • Non-NUL-terminated string (char[])
  • Product-origin:
    • Length in bytes (excluding NUL) (uint32_t)
    • Non-NUL-terminated string (char[])
The endianness of the multi-byte primitive types is that of the local host.
pathname
Is the pathname of the file to which the data-product will be written.
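
As an illustration of the -metadata layout listed above, the following sketch decodes the header from a byte buffer. It assumes that the fields are packed back-to-back with no padding and that the reader runs on a host with the same endianness as the writer; the names Metadata and parse_metadata are hypothetical. Whether the metadata-length counts its own four bytes is not stated above, so the sketch returns the offset it actually consumed instead of relying on that field.

```python
import struct
from typing import NamedTuple, Tuple


class Metadata(NamedTuple):
    metadata_length: int
    signature: bytes      # 16-byte MD5 checksum
    size: int             # data-product size in bytes
    seconds: int          # creation-time, integer portion
    microseconds: int     # creation-time, microseconds portion
    feedtype: int
    sequence_number: int
    identifier: str
    origin: str


def _read_string(buf: bytes, offset: int) -> Tuple[str, int]:
    # Length-prefixed, non-NUL-terminated string.
    (length,) = struct.unpack_from("=I", buf, offset)
    offset += 4
    text = buf[offset:offset + length].decode("ascii", errors="replace")
    return text, offset + length


def parse_metadata(buf: bytes) -> Tuple[Metadata, int]:
    """Return the decoded metadata and the offset of the first byte after it."""
    # "=" selects native byte order with standard sizes and no padding.
    (metadata_length,) = struct.unpack_from("=I", buf, 0)
    offset = 4
    signature = buf[offset:offset + 16]
    offset += 16
    size, seconds, microseconds, feedtype, seqno = struct.unpack_from(
        "=IQiII", buf, offset)
    offset += struct.calcsize("=IQiII")
    identifier, offset = _read_string(buf, offset)
    origin, offset = _read_string(buf, offset)
    meta = Metadata(metadata_length, signature, size, seconds, microseconds,
                    feedtype, seqno, identifier, origin)
    return meta, offset
```

A reader would call parse_metadata on the start of the file and then read size bytes of data from the returned offset.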

STDIOFILE Action

The STDIOFILE action tells the pqact process to write the data-product to a file using the (buffered) fwrite() function. In general, this is more efficient than the FILE action but risks losing data if the computer crashes.

The syntax of an STDIOFILE action is

STDIOFILE TAB [-overwrite] [-flush|-close] [-strip] [-log] pathname
where the options and argument are the same as for the FILE action, except that the -flush option calls the fflush() function.

DBFILE Action

The DBFILE action tells the pqact process to store the data-product in a gdbm database.

The syntax of a DBFILE action is

DBFILE TAB pathname [key]
where:
pathname
Is the pathname of the gdbm database into which the data-product will be put.
[key]
Is the optional key under which to put the data-product.

EXEC Action

The EXEC action tells the pqact process to execute a program as a child process.

The syntax of an EXEC action is

EXEC TAB [-wait] pathname [arg ...]
where:
-wait
Causes the pqact process to suspend itself until the child-process has terminated before continuing. This should only be done if it is known that the child-process will terminate quickly.
pathname
Is the pathname of the program to be executed.
[arg ...]
Are optional arguments to pathname.

PIPE Action

The PIPE action tells the pqact process to execute a program as a child process and to write the data-product to the standard input of the child process.

The syntax of a PIPE action is

PIPE TAB [-strip] [-flush|-close] [-metadata] pathname [arg ...]
where:
-strip
Causes control characters other than newline (see iscntrl()) to be removed from the data-product before it is written to the pipe.
-flush
Causes the pqact process to flush its internal buffer to the pipe at the end of each data-product.
-close
Causes the pqact process to close the pipe to the child process after writing the data-product. The default is to keep the pipe open.
-metadata
Causes the metadata of the data-product to be written to the pipe before any data. The metadata is written in the following order using the indicated binary data-types of the C language:
  • Metadata-length in bytes (uint32_t)
  • Data-product signature (MD5 checksum) (uchar[16])
  • Data-product size in bytes (uint32_t)
  • Product creation-time in seconds since the epoch:
    • Integer portion (uint64_t)
    • Microseconds portion (int32_t)
  • Data-product feedtype (uint32_t)
  • Data-product sequence number (uint32_t)
  • Product-identifier:
    • Length in bytes (excluding NUL) (uint32_t)
    • Non-NUL-terminated string (char[])
  • Product-origin:
    • Length in bytes (excluding NUL) (uint32_t)
    • Non-NUL-terminated string (char[])
The endianness of the multi-byte primitive types is that of the local host.
pathname
Is the pathname of the program to be executed.
[arg ...]
Are optional arguments to pathname.

The program pathname should be written so that it times out and terminates after some interval (e.g., ten minutes).
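
One way a decoder might implement such a time-out is to wait on its standard input with select() and stop when no data arrives within the interval. The following is a minimal sketch for POSIX systems, not a prescribed design; the function name read_until_idle and its parameters are hypothetical.

```python
import os
import select


def read_until_idle(fd, idle_timeout, chunk_size=65536):
    """Yield chunks read from fd.

    Stops at end-of-file (the pipe was closed) or when no data has
    arrived for idle_timeout seconds.
    """
    while True:
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            return  # timed out waiting for more data
        chunk = os.read(fd, chunk_size)
        if not chunk:
            return  # end-of-file
        yield chunk
```

A decoder could loop over read_until_idle(sys.stdin.fileno(), 600) to process its input and exit after ten idle minutes.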


Checking Modifications

Modifications to a configuration-file should be checked for correct syntax before being made operational. The syntax of all pqact configuration-files that are associated with active EXEC pqact entries in the LDM configuration-file can be checked via the command

ldmadmin pqactcheck
Otherwise, the syntax of the single pqact configuration-file, pathname, can be checked via the command
ldmadmin pqactcheck -p pathname

Limit on the Number of Open Output-Files

The pqact utility has a limit on the number of open output-files. An output-file is opened for every unique instance of the following actions:

FILE
STDIOFILE
DBFILE
PIPE
when taken together with their action-arguments after string substitution. Thus, for example, the (somewhat contrived) entry
ANY	(.*)	FILE	\1
would cause an output-file to be opened for every unique data-product identifier that the pqact process encountered!

The limit on the number of open output-files is equal to the output of the command

getconf OPEN_MAX | awk '{print $1-3}'

This limit has ramifications for decoding data. When the pqact program wants to open a new output-file after having reached its limit, it first closes the least recently used output-file. If the action associated with that output-file is PIPE, then the decoder process reading from the pipe will encounter an end-of-file condition and will terminate. As a consequence, decoders must be written so that they can "start up" in the middle of decoding a product. If this is not possible, then more than one pqact(1) utility can be started (each with its own configuration-file, of course) and the work shared between them so that no one pqact(1) utility reaches the limit.
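
The least-recently-used behavior described above can be modeled with an ordered dictionary. The sketch below is a simplified illustration, not pqact's actual data structure; OutputCache, open_output_limit, and the open/close callbacks are hypothetical names. The limit is computed as in the getconf command above, assuming that os.sysconf("SC_OPEN_MAX") reports the same value as getconf OPEN_MAX.

```python
import os
from collections import OrderedDict


def open_output_limit():
    """Equivalent of: getconf OPEN_MAX | awk '{print $1-3}' (POSIX only)."""
    return os.sysconf("SC_OPEN_MAX") - 3


class OutputCache:
    """Keeps at most `limit` outputs open, closing the least recently used."""

    def __init__(self, limit, open_fn, close_fn):
        self._limit = limit
        self._open = open_fn
        self._close = close_fn
        self._entries = OrderedDict()

    def get(self, key):
        # key stands for the action plus its arguments after substitution.
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        if len(self._entries) >= self._limit:
            # Evict the least recently used output. For a PIPE action,
            # this is the moment the decoder sees end-of-file.
            _, oldest = self._entries.popitem(last=False)
            self._close(oldest)
        handle = self._open(key)
        self._entries[key] = handle
        return handle
```

With a limit of two, requesting outputs a, b, a, and then c closes b, because a was used more recently.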