BUFR IOSP Design

BUFR Data

BUFR data is mostly point or station data. The NWS uses BUFR as its internal format, so many datasets have been in BUFR at some stage before reaching their final format. Many other organizations use BUFR as the final format when converting an ad hoc data scheme for the data centers. My aim was to write a generic BUFR IOSP that handles these different variations of BUFR datasets.

BUFR Format

In a nutshell, BUFR is an unrestricted data packing scheme in which the packing criteria are included with the data. Two identical datasets can therefore be packed differently, depending on the internal packing information. This free-for-all approach makes BUFR the most flexible packing scheme, but also the most complex one to write a generic decoder for.

The variables are table entries that consist of:

F; X; Y; SCALE; REFERENCE; WIDTH; UNITS; DESCRIPTION

One example table: B3M-000-012-B
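The table entry fields listed above can be pictured as a small value class. The following is only a sketch: the class and method names (TableBDescriptorSketch, TableBEntry, decode, isMissing) are hypothetical and are not taken from the decoder described here, and the sample values in main are illustrative rather than read from a table file. It assumes the usual BUFR conventions that a packed integer is converted with (packed + REFERENCE) / 10^SCALE and that a value with all WIDTH bits set means "missing".

```java
import java.util.Locale;

public class TableBDescriptorSketch {

    /** One table entry, with fields named after the columns listed above. */
    static final class TableBEntry {
        final int f, x, y, scale, reference, width;
        final String units, description;

        TableBEntry(int f, int x, int y, int scale, int reference, int width,
                    String units, String description) {
            this.f = f;
            this.x = x;
            this.y = y;
            this.scale = scale;
            this.reference = reference;
            this.width = width;
            this.units = units;
            this.description = description;
        }

        /** Formats the descriptor id as F-XX-YYY, for example 0-12-001. */
        String id() {
            return String.format(Locale.ROOT, "%d-%02d-%03d", f, x, y);
        }

        /** True when all WIDTH bits are set (assumed "missing" convention). */
        boolean isMissing(long packed) {
            return packed == (1L << width) - 1;
        }

        /** Converts a packed integer to a value: (packed + REFERENCE) / 10^SCALE. */
        double decode(long packed) {
            return (packed + reference) / Math.pow(10, scale);
        }
    }

    public static void main(String[] args) {
        // Illustrative entry only; real entries come from the table files.
        TableBEntry temperature =
                new TableBEntry(0, 12, 1, 1, 0, 12, "K", "Temperature");
        System.out.println(temperature.id() + " -> "
                + temperature.decode(2931) + " " + temperature.units);
    }
}
```

Keeping SCALE, REFERENCE, and WIDTH on the entry means the bit reader only has to hand over raw integers; the conversion to physical units stays in one place.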

In another packing variation, instead of storing the observations in sequential order, the values of each field are grouped together across all the observations. To get the 5th observation, the whole record has to be unpacked and the 5th value of each field extracted. The number of bits used to pack a value can also change in the middle of decoding, e.g. from 11 to 6 bits. There are further options for changing the data packing scheme; this flexibility is why BUFR is such a widely used packing scheme.
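To make the field-grouped layout concrete, here is a minimal sketch of unpacking a record in which all values of one field are stored before the next field begins, with a different bit width per field. Everything in it (FieldMajorUnpackSketch, BitReader, unpackFieldMajor, observation) is a hypothetical name, and the layout is deliberately simplified: it is not the actual BUFR bit layout, which carries additional per-field packing information that is omitted here.

```java
import java.util.Arrays;

public class FieldMajorUnpackSketch {

    /** Reads unsigned bit fields, most significant bit first, from a byte array. */
    static final class BitReader {
        private final byte[] data;
        private long bitPos = 0;

        BitReader(byte[] data) {
            this.data = data;
        }

        /** Returns the next width bits (1 to 32) as an unsigned value. */
        long read(int width) {
            long value = 0;
            for (int i = 0; i < width; i++) {
                int b = data[(int) (bitPos >> 3)] & 0xFF;
                int bit = (b >> (7 - (int) (bitPos & 7))) & 1;
                value = (value << 1) | bit;
                bitPos++;
            }
            return value;
        }
    }

    /**
     * Unpacks a record laid out field by field: all values of field 0,
     * then all values of field 1, and so on. widths[f] is the bit width
     * of field f, so the width changes as decoding moves between fields.
     */
    static long[][] unpackFieldMajor(byte[] record, int[] widths, int nObs) {
        BitReader in = new BitReader(record);
        long[][] byField = new long[widths.length][nObs];
        for (int f = 0; f < widths.length; f++) {
            for (int o = 0; o < nObs; o++) {
                byField[f][o] = in.read(widths[f]);
            }
        }
        return byField;
    }

    /** Gathers one observation (0-based index) by taking that slot from every field. */
    static long[] observation(long[][] byField, int obsIndex) {
        long[] obs = new long[byField.length];
        for (int f = 0; f < byField.length; f++) {
            obs[f] = byField[f][obsIndex];
        }
        return obs;
    }

    public static void main(String[] args) {
        // Two fields, two observations: field 0 is 4 bits wide, field 1 is 6 bits.
        // Bits: 0001 0010 | 000011 000100, zero-padded to three bytes.
        byte[] record = {(byte) 0x12, (byte) 0x0C, (byte) 0x40};
        long[][] byField = unpackFieldMajor(record, new int[] {4, 6}, 2);
        System.out.println(Arrays.toString(observation(byField, 1))); // prints [2, 4]
    }
}
```

Note that observation cannot be called until unpackFieldMajor has walked the whole record, which is the cost of this layout: there is no way to seek directly to a single observation.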

Because of this diversity in packing schemes, most NWS departments have a separate decoder for their own data. Some datasets are complex because some fields are nested structures, which in turn require nested netCDF object structures. The bottom line: BUFR data is hard to handle generically, but that doesn't mean a BUFR IOSP cannot be written.

Creating a BUFR reader

You need a data reader to support your IOSPs. For complex formats, it's worth your time to build a standalone package for testing your datasets, rather than trying to test the data values inside the IOSP. For BUFR, there is a standalone Java decoder that has 4 main routines:

Sample command line calls

The isValidFile, BufrIndexer, and BufrGetData routines form the API between the BUFR IOSP and the BUFR reader.

BUFR IOSP

The IOSPs are called when nj22 wants to open a file. The library determines the file type by calling the isValidFile routine, which every IOSP implements. If the check returns true, that IOSP is called to open the file, and the result is a netCDF object. When the user requests data, the IOSP's data routines are called to return it.
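The dispatch described above can be sketched as a loop over registered providers. This is a simplified illustration, not the real nj22 interface: the Iosp interface, the provider classes, and the open helper below are hypothetical, a String stands in for the netCDF object, and the validity check assumes the file starts directly with the ASCII marker "BUFR" (a real reader may need to scan past leading headers first).

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;

public class IospDispatchSketch {

    /** Hypothetical, simplified provider contract; not the real CDM interface. */
    interface Iosp {
        boolean isValidFile(byte[] header);

        String open(byte[] header); // the String stands in for a netCDF object
    }

    /** True when header begins with the given ASCII marker. */
    static boolean startsWith(byte[] header, String marker) {
        byte[] m = marker.getBytes(StandardCharsets.US_ASCII);
        if (header.length < m.length) {
            return false;
        }
        for (int i = 0; i < m.length; i++) {
            if (header[i] != m[i]) {
                return false;
            }
        }
        return true;
    }

    /** Claims files that begin with "BUFR". */
    static final class BufrIosp implements Iosp {
        @Override
        public boolean isValidFile(byte[] header) {
            return startsWith(header, "BUFR");
        }

        @Override
        public String open(byte[] header) {
            return "BUFR dataset";
        }
    }

    /** A second hypothetical provider, included only to show the selection step. */
    static final class GribIosp implements Iosp {
        @Override
        public boolean isValidFile(byte[] header) {
            return startsWith(header, "GRIB");
        }

        @Override
        public String open(byte[] header) {
            return "GRIB dataset";
        }
    }

    /** Asks each provider in turn; the first one that claims the file opens it. */
    static Optional<String> open(List<Iosp> registered, byte[] header) {
        for (Iosp iosp : registered) {
            if (iosp.isValidFile(header)) {
                return Optional.of(iosp.open(header));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Iosp> registered = List.of(new GribIosp(), new BufrIosp());
        byte[] header = "BUFR\0\0\0".getBytes(StandardCharsets.US_ASCII);
        System.out.println(open(registered, header).orElse("no IOSP claimed the file"));
    }
}
```

Because selection depends only on isValidFile, that check should be cheap and strict: a provider that accepts files too eagerly can shadow the ones registered after it.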

Let's look at the Java code BufrIosp.java

Code to make a netCDF object Index2NC.java

ToolsUI webstart

BUFR netCDFHeader

Conclusion

Creating IOSPs makes the CDM library flexible enough to take on new types of data. If a data provider creates a standalone library for their dataset that exposes the isValidFile, open, and readData APIs, then the IOSP should not be hard to create.