Black hole BUFR blues

For some reason I've decided to jump back into the black hole of BUFR tables.

It started when someone sent me a BUFR message from EUMETNET OPERA which uses a center of 255 (missing). I've seen other groups use 255, including Brazil (INPE) and a US group whose name I forget. Anyway, the 255 means there's no way to know the correct local table for that message. EUMETNET documentation makes it clear this was a conscious choice, not a mistake (thanks to Ernst de Vreede at KNMI for sending this along):
3.1   Value of originating center
In BUFR language, every «originating center» (in practice every NMS and some international bodies, e.g. ECMWF, are originating centers) has the right to define its own local set of descriptors.  A given number identifies such an originating center. OPERA is not recognized as such a center, and so has not his assigned number.

This is a problem if OPERA needs to define radar specific descriptors.  It is not possible to ask each OPERA member to add the OPERA descriptors to their local tables, as this leads to conflicts. For instance descriptor 3 21 192 is in OPERA a 4 bit reflectivity image, and in France it is used for calibration results.

The solution devised so far by OPERA was to piggyback on the missing value for the originating center field in the message. This value is 255 or 65535 depending on the BUFR edition used (see 3.3). Since summer 2008, WMO has allocated 247 as the official «originating center» for OPERA. It is not decided currently how OPERA members will handle the transition.
This reinforces my guess that BUFR providers' mindset is often about producing messages that their own software can consume. Apparently the problems of generic reading software are not always thought through.
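
To make the generic reader's problem concrete, here is a minimal Python sketch of the two checks involved. The function names are hypothetical and not taken from the CDM. The missing values (255 and 65535) come from the OPERA text quoted above; mapping them to editions 3 and 4 is my assumption, as is the rule that classes 48 to 63 and entries 192 to 255 are the ranges reserved for local use.

```python
# Missing value of the originating center field, per BUFR edition.
# Assumption: 255 applies to edition 3 and 65535 to edition 4.
MISSING_CENTER = {3: 255, 4: 65535}

def center_is_missing(edition, center):
    """True when the originating center field holds the 'missing' value."""
    return MISSING_CENTER.get(edition) == center

def is_local_descriptor(f, x, y):
    """True when an F-X-Y descriptor falls in a range reserved for local use."""
    if f not in (0, 3):  # only Table B (F=0) and Table D (F=3) entries are looked up
        return False
    return 48 <= x <= 63 or 192 <= y <= 255

def can_resolve(edition, center, descriptors):
    """False when the message needs a local table but does not say whose."""
    needs_local = any(is_local_descriptor(*d) for d in descriptors)
    return not (needs_local and center_is_missing(edition, center))
```

With these definitions, a message using descriptor 3 21 192 with a center of 65535 cannot be resolved from the message alone, while the same message with center 247 can, provided the reader has OPERA's table.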

I decided to review the state of BUFR table processing in the CDM. It's a mess. So I kept moving closer to the event horizon of the BUFR black hole and eventually was sucked in. If you relax, the crushing gravitational forces give you a massage that keeps the irritation at BUFR from being too overwhelming.

A number of facts emerge, like Hawking radiation, that I'm not sure the BUFR community fully appreciates. One is that the WMO tables that are used by various BUFR writing software packages have changed from version to version. I first noticed this 2 years ago, but aliens abducted me and removed my memory of it. So I kept looking for the correct, backwards compatible version. (no, not mel-bufr, no, not eumetsat, no, not ncep. hmmm)

The second fact is that centers routinely override standard WMO entries, even though these are marked "reserved". My guess is that this is done mostly when adding provisional entries, but in some cases it looks like a flagrant trespassing onto the WMO "keep off the lawn" lawn.

The consequence is that if you look at a BUFR message in the wild (that is, you don't know anything more about it than what's coded into the message itself), then you never know for sure what table was used, even when all the entries are in the WMO part of the table.

Another problem is that the table version encoded in the message is sometimes incorrect. Typically one sees a message that uses an entry from a newer version of the WMO table, but whose producer for some reason didn't update the version number in the message. When I first saw this, thinking that WMO tables were backwards compatible, I thought I could just use the most recent WMO table. But if there are differences (and apparently there are) then you need to know the correct version.
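
One way to at least detect a stale version number is to check every descriptor in the message against your copy of the declared table version. The sketch below assumes you trust your own copies of the tables, which, as this post argues, is a big assumption. The function names are hypothetical and the table contents are made-up placeholders, not real WMO entries.

```python
def oldest_compatible_version(descriptors, tables):
    """Return the lowest version whose table contains all descriptors, or None."""
    wanted = set(descriptors)
    for version in sorted(tables):
        if wanted <= tables[version]:
            return version
    return None

def check_declared_version(declared, descriptors, tables):
    """Return a short diagnosis for a message's declared table version."""
    known = tables.get(declared)
    if known is None:
        return "unknown table version"
    missing = sorted(set(descriptors) - known)
    if not missing:
        return "ok"
    candidate = oldest_compatible_version(descriptors, tables)
    if candidate is None:
        return f"{missing} not found in any known version"
    return f"{missing} not in version {declared}; all present in version {candidate}"

# Placeholder tables: version -> set of (F, X, Y) descriptors. Not real WMO content.
TABLES = {
    12: {(0, 1, 1), (0, 1, 2)},
    13: {(0, 1, 1), (0, 1, 2), (0, 1, 3)},
}
```

Note that a result of "ok" only says the descriptors exist in the declared version. It cannot tell you whether an entry's definition differs between versions, which is the harder problem.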

When the WMO community moved to version 14, there was an explicit acknowledgment that there would be incompatible changes from version 13. But generally it was thought that up until version 13, tables were compatible. However there are small discrepancies in the tables as they were released. Perhaps typos? This problem was seriously compounded by releasing the tables only in Word and PDF, neither of which is machine readable. So humans around the world have been retyping or otherwise painfully extracting the table entries into some local version of the WMO tables that could be read by their programs. So that's been a mess.
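
Once two transcriptions of a table are in machine-readable form, discrepancies like these are at least easy to find. A minimal sketch, assuming each table has been loaded into a dict keyed by descriptor, with a (name, units, scale, reference value, bit width) tuple as the value. That layout is my own choice for illustration, and the entries are invented.

```python
def diff_tables(old, new):
    """Compare two Table B dicts; return (added, removed, changed) descriptor lists."""
    added = sorted(k for k in new if k not in old)
    removed = sorted(k for k in old if k not in new)
    changed = sorted(k for k in old if k in new and old[k] != new[k])
    return added, removed, changed

# Invented entries: (name, units, scale, reference value, bit width)
table_a = {
    (0, 1, 1): ("BLOCK", "Numeric", 0, 0, 7),
    (0, 1, 2): ("STATION", "Numeric", 0, 0, 10),
}
table_b = {
    (0, 1, 1): ("BLOCK", "Numeric", 0, 0, 7),
    (0, 1, 2): ("STATION", "Numeric", 0, 0, 12),   # bit width differs
    (0, 1, 3): ("REGION", "Code table", 0, 0, 3),  # new entry
}

added, removed, changed = diff_tables(table_a, table_b)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Anything that shows up as removed or changed is a potential backwards incompatibility. A changed bit width is the nastiest case, since it shifts every field that follows it in the message.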

As of version 14, and thanks to the work of Atsushi Shimazaki (and probably others that I'm not aware of) the WMO is now releasing machine readable tables (XML, CSV, txt). So at least that problem is going away.

Given that there are lots of possibly incompatible versions of WMO tables being used in different software stacks, what did the data producer actually use? One is reduced to finding the actual person in charge and asking them. That doesn't scale, and sometimes they don't know. But of more concern is that in 10 or 50 years, those people will be gone, and the actual knowledge will be lost. So good luck, future data miners, reliably reading archived BUFR data.

Jeff Ator at NCEP has correctly pointed out that one solution is to include the BUFR tables in the archive file that stores the BUFR data messages. One can encode BUFR tables as a BUFR message, by using a particular section of the, you guessed it, standard WMO BUFR tables. So as long as those encodings are done correctly, and when they change are correctly versioned, and all the software uses a correctly transcribed machine-readable copy, and no one overrides with a local table of how to encode BUFR tables, then our future data miners should have a good shot at correctly reading the data.

So that's best practice, fellow data buriers, I mean archivers. Do it, or risk the scorn of your grandchildren.


I think it is a good idea to include BUFR tables in archive.

One question. How can we assure right version of complete table is provided? Relying on data producer sounds a bit fragile for me, when we can't trust table version encoded in the BUFR messages.

I think this suggests a need of BUFR validator that checks all descriptors exists in the table.


Posted by TOYODA Eizi on September 19, 2011 at 10:57 AM MDT #

Good to know about "Black hole BUFR blues", valuable info..

Posted by James on November 24, 2011 at 06:42 AM MST #

Unidata Developer's Blog
A weblog about software development by Unidata developers*