Re: [bufrtables] Summary report on the suitability of GRIB/BUFR for archiving data

Hi Jeff:

Some comments are embedded.

On 3/29/2011 6:50 AM, Jeff Ator wrote:
Hello Everyone,

To expand on some previous points...

   * The WMO does maintain machine-readable versions of the tables, for
     both BUFR and GRIB, at
     http://www.wmo.int/pages/prog/www/WMOCodes/TDCFtables.html  Thanks
     to Paolo for pointing this out in his earlier email.  Different
     versions of the tables can be downloaded from this site.  For BUFR
     at least, table version 13 is a superset of all previous versions,
     so it can be used to decode BUFR messages from all previous
     versions.  This is no longer true beginning with version 14, where
     it's possible that some deprecated items have now been removed or
     that some descriptor characteristics have been modified from
     previous versions of the table.  So for BUFR, decoding centers
     should maintain copies of tables 13, 14 and onward in whatever
     format(s) are required by their local processing software.

It is important that the WMO maintain canonical copies of each version going forward. Also, one should realize that these tables may contain errors, which get corrected or clarified as they are noticed. Eventually we will have tables without errors, but then new entries are added and the process starts again. So both encoding and decoding centers will need to keep up with the latest canonical tables. One can't download them once and keep using them for 10 years.
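
To make the versioning rule Jeff describes concrete, here is a minimal sketch of how a decoder might pick which locally stored table to load. It assumes only what is stated above: table 13 covers every earlier version, and from 14 onward the exact version is needed. The function name and the set of locally available versions are hypothetical, not part of any existing library.

```python
SUPERSET_VERSION = 13  # per the note above: table 13 covers all earlier versions


def table_version_for(message_version, available_versions):
    """Return the local table version to use for decoding one message.

    message_version: master table version number read from the message.
    available_versions: set of table versions stored locally (hypothetical).
    """
    if message_version <= SUPERSET_VERSION:
        wanted = SUPERSET_VERSION
    else:
        # From version 14 on, entries may have been removed or changed,
        # so only the exact version is safe to use.
        wanted = message_version
    if wanted not in available_versions:
        raise LookupError("no local copy of table version %d" % wanted)
    return wanted


print(table_version_for(9, {13, 14, 15}))   # 13
print(table_version_for(14, {13, 14, 15}))  # 14
```

Raising an error when the exact table is missing, rather than falling back to the nearest one, is deliberate here: a silent fallback is how a message ends up decoded with the wrong table.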


   * John rightly points out that the proper use of this table version
     number by message originators would eliminate the problem outlined
     in his paper.  In my experience, this problem stems mostly from a
     casual attitude by originators in ensuring they've used the proper
     version number in their messages.  Many originators use software
     where this number is often hardcoded and so becomes, at best, an
     afterthought.  This is an education issue that WMO is working hard
     to address among its members.

Agreed.

   * There's also a concerted effort among members to develop BUFR
     templates for certain types of commonly-reported data such as
     SYNOP, BUOY, TEMP/PILOT, CLIMAT, etc.  This is a by-product of
     WMO's ongoing migration from these old alphanumeric fixed-field
     formats to BUFR.  The list of templates is available at
http://www.wmo.int/pages/prog/www/WMOCodes/TemplateExamples.html#Regulations
     , and while their use isn't mandatory, it does make things a lot
     simpler for downstream codes which have to interpret the decoded
     output, and which is another point that John made in his paper.

This is helpful.

One doesn't know for sure that a BUFR message is using a template, I think, without actually comparing the message to the template?
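
As a rough sketch of what such a comparison could look like: take the unexpanded descriptor list from the message and check it against each known template's descriptor list. Everything here is an assumption for illustration. The function name is hypothetical, the template dictionary would have to be built by hand from the WMO template list, and the descriptor value in the usage lines is a placeholder, not a verified template entry.

```python
def matching_templates(message_descriptors, templates):
    """Return the names of templates whose descriptor list equals the message's.

    message_descriptors: unexpanded descriptors from the message, as strings.
    templates: dict mapping a template name to its descriptor list
               (hypothetical; built by hand from the published templates).
    """
    wanted = list(message_descriptors)
    return [name for name, descriptors in templates.items()
            if list(descriptors) == wanted]


# Placeholder values, for illustration only.
known = {"EXAMPLE_TEMPLATE": ["307080"]}
print(matching_templates(["307080"], known))            # ['EXAMPLE_TEMPLATE']
print(matching_templates(["001001", "001002"], known))  # []
```

This only catches messages that reference a template exactly as listed. A message that spells out the same elements one by one would need both sides fully expanded before comparing.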

   * If anyone has a requirement for a new BUFR or GRIB2 descriptor
     which they feel would be reasonable to propose as a new official
     WMO descriptor (vs. just using their own local descriptor number),
     please let me know.  I represent the U.S. to the WMO codes group
     which reviews and approves these types of requests.  Depending on
     the nature of the request, there are fast-track procedures
     available which can lead to formal approval within a matter of 2-3
     months.
   * As originally envisioned, BUFR and GRIB2 weren't designed to be
     formats for archive storage, but rather for efficient real-time
     exchange of meteorological data.  Nevertheless, this doesn't mean
     they can't be used as archive formats.  We do this here at NCEP,
     and the approach we use involves storing a copy of the applicable
     table with each archived dataset.  Note that, for BUFR at least,
     table information can be encoded into BUFR messages using
     descriptors from Class 0 of Table B.  When this is done, the
     necessary table information can be easily retained alongside the
     data in a very compact and efficient manner, using one or two
     additional BUFR messages at the head of each archived file.  Such
     an approach could even be used when exchanging real-time data sets
     between centers, at the cost of one or two additional BUFR
     messages.  This would eliminate the problem of receiving centers
     having to "guess" whether the table version number in each data
     message was encoded properly.

We aren't yet fully handling tables stored this way in BUFR messages, but it certainly solves the problem, as long as those messages are stored in the same file as the messages that they refer to. Assuming this works, I would modify my conclusion that "BUFR/GRIB is not suitable as long-term storage formats" to add "unless the tables are stored in the file, etc".
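
A reader could sanity-check the "tables in the same file" condition before trusting an archive. The sketch below counts how many table messages lead a file, given the data category of each message in file order. It assumes table messages are marked with data category 11 (BUFR Table A, "BUFR tables, complete replacement or update"); whether a given center encodes them that way is an assumption to confirm, and the function name is hypothetical.

```python
TABLE_CATEGORY = 11  # assumed marker for messages that carry table information


def leading_table_messages(data_categories):
    """Count the table messages at the head of a file.

    data_categories: data category of each message, in file order.
    """
    count = 0
    for category in data_categories:
        if category != TABLE_CATEGORY:
            break
        count += 1
    return count


print(leading_table_messages([11, 11, 2, 2]))  # 2
print(leading_table_messages([2, 2, 2]))       # 0
```

A result of zero would mean the file depends on tables held somewhere else, which is exactly the case that worries me for long-term storage.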

   * In my opinion, when everyone follows the rules (e.g. using
     official descriptors with proper table version numbers), the
     process works very well.  The trick of course is to get everyone
     (and their software) to pay attention to the rules.  But this is
     true of any format and is not unique to BUFR and GRIB.

Of course, errors are possible no matter what, but this particular problem is specific to table-driven formats like GRIB and especially BUFR. It doesn't occur, for example, in netCDF when using CF conventions. However, it's true that one ultimately refers to human-readable documents for semantics.


With best regards,
-Jeff

Thanks for your thoughts!

Regards,
John