Re: [netcdf-java] GRIB variable name changes in 4.3

To: Don.Murray@xxxxxxxx
Subject: Re: [netcdf-java] GRIB variable name changes in 4.3
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Mon, 27 Feb 2012 16:51:28 -0700

Hi Don:

On 2/27/2012 3:43 PM, Don Murray wrote:

Hi John and Ethan-
As I have discussed with you at length privately, I am not in favor of this change. This will break every IDV bundle that points to GRIB data in a local file or on a TDS server. This will also affect users of the TDS on the NCDC NOMADS servers who access data either through scripts or the IDV. It's not a simple matter of users just picking new names and resaving the bundles when the bundles are stored on remote servers or used in a classroom setting.

I realize it's a deep problem for the IDV, but it's also an opportunity to figure out how to gracefully evolve bundles when things change, which they do.

Below, for the benefit of the list, are my arguments for using the human readable variable names in the previous netCDF-Java 4.3 beta release:
<quote>
I believe keeping the human readable variable names (as in the previous 4.3 release - with slight modifications) is much preferable and backward compatible. I understand your reasons for wanting to change, but while that makes the programmer's life easier, it makes the user's (and other programmers') life harder.

In the long-term, if we get the fundamentals right, everyone's life gets easier.

For example, from a user perspective, with your changes, I'm going to have to modify 50 or more bundles that are on my local machines (including the NOAA viz wall) or stored on RAMADDA servers which will take several days. I'm also going to have to modify the customizations to my IDV parameter tables that I've made over the past 7 years.
From a programmer's perspective, here are the impacts of your changes to the IDV:
 - bundles which use the variable name for lookup
 - data aliases used for derived quantities
 - parameter aliases used for automatically assigning color tables, contour intervals and units
 - User guide and workshop documentation and examples will need to be updated
For the past 7 or so years, IDV users have been able to access realtime GRIB datasets and have had stability in using and interchanging those datasets. For example, I have a bundle:
http://motherlode.ucar.edu/repository/entry/get/GFS%2080%20km.xidv?entryid=9f77ca66-2264-4f8b-a460-e02fb42606ea
which has displays of 500 hPa geopotential heights, sea level pressure and precipitation from the GFS 80km data. These are simple, commonly used parameters. The IDV has a DataAlias table that equates the variable name Geopotential_height with a canonical name of HGT which is used to present derived quantities to the user of thickness and geostrophic wind. It also uses this name to assign a color table, unit and contour levels for any display created for the variable Geopotential height. Same idea goes for Pressure_reduced_to_MSL and Total_precipitation. It doesn't matter whether I go to the GFS 80 km (grib1) or the GFS .5 degree global (grib2), or even a NAM 80km dataset. I can apply the bundle and use the same information to get the same type of display.
Under the scheme in the previous version of 4.3beta, Geopotential_height will change to Geopotential_height_Pressure, Pressure_reduced_to_MSL will change to Pressure_reduced_to_MSL_Msl and Total_precipitation will change to one of:
Total_precipitation_Surface_12_Hour_Accumulation
Total_precipitation_Surface_1_Hour_Accumulation
Total_precipitation_Surface_3_Hour_Accumulation
Total_precipitation_Surface_6_Hour_Accumulation
Total_precipitation_Surface_Mixed_intervals_Accumulation
From the IDV perspective, the DataAlias and ParameterDefaults use patterns and are case insensitive, so this should not be a problem because the old names would match into the new names. For the bundles, this will be a problem, but one that can be dealt with on the IDV or netCDF-Java side with a parameter lookup as discussed at the recent IDV Developers teleconference and which is outlined from the IDV perspective here:
https://mcidasv.ssec.wisc.edu/issues/11
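
The pattern matching described above can be sketched roughly as follows. This is a minimal illustration in Python, not the IDV's actual code: the ALIASES table and the canonical_name helper are hypothetical, only the HGT canonical name comes from this message, and the sketch assumes each alias is a case-insensitive pattern matched at the start of the variable name.

```python
import re

# Hypothetical alias table: old-style name pattern -> canonical name.
# Only "HGT" appears in the message; "PMSL" and "PRECIP" are made up.
ALIASES = {
    "Geopotential_height": "HGT",
    "Pressure_reduced_to_MSL": "PMSL",
    "Total_precipitation": "PRECIP",
}


def canonical_name(variable_name):
    """Return the canonical name of the first matching alias, or None."""
    for pattern, canonical in ALIASES.items():
        # re.match anchors at the start, so an old name used as a pattern
        # still matches a new name that only appends a suffix.
        if re.match(pattern, variable_name, flags=re.IGNORECASE):
            return canonical
    return None


for name in (
    "Geopotential_height",                              # 4.2-style name
    "Geopotential_height_Pressure",                     # name with level suffix
    "Total_precipitation_Surface_6_Hour_Accumulation",  # name with interval
    "VAR_0-3-5_L100",                                   # illustrative VAR_ name
):
    print(name, "->", canonical_name(name))
```

Under these assumptions the suffixed names still resolve to the same canonical name, while a VAR_-style name matches none of the old patterns and would need a separate lookup.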

With the new naming:

VAR_%d-%d-%d[_error][_L%d][_layer][_I%s_S%d][_D%d][_Prob_%s]
The three variables would have different names depending on whether they came from a grib1 or grib2 dataset. This would require the Unidata IDV programmers to redo all the alias and parameter default tables and require a more complicated lookup just to find the 500 hPa geopotential height, sea level pressure and total_precipitation field depending on the dataset used. I think providing consistency between grib1 and grib2 datasets at the very least is an important consideration - in the end, it's all GRIB. GEMPAK and McIDAS (as well as wgrib2 and NCL) create the same names for their variables independent of whether they came from Grib1 or 2.
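
To make the template above concrete, here is a rough Python sketch of how such a name could be assembled from its optional parts. It is not the netCDF-Java implementation: the function and parameter names are hypothetical, and the meaning assigned to each field (discipline-category-parameter, level type, interval, statistic type, derived type, probability name) is an assumed reading of the template.

```python
def grib2_variable_name(discipline, category, parameter, *,
                        error=False, level_type=None, layer=False,
                        interval=None, stat_type=None,
                        derived_type=None, prob_name=None):
    """Assemble VAR_%d-%d-%d[_error][_L%d][_layer][_I%s_S%d][_D%d][_Prob_%s].

    Field meanings are assumptions made for this sketch.
    """
    parts = ["VAR_%d-%d-%d" % (discipline, category, parameter)]
    if error:
        parts.append("_error")
    if level_type is not None:
        parts.append("_L%d" % level_type)
    if layer:
        parts.append("_layer")
    # The interval and statistic type appear together in the template.
    if interval is not None and stat_type is not None:
        parts.append("_I%s_S%d" % (interval, stat_type))
    if derived_type is not None:
        parts.append("_D%d" % derived_type)
    if prob_name is not None:
        parts.append("_Prob_%s" % prob_name)
    return "".join(parts)


# Builds a name with a level type, an interval and a statistic type.
print(grib2_variable_name(0, 0, 0, level_type=1,
                          interval="6_Hour", stat_type=194))
```

Because every part is derived from numeric codes in the message rather than from a lookup table, the resulting name stays fixed even when the human readable table entries change.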

There is simply no way to maintain grib1 and grib2 name compatibility, because 
of the table-driven nature of GRIB, and the fact that they use different tables.

Again, along with the problem, it's also an opportunity to rethink how the 
aliases and color tables etc are done. It's possible I can add other attributes 
that will make this easier.

I do apologize for this fiasco. I've just spent most of the last 4-6 months 
trying to dig our way out of this hole.

I fully support the notion of adding in the level information to the variable name as is the case for Geopotential_height. I know that variables like Temperature in the 4.2 scheme can provide different results depending on whether your grib files had a mixture of 2D and 3D variables (Temperature = the one on pressure levels) or just 2D variables (Temperature = the one on height above ground level). I understand the problems it creates on both the netCDF-Java/TDS side and sometimes the IDV side (e.g. creating derived quantities) and think that this change can be handled pretty well on the IDV side.
I support adding the accumulation interval for parameters like Total_precipitation above because now some variables have a mixture of the different types of intervals.
One of your arguments is that over time, names change and it's difficult to maintain tables. While that may be true for lesser variables, I would suggest that the most commonly used variable names rarely change (Temperature, geopotential height, relative humidity, u and v wind components, etc). Unidata has always been in the business of maintaining tables and that's part of the job it does to support the user community. While it's not easy, it is a necessary function of the services that Unidata provides. And, changing the names just pushes the work off to others at Unidata. Perhaps Unidata could look at having common tables used by all its software for consistency. Or perhaps Unidata could work with the NCL group and use their lookup tables?


We can't maintain tables for all centers. We could try to do so for
just NCEP, but it's probably not the right thing to do. It sucks
resources that we don't have. It makes NCEP GRIB files different from
non-NCEP GRIB files. Really, we have to rethink this, not hack in
lookup tables that will never be 100% right.

NCL has adopted a similar variable naming scheme for similar reasons.

In the end, I would like to see the netCDF-Java library evolve to suit the needs of the data providers, while also maintaining as much backward compatibility as possible for the end users and software developers who rely on it. I think a lot of the ancillary information can be provided through variable attributes as it is in 4.2 (description, table number, Discipline/Category/Parameter, GRIB GDS/PDS information) as NCL does, but leave human readable variable names.
</quote>
Outside the IDV, I have been using the netCDF-Java library in conjunction with PyNIO to convert grib2 data to netCDF. I use the human-readable netCDF-Java 4.2 variable names on my output files instead of the PyNIO names because I believe that the users of my output would prefer to see those than something like VAR_0-0-0_L1_I6_Hour_S194.

A very nice (but not unchanging) human readable string is in the long_name. I understand it's a pain to change to using that, but once you make that change, I think your objections above should be resolved. The trick will be to have both the long_name and the (unchanging) variable name.
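
As a rough sketch of that idea, the variable name can serve as the stable key while the long_name is used only for display and searching. The example below is plain Python with a hypothetical in-memory table standing in for an opened dataset; the variable names, long_name strings and helper functions are illustrative assumptions, not output of the library.

```python
# Hypothetical stand-in for an opened dataset: variable name -> attributes.
# The names and long_name strings are illustrative only.
VARIABLES = {
    "VAR_0-3-5_L100": {"long_name": "Geopotential height @ Isobaric surface"},
    "VAR_0-3-1_L101": {"long_name": "Pressure reduced to MSL @ Mean sea level"},
}


def display_label(name):
    """Show the long_name to the user, falling back to the variable name."""
    return VARIABLES[name].get("long_name", name)


def find_by_long_name(text):
    """Return the stable variable names whose long_name contains text."""
    needle = text.lower()
    return sorted(
        name for name, attrs in VARIABLES.items()
        if needle in attrs.get("long_name", "").lower()
    )


print(display_label("VAR_0-3-5_L100"))
print(find_by_long_name("geopotential"))
```

A bundle or alias table would then store the variable name, and only the labels shown to the user would change if the long_name text is revised.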


I'll be glad to work with the IDV team to help wherever I can.

Once again, I apologize for this trouble.

John
