John-
Thanks for taking the time to generate these lists. Comments are below:
On 2/15/12 11:14 AM, John Caron wrote:
Bundles referencing motherlode could only do so using the "latest" resolver, and I'm not sure how extensive that is, but we could look at the motherlode logs. Seems like local GRIB files are where the pain will be.
Jim Steenburg and I (and perhaps others) use the Best Time Series for several datasets, and I think that as the IDV evolves its time matching capabilities, the Best Time Series will be used more. For the Best Time Series, we use index offsets to get the latest data. Dave Dempsey also uses the forecast 0 data from the Constant Offset FMRC.
http://www.unidata.ucar.edu/blogs/developer/en/entry/indexed_data_access_and_coordinate
We can try to help the user choose new variable names, but the original bundle has to be able to change, or else it's going to be useless eventually. Might as well do the right thing now, and add some UI to help evolve the bundle.
There will need to be enough information stored in the variables so that if the names change, a request can still find the appropriate data. For example, if I have a bundle that is accessing temperature on a pressure level, what can I use to always get that variable in the dataset if the variable name changes?
Also, how will you handle the FMRC stuff where older files were indexed with the old names? Does 4.3 still read the old index files?
I'm not sure what you mean by the index files here. Do you mean gbx8? Or the cached xml files? In both cases, those are no longer used, and new index files (gbx9 and ncx) are created.
I think this is a moot point, so ignore it for now. It sounds like you are going to reindex everything for 4.3.
One thing that would help is to generate a list of 4.2 variable names with the corresponding 4.3 names for all the GRIB datasets on motherlode. That could be used by the IDV for the lookup table.
Unfortunately, the problem with using the human names is that they keep getting tweaked (because the tables keep getting tweaked) by WMO and especially NCEP. So they will just break again whenever that happens. I'm leaning towards an NCL-like variable name that will be much more stable (though not guaranteed if we discover we are doing things wrong). The implication is that an application will want to use the description when letting users choose from a list, and the variable name when talking to the API. I think IDV is already doing this?
The NCL-like syntax (still evolving) is:
VAR_%d-%d-%d[_error][_L%d][_layer][_I%s_S%d][_D%d][_Prob_%s]
  L = level type
  S = stat type
  D = derived type
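As a rough illustration of how a name could be assembled from this template, here is a minimal Python sketch. It is not the CDM implementation: the function and parameter names are hypothetical, the syntax is described above as still evolving, and it assumes that %d-%d-%d stands for discipline, category, and parameter and that the interval and stat type always appear together.

```python
from typing import Optional


def make_var_name(
    discipline: int,
    category: int,
    parameter: int,
    *,
    is_error: bool = False,
    level_type: Optional[int] = None,
    is_layer: bool = False,
    interval: Optional[str] = None,
    stat_type: Optional[int] = None,
    derived_type: Optional[int] = None,
    probability: Optional[str] = None,
) -> str:
    """Assemble a name following the bracketed template, left to right."""
    # Assumption: [_I%s_S%d] is a single optional group, so both parts are
    # required together.
    if (interval is None) != (stat_type is None):
        raise ValueError("interval and stat_type must be given together")

    parts = [f"VAR_{discipline}-{category}-{parameter}"]
    if is_error:
        parts.append("_error")
    if level_type is not None:
        parts.append(f"_L{level_type}")
    if is_layer:
        parts.append("_layer")
    if interval is not None:
        parts.append(f"_I{interval}_S{stat_type}")
    if derived_type is not None:
        parts.append(f"_D{derived_type}")
    if probability is not None:
        parts.append(f"_Prob_{probability}")
    return "".join(parts)
```

For example, `make_var_name(0, 0, 0, level_type=100)` gives `VAR_0-0-0_L100`. The form of the interval string (for instance `6_Hour`) is an assumption made for this sketch.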
I did a quick scan and am not keen on the NCL-like names. I would like to see the VAR part replaced with a string that describes the variable in some way, like NCL does. I've been working with a lot of NCEP folks who swear by wgrib2, which uses the names of the variables in the last column of Table 4.2 for each parameter. These are the names that NCL uses as well.
Yes. GRIB1 only has parameter number. We will also need the table version, possibly center/subcenter. So we will have a different syntax for GRIB1, which I haven't done yet.
I assume that the %d-%d-%d is the discipline, category and parameter info? For GRIB1, the discipline and category will not exist.
Will you still provide these as descriptive names in the attributes? The IDV uses them to categorize the variables in the Field Chooser. Could you send along a comparison of a couple of variables with the attributes listed so we can see how that has changed?
Yes, I can add those.
> So it's different from NCL in not using the time coordinate in the name.
You are using the time interval in the name for accumulations, so I'm not sure what you mean here.
I'm attaching two lists; both are maps from old names to new names. The first list uses a "human readable" name constructed from the latest GRIB tables; the second uses the NCL-like syntax. (Neither is complete or authoritative yet, and both are only for GRIB-2.) I've included the grid description on the second list, so the mappings make more sense.
One thing to note is that the mapping is dataset-dependent 15-20% of the time.
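One way an application might handle a mapping that is sometimes dataset-dependent is a two-level lookup: check a per-dataset table first, then fall back to a shared one. The Python sketch below is only an illustration under that assumption; the table contents, dataset identifiers, and function name are hypothetical and are not taken from the attached lists.

```python
from typing import Dict, Optional, Tuple

# Hypothetical entries for illustration only.
# Names that map the same way in every dataset:
SHARED_MAP: Dict[str, str] = {
    "old_temperature": "new_temperature",
}
# Names whose new form depends on the dataset they come from:
DATASET_MAP: Dict[Tuple[str, str], str] = {
    ("dataset_a", "old_precip"): "new_precip_6_hour",
    ("dataset_b", "old_precip"): "new_precip_3_hour",
}


def translate_name(old_name: str, dataset: Optional[str] = None) -> Optional[str]:
    """Return the new name for old_name, or None if no mapping is known."""
    if dataset is not None:
        specific = DATASET_MAP.get((dataset, old_name))
        if specific is not None:
            return specific
    # Fall back to the dataset-independent table.
    return SHARED_MAP.get(old_name)
```

Returning `None` for an unknown name leaves room for the application to ask the user to pick a replacement rather than failing outright.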
Could you give examples of where these differ? I'd like to understand if this is just in accumulation intervals or something more.
I will probably release the next CDM version using the NCL-syntax variable names in order to get feedback from the broader community.
Perhaps you could send this proposal with the examples out to the netCDF-Java list before sending out the code. People are already changing their code to test out the beta release and if it changes with the next beta, they'll have to do it again.
Thanks again.
Don
Thanks for your feedback.
John