Touché, Don (or should I say Dave Bowman).
This is a difficult issue for me to decide upon because, in a sense, we are
all right. Can we parse the long name into the short name for client
displays, etc.?
John, care to chime in?
On Wed, Feb 29, 2012 at 10:14 AM, Don Murray <don.murray@xxxxxxxx> wrote:
> Hi Glenn-
> Thanks for the response. What I hear you saying is that the underlying
> infrastructure that John is creating (i.e. the GribFeatureCollection) and
> the fixes to what's broken in the identification of the data (e.g. the
> break out of the variables on different accumulation times) will help you
> provide consistent results. I agree that these changes are necessary.
> However, I think the same thing can be achieved with the human readable
> variable names. There is no guarantee that the VAR_* names won't change in
> the future. As John discussed with me last week, if he finds a new PDS
> variable that he thinks is important, it could be added to the variable
> name and then we go through the pain again. That's no different than
> changing the human readable names. The lookup for creating consistent
> human readable names is already there to create the long name.
> Even with the human readable names, there will be pain for tool developers
> that access the data, because some names will change. It will require
> changes to the IDV, but at least they will be manageable. The permalinks in
> the Godiva WMS viewer that is part of the TDS will break because they use
> the variable name to get the data.
> I think the human readable names serve the end users better than the VAR_*
> names. For example, if I go to NOMADS now and go to a GRIB2 file and
> choose the OPeNDAP view, I get a list of variables that I can choose. Ex:
> The variables that are selectable are in bold letters and easy to read. I
> can quickly scroll through the page to find the variable I'm interested in.
> While the long_name is listed in lesser print, it doesn't stand out like
> the variable name does. In the new scheme, what will stand out on the page
> is lots of VAR_* names which all look similar. You could argue that no one
> uses this OPeNDAP interface, but I know that there are some who do.
> Or, if I go to the NetcdfSubsetService for a grib file on motherlode:
> I see human readable names. In the end, I don't see that the VAR_ names
> serve the end user.
> As someone on the IDV users list said, "Hal, who do you serve: machines or
> humans?" ;-)
> On 2/29/12 7:16 AM, Glenn Rutledge wrote:
>> Hi Don,
>> That is a very good question and I left that out in my response.
>> Long term access for users in archives means we constantly have to work
>> to fully document, understand, track down any data provenance issues,
>> and verify (to a lesser degree) the data. What it says it is, it
>> actually is. It's just a form of quality assurance for users. Data
>> providers - especially 'real time' ones don't necessarily concern
>> themselves with these issues. They make a product- and move on. I'll
>> bet you are fully aware that the WOC/Gateway does not even provide a
>> complete DTG in the file name for many NWP products! I used to work w/
>> John Stackpole (great guy)- the original developer of Grib. He made grib
>> as a compact communications protocol, not, as I'll also bet you are
>> aware, for archives.
>> NOMADS has about 1+ petabyte to manage for users. We serviced a growing
>> 550TB last year and we need to scale. By aggregating the data most used
>> by users (common state variables, most popular, etc.) we can allow
>> streaming of files/records that serves the 50K+ users and ~300 million
>> downloads per year on NOMADS much better, using methods such as
>> pre-staging/caching the most requested data on disk from tape, etc.
>> What John is attempting to do will facilitate the access for multiple
>> users, requesting multiple files using aggregations and other streaming
>> caching (I don't quite understand the details there). Right now, we can't even
>> ascertain with any degree of confidence what is what- in order to even
>> be able to aggregate- let alone feel comfortable about the accuracy of
>> the data we are serving to users.
>> It does not really help users find data, per se. It will help users
>> have more confidence that an aggregated monthly mean product from CFSR is
>> a mean for each cycle (0, 6, 12, ..) for individual days of the month (the
>> diurnals), rather than a typical monthly mean averaged over the entire day.
>> Hope that makes sense. I'm not sure what other impacts this will have
>> for us here - LAS? our TDS to ESGF capabilities? It's kinda scary, but
>> John's radical change looks to solve a major archive problem; I do know
>> that. We will run 4.2 and 4.3 in parallel for some time, I will tell you
>> that.
>> Best regards, Glenn
>> On Tue, Feb 28, 2012 at 2:19 PM, Don Murray <don.murray@xxxxxxxx> wrote:
>> Hi Glenn-
>> On 2/28/12 11:43 AM, Glenn Rutledge wrote:
>> John and Community-
>> While I do not represent the NCDC Archive, for the NCDC NOMADS
>> and our users, I must agree that the changes John is proposing will
>> facilitate the long term use of grib data. While painful to
>> client (software | decoders), the proposed change will allow our
>> users (in a more scalable way) to better find and use our data. I'll
>> suggest that if this is adopted, NOMADS servers could provide both
>> 4.2 and 4.3 versions to give software developers and the client side
>> time to adapt.
>> Could you elaborate on how you see that the new variable names will
>> allow the users to better find and use your data versus the human
>> readable names? For example, if I want to get the 500 hPa heights
>> from a model in your archive, how will the new names facilitate that?
>> Don Murray
>> NOAA/ESRL/PSD and CIRES
>> 303-497-3596
>> Glenn K. Rutledge
>> Meteorologist/Physical Scientist
>> NOMADS Team Leader
>> National Climatic Data Center
>> Asheville, NC 28801
>> (828) 271-4097
>> nomads.ncdc.noaa.gov <http://nomads.ncdc.noaa.gov>
> Don Murray
> NOAA/ESRL/PSD and CIRES
Glenn K. Rutledge
NOMADS Team Leader
National Climatic Data Center
Asheville, NC 28801