Re: [thredds] [netcdf-java] GRIB variable name changes in 4.3

Hi Glenn-

On 2/29/12 8:40 AM, Glenn Rutledge wrote:
Touche Don. (or should I say Dave Bowman)

;-)

This is a difficult issue for me to decide upon b/c in a sense, we are
all right.  Can we achieve a parsing of the long name to the short name
for client displays, etc.?

And vice-versa (lookup by long name). However, John has already said that the long name is volatile so I'm not sure what relying on the long name gets us.

John- care to chime in?

On Wed, Feb 29, 2012 at 10:14 AM, Don Murray <don.murray@xxxxxxxx> wrote:

    Hi Glenn-

    Thanks for the response.  What I hear you saying is that the
    underlying infrastructure that John is creating (i.e. the
    GribFeatureCollection) and the fixes to what's broken in the
    identification of the data (e.g. the break out of the variables on
    different accumulation times) will help you provide consistent
    results.  I agree that these changes are necessary.

    However, I think the same thing can be achieved with the human
    readable variable names.  There is no guarantee that the VAR_* names
    won't change in the future.  As John discussed with me last week, if
    he finds a new PDS variable that he thinks is important, it could be
    added to the variable name and then we go through the pain again.
    That's no different than changing the human readable names.  The
    lookup for creating consistent human readable names is already there
    to create the long name.

    Even with the human readable names, there will be pain for tool
    developers that access the data, because some names will change.  It
    will require changes to the IDV, but at least they will be
    manageable. The permalinks in the Godiva WMS viewer that is part of
    the TDS will break because they use the variable name to get the data.

    I think the human readable names serve the end users better than the
    VAR_* names.  For example, if I go to NOMADS now and go to a GRIB2
    file and choose the OPeNDAP view, I get a list of variables that I
    can choose. Ex:

    
    http://nomads.ncdc.noaa.gov/thredds/dodsC/gfs4/201202/20120229/gfs_4_20120229_0000_180.grb2.html

    The variables that are selectable are in bold letters and easy to
    read.  I can quickly scroll through the page to find the variable
    I'm interested in. While the long_name is listed in lesser print, it
    doesn't stand out like the variable name does.  In the new scheme,
    what will stand out on the page is lots of VAR_* names which all
    look similar. You could argue that no one uses this OPeNDAP
    interface, but I know that there are some who do.

    Or, if I go to the NetcdfSubsetService for a grib file on motherlode:

    
    http://motherlode.ucar.edu/thredds/ncss/grid/fmrc/NCEP/GFS/Global_onedeg/files/GFS_Global_onedeg_20120229_0600.grib2/dataset.html

    I see human readable names. In the end, I don't see that the VAR_
    names serve the end user.

    As someone on the IDV users list said, "Hal, who do you serve:
    machines or humans?" ;-)

    Don



    On 2/29/12 7:16 AM, Glenn Rutledge wrote:

        Hi Don,
        That is a very good question and I left that out in my response.

        Long term access for users in archives means we constantly have to
        work to fully document, understand, track down any data provenance
        issues, and verify (to a lesser degree) the data.  What it says it
        is- it actually is.  It's just a form of quality assurance for
        users.  Data providers - especially 'real time' ones - don't
        necessarily concern themselves with these issues. They make a
        product- and move on.  I'll bet you are fully aware that the
        WOC/Gateway does not even provide a complete DTG in the file name
        for many NWP products!  I used to work w/ John Stackpole (great
        guy)- the original developer of Grib. He made grib as a compact
        communications protocol- not, as I'll also bet you are aware, for
        archives.

        NOMADS has about 1+ petabyte to manage for users- we serviced a
        growing 550TB last year and we need to scale.  By aggregating the
        data most used by users (common state variables, most popular, etc.)
        we can allow streaming of files/records that serves the 50K+ users
        and ~300 million downloads per year on NOMADS much better. Methods
        such as pre-staging/caching most requested data on disk from tape,
        etc. etc.

        What John is attempting to do will facilitate the access for
        multiple users, requesting multiple files using aggregations and
        other streaming caching (I don't quite understand the details
        there). Now- we can't even ascertain with any degree of confidence
        what is what- in order to even be able to aggregate- let alone feel
        comfortable about the accuracy of the data we are serving to users.

        It does not really help users find data- per se.  It will help users
        have more confidence that an aggregated monthly mean product from
        CFSR is the mean for each cycle (0, 6, 12, ..) for individual days of
        the month (the diurnals)- rather than a typical monthly mean avg'ed
        over the entire day.

        Hope that makes sense.   I'm not sure what other impacts this will
        have for us here - LAS? our TDS to ESGF capabilities?  It's kinda
        scary, but John's radical change looks to solve a major archive
        problem, I do know that.   We will run 4.2 and 4.3 in parallel for
        some time, I will tell you that.

        Best regards, Glenn

        On Tue, Feb 28, 2012 at 2:19 PM, Don Murray <don.murray@xxxxxxxx> wrote:

            Hi Glenn-


            On 2/28/12 11:43 AM, Glenn Rutledge wrote:

                John and Community-
                While I do not represent the NCDC Archive, for the NCDC NOMADS
                systems and our users, I must agree that the changes John is
                proposing will facilitate the long term use of grib data.
                While painful to (existing) client (software | decoders), the
                proposed change will allow our users (with a more scalable
                way) to better find and use our data.  I'll suggest that if
                this is adopted, NOMADS servers could provide both 4.2 and 4.3
                versions to give software developers time to adapt on the
                client side.


            Could you elaborate on how you see that the new variable names
            will allow the users to better find and use your data versus the
            human readable names?  For example, if I want to get the 500 hPa
            heights from a model in your archive, how will the new names
            facilitate that?

            Don

            --
            Don Murray
            NOAA/ESRL/PSD and CIRES
            303-497-3596
            http://www.esrl.noaa.gov/psd/people/don.murray/




        --
        Glenn K. Rutledge
        Meteorologist/Physical Scientist
        NOMADS Team Leader
        National Climatic Data Center
        Asheville, NC 28801
        (828) 271-4097
        nomads.ncdc.noaa.gov


    --
    Don Murray
    NOAA/ESRL/PSD and CIRES
    303-497-3596
    http://www.esrl.noaa.gov/psd/people/don.murray/




--
Glenn K. Rutledge
Meteorologist/Physical Scientist
NOMADS Team Leader
National Climatic Data Center
Asheville, NC 28801
(828) 271-4097
nomads.ncdc.noaa.gov


--
Don Murray
NOAA/ESRL/PSD and CIRES
303-497-3596
http://www.esrl.noaa.gov/psd/people/don.murray/


