
Re: Path Forward on Compression/Redundancy





On 3/13/2013 12:31 PM, Dan Kowal - NOAA Federal wrote:
Sorry to bother you again.  But our GOES-R Program POC contacted us with
this update:

Apologies for forwarding an incorrect reply to you yesterday.  The PD
PSDF libraries that generate the netCDF4 files use the C++ interface,
not Java.  Mea culpa.


Having said that, Harris had to modify the C++ netCDF4 libraries, and the
necessary modifications were fed back to Unidata.  Harris PD developers
said last week that they had been following up with Unidata on this, but
they did not think they were getting anywhere and had not followed up
for some time.


A different group within Harris is also talking to Unidata about
changes to the CF conventions to support remotely sensed data in
addition to the NWP and climate model metadata traditionally supported
by CF.


Warning: Speculation follows:

I would expect that Java HDF5 libraries could read the compressed
netCDF4 files given the relationship between the two formats.  No?

Java HDF5 (from the HDF Group) may have some trouble with netcdf-4 files because of the shared dimensions. You might want to contact them if that's important.

Java-NetCDF (from Unidata) will work fine.
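
For what it's worth, reading such a file with netCDF-Java is essentially
transparent; here is a minimal sketch (the file path and variable name
are hypothetical), with zlib inflation handled inside the library:

    import ucar.ma2.Array;
    import ucar.nc2.NetcdfFile;
    import ucar.nc2.Variable;

    public class ReadCompressed {
        public static void main(String[] args) throws Exception {
            // Open a netCDF-4 (HDF5-based) file; no special flags needed.
            NetcdfFile ncfile = NetcdfFile.open("sw_granule.nc");
            try {
                Variable v = ncfile.findVariable("irradiance");
                Array data = v.read();  // decompression happens inside the library
                System.out.println("read " + data.getSize() + " values");
            } finally {
                ncfile.close();
            }
        }
    }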



/speculation


If NCDC and NGDC need to have their inputs heard on chunking, data
deletion, compression, etc., please ensure that your comments are
forwarded to the highest levels within NOAA.  Robin Krause is looking
into a POC for this, which I will forward to you once received.  Please
do so soon, as the decision could be made quickly.


Looks as if they are going about this differently.  So if you have any
response to this, let me know.  I imagine now that you'd like to see
some "compressed" data examples they have used in their study. I can put
in a request for this.

I'd be happy to have a look at it at the appropriate time. I'm pretty confident there will be no problems.




Dan


On Wed, Mar 13, 2013 at 11:31 AM, John Caron <address@hidden> wrote:

    Hi Dan:

    The ability to write to netcdf-4 (and thus use compression) was just
    added to netcdf-java 4.3.  It would be good to send me an example
    file so I can check that everything is OK before you go into any
    kind of production.
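
    As a rough sketch (the file and variable names are made up, and
    netCDF-4 output also requires the native netCDF-C library to be
    installed, since the Java writer calls into it), writing with
    netcdf-java 4.3 looks something like this:

        import ucar.ma2.Array;
        import ucar.ma2.DataType;
        import ucar.nc2.NetcdfFileWriter;
        import ucar.nc2.Variable;

        public class WriteNetcdf4 {
            public static void main(String[] args) throws Exception {
                // Version.netcdf4 selects HDF5-based output, where deflate is available.
                NetcdfFileWriter writer = NetcdfFileWriter.createNew(
                        NetcdfFileWriter.Version.netcdf4, "sw_granule.nc");
                writer.addDimension(null, "time", 1000);
                Variable v = writer.addVariable(null, "irradiance",
                        DataType.FLOAT, "time");
                writer.create();
                float[] vals = new float[1000];  // fill with real data in practice
                writer.write(v, Array.factory(DataType.FLOAT,
                        new int[] {1000}, vals));
                writer.close();
            }
        }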

    John


    On 3/13/2013 10:48 AM, Dan Kowal - NOAA Federal wrote:

        Hi John,

        Yes, I'll pass along plans once I know them. Thanks for passing
        along
        Russ's info on chunking.  At this point, no consideration has
        been given yet to the chunk size for the space wx time series
        data.  I will ask that we be included in the discussions for
        determining chunk size, although at this point, I'm not sure
        what guidance to give as the
        granule
        sizes are small to begin with, ranging from 10s of KB to 5 MB -
        the last
        and largest granule being a solar image. In terms of how they
        are going
        about the compression, it was explained this way to me:

        2) Harris is using the Unidata netCDF4 library for Java to write
        compressed data to files.
        http://www.unidata.ucar.edu/software/netcdf-java/v4.0/javadocAll/overview-summary.html


        Is this a concern? Should I inquire about the method they are using?

        Thanks,

        Dan

        On Wednesday, March 13, 2013, John Caron wrote:

             Hi Dan:

             On 3/12/2013 5:36 PM, Dan Kowal - NOAA Federal wrote:

                 Hi John,

                 Long time.  Two questions I have for you.  One
        personal, one
                 business.  Start with the first one.  As you may know,
        Ted is
                 "retiring" from NOAA, last day is Friday, 3/29.  I'm
        going to
                  try and organize some kind of FAC after work to
        celebrate his
                 time here before he heads off to HDF land.  Would this be
                 something you'd be interested in participating in?  And are
                  there other folks at Unidata and elsewhere I should
                  invite?  I've met so many UCAR folks through the years
                  with Ted, but you're the standout in my mind besides
                  Dave Fulker, who I guess runs OPeNDAP now?  At any
                  rate, let me know.


             I've got it on my calendar, I'll try to make it, though I have a
             likely conflict. Send me details when you get them, and I
        will pass
             on to others at Unidata who may be interested.


                 Now on to business. I've been working with the GOES-R
        program on
                 archiving space weather data, and given budget
        realities these
                 days, they are always looking at ways to cut costs.
          One area
                 they've been looking into is data compression, and as
        you can
                 see below from the email banter, they are using Zlib
        compression
                 that's available with NetCDF4.  Although they state
        that the
                 contractor is using the available Java libraries to
        write the
                 compressed data, is it fair to say that there will be no
                 problems reading the compressed format with the NetCDF
        API for
                 Java?  Let me know if there's anything to consider here...


             As long as you are talking about Zlib (deflate) and not Szip
             (proprietary), then I think it's a good idea. Netcdf-Java
             handles deflate compression just fine. The main thing to
             investigate is choosing your chunking strategy, as that can
             have a strong effect on performance.

             Russ Rew has been working on this, and has blogged about it
        recently:

         http://www.unidata.ucar.edu/blogs/developer/en/category/NetCDF
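
             As a concrete sketch of wiring that in (the Nc4Chunking
             hooks shown here come from later netcdf-java releases, so
             the exact signatures may differ by version; the strategy,
             deflate level, and names are illustrative):

                 import ucar.ma2.DataType;
                 import ucar.nc2.NetcdfFileWriter;
                 import ucar.nc2.write.Nc4Chunking;
                 import ucar.nc2.write.Nc4ChunkingStrategy;

                 public class ChunkedWrite {
                     public static void main(String[] args) throws Exception {
                         // "standard" derives chunk shapes from variable shapes;
                         // deflate level 5 with the shuffle filter enabled.
                         Nc4Chunking chunker = Nc4ChunkingStrategy.factory(
                                 Nc4Chunking.Strategy.standard, 5, true);
                         NetcdfFileWriter writer = NetcdfFileWriter.createNew(
                                 NetcdfFileWriter.Version.netcdf4, "chunked.nc", chunker);
                         writer.addDimension(null, "time", 3600);
                         writer.addVariable(null, "flux", DataType.FLOAT, "time");
                         writer.create();  // chunking/deflate applied at creation
                         writer.close();
                     }
                 }

             Running "ncdump -hs" on the result prints the hidden
             _ChunkSizes, _DeflateLevel, and _Shuffle attributes, which
             is a quick way to verify what the writer actually did.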

             BTW, you say that the "contractor is using the available Java
             libraries to write the compressed data". I'm guessing that
        they are
             using the HDF java interface to the HDF5 library, not the
             Java-Netcdf library?

             John




        --

        Dan Kowal
        IT Specialist (Data Management)

        National Geophysical Data Center/NOAA
        Boulder, Colorado
        (303) 497-6118

        address@hidden





--

Dan Kowal
IT Specialist (Data Management)

National Geophysical Data Center/NOAA
Boulder, Colorado
(303) 497-6118

address@hidden