Re: [netcdfgroup] NetCDF external links

  • To: Roy Mendelssohn - NOAA Federal <roy.mendelssohn@xxxxxxxx>
  • Subject: Re: [netcdfgroup] NetCDF external links
  • From: Dennis Heimbigner <dmh@xxxxxxxx>
  • Date: Fri, 15 Jul 2016 11:11:33 -0600
I believe they are independent HDF5 files,
but I do not believe they are independent
netcdf-4 files.
Also, people impose all sorts of conventions
on top of netcdf files, but that is different
from building it into the netcdf spec.
Having said that, I would hope people would
continue to experiment with this so we can get
a better understanding of the consequences.
=Dennis Heimbigner
 Unidata

On 7/15/2016 9:51 AM, Roy Mendelssohn - NOAA Federal wrote:
> Hi Dennis:
> 
> Maybe I misunderstood Aleksandar's email, but he seems to be implying
> they are indeed stand-alone, independent files, sort of like a TDS
> aggregation.
> 
> I also agree with what Ed said. As was the case with groups, which
> unfortunately were rejected by CF a couple of years ago, these are
> things that people are doing anyway and will continue to do (for groups,
> almost all of the satellite data we get from NASA and NOAA have them), so it
> would seem preferable to come up with a standard way of doing so.
> 
> -Roy
> 
> 
>> On Jul 15, 2016, at 8:45 AM, Dennis Heimbigner <dmh@xxxxxxxx> wrote:
>>
>> Ed- the difference, I think, is that in the situation
>> you describe, each separate file is a legitimate
>> netcdf file that can be independently read.
>> This is not the case, AFAIK, for hdf virtual files.
>> =Dennis
>>
>> On 7/15/2016 3:53 AM, Ed Hartnett wrote:
>>> Dennis,
>>>
>>> I think you have a very valid concern there. However, in practice, I think
>>> there are already a number of very important climate data sets which
>>> contain data in multiple files, including the data that describes the
>>> coordinate variables.
>>>
>>> One valid reason for this is to optimize IO performance in high performance
>>> computing applications, such as climate models. Due to the volume and
>>> complexity of some of the coordinate data, storing it in every file may
>>> have a significant storage and performance cost.
>>>
>>> So I think that the use of external netCDF variables (a.k.a. HDF5 datasets)
>>> is a worthwhile addition to the standard, as it may provide a standardized
>>> way to accomplish what is already being done according to a variety of
>>> local standards and practices.
>>>
>>> I suspect that most users will understand that storing data in multiple
>>> files carries additional risks, such as you mention. So such a capability
>>> should be used sparingly. But when it is needed, then it would be good if
>>> there were a standard way of doing it.
>>>
>>> Thanks,
>>> Ed
>>>
>>> On Fri, Jul 15, 2016 at 4:53 AM, Julian Kunkel <juliankunkel@xxxxxxxxxxxxxx>
>>> wrote:
>>>
>>>> Dear Tim,
>>>> I think an extension to HDF5 is possible that includes a URI from which
>>>> the file can be fetched automatically when it does not yet exist on the
>>>> local system.
>>>> Additionally, some information to ensure consistency (e.g. checksum)
>>>> when trying to open an external file should probably be included in
>>>> the attributes (optionally).
>>>>
>>>> I'm curious about the space (bandwidth) savings that you might get
>>>> from such a feature.
>>>> Could you quantify them (approximately)?
>>>>
>>>> To another response:
>>>> I believe the one-file semantics should go away (anyway) in the long
>>>> term, to allow querying data that is scattered across multiple files;
>>>> i.e., you open multiple datasets at once by changing the file name, and
>>>> the system then shows all the variables as if they belonged to this
>>>> virtual file.
>>>> I consider this to be an intermediate step that offers transparent
>>>> access to such a collection when it has been defined a priori.
>>>>
>>>> Thanks for the feedback & regards,
>>>> Julian
>>>>
>>>> https://www.hdfgroup.org/HDF5/Tutor/vds.html
>>>>
>>>> https://www.hdfgroup.org/HDF5/docNewFeatures/VDS/HDF5-VDS-requirements-use-cases-2014-12-10.pdf
>>>>
>>>>> On 07/14/2016 09:32 AM, Timothy Patterson wrote:
>>>>>>
>>>>>> We have a number of operational products based on fixed lat/lon grids
>>>>>> that we disseminate in near-real time.
>>>>>>
>>>>>> The ability to send the lat/lon grid once and link to it as a
>>>>>> coordinate variable from within the file would be very useful, as it
>>>>>> would save considerably on bandwidth costs while still keeping the
>>>>>> products user-friendly.
>>>>>>
>>>>>> So this would be a welcome development for our purposes.
>>>>>>
>>>>>> Tim
>>>>>>
>>>>>>
>>>>>>
>>>>>> ________________________________________________________________
>>>>>> Dr. Tim Patterson
>>>>>> Instrument Data Simulation Expert
>>>>>> Product Engineering/Test Data Coordination
>>>>>> MTG Programme
>>>>>> GEO  Division
>>>>>>
>>>>>> EUMETSAT
>>>>>> Eumetsat-Allee 1
>>>>>> 64295 Darmstadt
>>>>>> Germany
>>>>>>
>>>>>> Tel: +49 6151 807 487
>>>>>> Fax: +49 6151 807 7
>>>>>> E-mail: timothy.patterson@xxxxxxxxxxxx
>>>>>> Web: www.eumetsat.int
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: netcdfgroup-bounces@xxxxxxxxxxxxxxxx
>>>>>> [mailto:netcdfgroup-bounces@xxxxxxxxxxxxxxxx] On Behalf Of Eugen Betke
>>>>>> Sent: Wednesday, July 13, 2016 1:09 PM
>>>>>> To: netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>> Subject: [netcdfgroup] NetCDF external links
>>>>>>
>>>>>> Dear NetCDF-Group,
>>>>>>
>>>>>> we have been working on NetCDF external link functionality. This allows
>>>>>> NetCDF applications to create dimension variables whose values are
>>>>>> stored in an external file. To do so, it uses the HDF5 virtual dataset
>>>>>> (VDS) functionality. This is useful for, e.g., climate applications that
>>>>>> rely on a variable per file and timestep configuration. The idea is to
>>>>>> store the grid in a separate file and link our data to this grid. We
>>>>>> already have a first working version. You can find the patch and the
>>>>>> examples on our page:
>>>>>>
>>>>>> http://wr.informatik.uni-hamburg.de/research/projects/bullio/netcdf_external_links/start
>>>>>>
>>>>>> Under the hood it uses HDF5 virtual datasets. VDS has the advantage of
>>>>>> being compatible with the functions that are supported by ordinary
>>>>>> datasets. Therefore, files containing VDS should be supported by most
>>>>>> software.
>>>>>> There is a minor issue related to HDF5: the call to the H5F_try_close
>>>>>> function fails when ncdump tries to read data from an external
>>>>>> dimension. So far we have found a workaround, but we will fix this issue.
>>>>>>
>>>>>> It would be great if external link functionality could be supported by
>>>>>> netCDF at some point. We would like to improve our patch, and for that
>>>>>> reason we need your feedback. If you have any ideas about the issue
>>>>>> above, we would be grateful for any hints.
>>>>>>
>>>>>> Regards,
>>>>>> Eugen
>>>>>>
>>>>>> _______________________________________________
>>>>>> NOTE: All exchanges posted to Unidata maintained email lists are
>>>>>> recorded in the Unidata inquiry tracking system and made publicly
>>>>>> available through the web.  Users who post to any of the lists we
>>>>>> maintain are reminded to remove any personal information that they
>>>>>> do not want to be made public.
>>>>>>
>>>>>>
>>>>>> netcdfgroup mailing list
>>>>>> netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>> For list information or to unsubscribe,  visit:
>>>>>> http://www.unidata.ucar.edu/mailing_lists/
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Eugen Betke
>>>>> Abteilung Forschung
>>>>> Deutsches Klimarechenzentrum GmbH (DKRZ)
>>>>> Bundesstraße 45a • D-20146 Hamburg • Germany
>>>>>
>>>>> Phone:  +49 40 460094-146
>>>>> Fax: +49 40 460094-270
>>>>> E-mail: betke@xxxxxxx
>>>>> URL: http://www.dkrz.de
>>>>>
>>>>> Geschäftsführer: Prof. Dr. Thomas Ludwig
>>>>> Sitz der Gesellschaft: Hamburg
>>>>> Amtsgericht Hamburg HRB 39784
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> http://wr.informatik.uni-hamburg.de/people/julian_kunkel
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
> 
> **********************
> "The contents of this message do not reflect any position of the U.S. 
> Government or NOAA."
> **********************
> Roy Mendelssohn
> Supervisory Operations Research Analyst
> NOAA/NMFS
> Environmental Research Division
> Southwest Fisheries Science Center
> ***Note new address and phone***
> 110 Shaffer Road
> Santa Cruz, CA 95060
> Phone: (831)-420-3666
> Fax: (831) 420-3980
> e-mail: Roy.Mendelssohn@xxxxxxxx www: http://www.pfeg.noaa.gov/
> 
> "Old age and treachery will overcome youth and skill."
> "From those who have been given much, much will be expected" 
> "the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
> 
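
Julian's suggestion earlier in the thread, recording a checksum alongside the reference to an external file so that consistency can be checked when the file is opened, can be illustrated with a small sketch. Nothing here is part of netCDF or HDF5: the function names, the file name grid.nc, and the choice of SHA-256 are all assumptions made for illustration, and the recorded digest stands in for a value that would be stored in an attribute.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def external_file_is_consistent(path, expected_sha256):
    """True only if the file exists and matches the recorded checksum."""
    path = Path(path)
    return path.is_file() and sha256_of(path) == expected_sha256


def demo():
    """Check an unchanged, a modified, and a missing external file."""
    with tempfile.TemporaryDirectory() as tmp:
        grid = Path(tmp) / "grid.nc"  # hypothetical external grid file
        grid.write_bytes(b"lat/lon grid payload")
        recorded = sha256_of(grid)  # would be kept as an attribute (assumption)

        ok_unchanged = external_file_is_consistent(grid, recorded)

        grid.write_bytes(b"modified grid payload")
        ok_modified = external_file_is_consistent(grid, recorded)

        ok_missing = external_file_is_consistent(Path(tmp) / "absent.nc", recorded)
    return ok_unchanged, ok_modified, ok_missing
```

A reader that finds a mismatch or a missing file could then refuse to resolve the link, or fall back to fetching the file as Julian proposes; which behaviour is appropriate is left open here.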

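On Julian's question about quantifying the savings Tim describes, a rough upper bound follows from simple arithmetic. This sketch assumes uncompressed 2-D latitude and longitude arrays with the same shape and value size as the data variables; the function names are made up for illustration, and the numbers in the usage note are hypothetical, not figures from EUMETSAT.

```python
def coordinate_fraction(n_data_vars, n_coord_vars=2):
    """Share of each file's array payload taken up by coordinate arrays.

    Assumes every array has the same shape and value size and that
    nothing is compressed.
    """
    if n_data_vars < 0 or n_coord_vars < 0:
        raise ValueError("variable counts must be non-negative")
    total = n_data_vars + n_coord_vars
    if total == 0:
        raise ValueError("at least one variable is required")
    return n_coord_vars / total


def bytes_saved(ny, nx, n_files, bytes_per_value=4, n_coord_vars=2):
    """Bytes not transmitted when the grid is sent once instead of per file."""
    if n_files < 1:
        raise ValueError("n_files must be at least 1")
    grid_bytes = ny * nx * bytes_per_value * n_coord_vars
    # The grid still has to be sent once, so only the repeats are saved.
    return grid_bytes * (n_files - 1)
```

For example, bytes_saved(1000, 1000, 11) gives 80,000,000 bytes for a hypothetical 1000 x 1000 float32 grid disseminated with 11 files. Compression, or 1-D coordinates on a regular grid, would shrink the saving considerably.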

