Re: [netcdf-java] [CF-metadata] [netcdfgroup] [Hdf-forum] Detecting netCDF versus HDF5 -- PROPOSED SOLUTIONS --REQUEST FOR COMMENTS

  • To: Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
  • Subject: Re: [netcdf-java] [CF-metadata] [netcdfgroup] [Hdf-forum] Detecting netCDF versus HDF5 -- PROPOSED SOLUTIONS --REQUEST FOR COMMENTS
  • From: John Caron <jcaron1129@xxxxxxxxx>
  • Date: Fri, 22 Apr 2016 21:57:51 -0600
Here are the blogs:

http://www.unidata.ucar.edu/blogs/developer/en/entry/dimensions_scales
http://www.unidata.ucar.edu/blogs/developer/en/entry/dimension_scale2
http://www.unidata.ucar.edu/blogs/developer/en/entry/dimension_scales_part_3
http://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf4_shared_dimensions
http://www.unidata.ucar.edu/blogs/developer/en/entry/netcdf4_use_of_dimension_scales
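
As a rough illustration of the kind of signature those posts describe, here is a minimal Python sketch. It is not the library's own detection code. The attribute names below are recalled from the netCDF-C sources and should be treated as assumptions to verify there. `looks_like_netcdf4` is a hypothetical helper: it takes attributes that have already been read from the file (with whatever HDF5 binding you use) and does not open the file itself.

```python
# Attribute names netCDF-4 is believed to write into its HDF5 files.
# ASSUMPTION: recalled from the netCDF-C sources, not taken from a spec.
NETCDF4_ROOT_MARKERS = {"_NCProperties", "_nc3_strict"}
NETCDF4_DATASET_MARKERS = {"_Netcdf4Dimid", "_Netcdf4Coordinates"}
# ASSUMPTION: prefix of the NAME attribute netCDF-4 puts on a dimension
# that has no coordinate variable of its own.
DIM_WITHOUT_VARIABLE = "This is a netCDF dimension but not a netCDF variable."


def looks_like_netcdf4(root_attrs, dataset_attrs):
    """Guess whether an HDF5 file was written through the netCDF-4 library.

    root_attrs:    dict of attribute name -> value for the root group.
    dataset_attrs: dict of dataset path -> dict of attribute name -> value.
    Returns True if any netCDF-4 specific marker is found.
    """
    if NETCDF4_ROOT_MARKERS & set(root_attrs):
        return True
    for attrs in dataset_attrs.values():
        if NETCDF4_DATASET_MARKERS & set(attrs):
            return True
        name = attrs.get("NAME")
        if isinstance(name, bytes):
            name = name.decode("ascii", errors="replace")
        if isinstance(name, str) and name.startswith(DIM_WITHOUT_VARIABLE):
            return True
    return False


# A dimension scale carrying a netCDF-4 specific attribute is flagged.
with_marker = {"/time": {"CLASS": "DIMENSION_SCALE", "NAME": "time",
                         "_Netcdf4Dimid": 0}}
# A plain HDF5 dimension scale (CLASS and NAME only) is not.
plain_scale = {"/x": {"CLASS": "DIMENSION_SCALE", "NAME": "x"}}
```

A plain HDF5 file that merely uses Dimension Scales is not flagged by this check, which is the case raised further down in the quoted thread. A netCDF-4 file written without any shared dimensions would not be flagged either, so this remains a heuristic.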

On Fri, Apr 22, 2016 at 7:57 AM, Pedro Vicente <
pedro.vicente@xxxxxxxxxxxxxxxxxx> wrote:

> John
>
> > i have written various blogs on the unidata site about why netcdf4 !=
> > hdf5, and what the unique signature for shared dimensions looks like, in
> > case you want details.
>
> yes, I am interested, I had the impression by looking at the code some
> years ago that netCDF writes some unique name attributes somewhere
>
> ----------------------
> Pedro Vicente
> pedro.vicente@xxxxxxxxxxxxxxxxxx
> https://twitter.com/_pedro__vicente
> http://www.space-research.org/
>
>
>
>
> ----- Original Message -----
> *From:* John Caron <jcaron1129@xxxxxxxxx>
> *To:* Pedro Vicente <pedro.vicente@xxxxxxxxxxxxxxxxxx>
> *Cc:* cf-metadata@xxxxxxxxxxxx ; Discussion forum for the NeXus data
> format <nexus@xxxxxxxxxxxxxxx> ; netcdfgroup@xxxxxxxxxxxxxxxx ; Dennis
> Heimbigner <dmh@xxxxxxxx> ; NetCDF-Java community
> <netcdf-java@xxxxxxxxxxxxxxxx>
> *Sent:* Thursday, April 21, 2016 11:11 PM
> *Subject:* Re: [CF-metadata] [netcdfgroup] [Hdf-forum] Detecting netCDF
> versus HDF5 -- PROPOSED SOLUTIONS --REQUEST FOR COMMENTS
>
> 1) I completely agree with the idea of adding system metadata that
> indicates the library version(s) that wrote the file.
>
> 2) the way shared dimensions are implemented by netcdf4 is a unique
> signature that would likely identify (100 - epsilon) % of real data files
> in the wild. One could add such detection to the netcdf4 and/or hdf5
> libraries, and/or write a utility program to detect.
>
> there are 2 variants:
>
> 2.1) one could write a netcdf4 file without shared dimensions, though I'm
> pretty sure no one does. but you could argue then that it's fine to just
> treat it as an hdf5 file and read through hdf5 library
>
> 2.2) one could write a netcdf4 file with hdf5 library, if you know what
> you are doing. i have heard of this happening. but then you could argue
> that it's really a netcdf4 file and you should use netcdf library to read.
>
> i have written various blogs on the unidata site about why netcdf4 !=
> hdf5, and what the unique signature for shared dimensions looks like, in
> case you want details.
>
> On Thu, Apr 21, 2016 at 4:18 PM, Pedro Vicente <
> pedro.vicente@xxxxxxxxxxxxxxxxxx> wrote:
>
>>> If you have hdf5 files that should be readable, then I will undertake to
>>> look at them and see what the problem is.
>>>
>>
>>
>> ok, thank you
>>
>>> WRT to old files:  We could produce a utility that would redef the file
>>> and insert the _NCProperties attribute. This would allow someone to
>>> wholesale mark old files.
>>>
>>
>>
>> Excellent idea , Dennis
>>
>> ----------------------
>> Pedro Vicente
>> pedro.vicente@xxxxxxxxxxxxxxxxxx
>> https://twitter.com/_pedro__vicente
>> http://www.space-research.org/
>>
>>
>> ----- Original Message ----- From: <dmh@xxxxxxxx>
>> To: "Pedro Vicente" <pedro.vicente@xxxxxxxxxxxxxxxxxx>; <
>> cf-metadata@xxxxxxxxxxxx>; "Discussion forum for the NeXus data format" <
>> nexus@xxxxxxxxxxxxxxx>; <netcdfgroup@xxxxxxxxxxxxxxxx>
>> Sent: Thursday, April 21, 2016 5:02 PM
>> Subject: Re: [netcdfgroup] [Hdf-forum] Detecting netCDF versus HDF5 --
>> PROPOSED SOLUTIONS --REQUEST FOR COMMENTS
>>
>>
>>> If you have hdf5 files that should be readable, then I will undertake to
>>> look at them and see what the problem is.
>>> WRT to old files:  We could produce a utility that would redef the file
>>> and insert the _NCProperties attribute. This would allow someone to
>>> wholesale mark old files.
>>> =Dennis Heimbigner
>>>   Unidata
>>>
>>>
>>> On 4/21/2016 2:17 PM, Pedro Vicente wrote:
>>>
>>>> Dennis
>>>>
>>>>> I am in the process of adding a global attribute in the root group
>>>>> that captures both the netcdf library version and the hdf5 library
>>>>> version whenever a netcdf file is created. The current form is
>>>>> _NCProperties="version=...|netcdflibversion=...|hdflibversion=..."
>>>>>
>>>>
>>>>
>>>> ok, good to know, thank you
>>>>
>>>>
>>>>> 1. I am open to suggestions about changing the format or adding
>>>>> info to it.
>>>>>
>>>>
>>>> I personally don't care, anything that uniquely identifies a netCDF
>>>> file (HDF5 based) as such will work
>>>>
>>>>
>>>>> 2. Of course this attribute will not exist in files written using older
>>>>> versions of the netcdf library, but at least the process will have
>>>>> begun.
>>>>>
>>>>
>>>> yes
>>>>
>>>>
>>>>> 3. This technically does not address the original issue because there
>>>>> exist hdf5 files not written by netcdf that are still compatible with
>>>>> and can be read by netcdf. Not sure this case is important or not.
>>>>>
>>>>
>>>> there will always be HDF5 files  not written by netcdf that netCDF will
>>>> read as we are now.
>>>>
>>>> this is not really the issue, but you just made a further issue :-)
>>>>
>>>> the issue is that I would like an application that reads a netCDF (HDF5
>>>> based) file to decide to use the netCDF or HDF5 API.
>>>> your attribute writing will do , for future files.
>>>> for older netCDF files there may be a way to detect the current
>>>> attributes and data structures to see if we can make it "identify itself"
>>>> as netCDF. A bit of debugging will confirm that, since Dimension Scales
>>>> are used, that would be an (imperfect maybe) way to do it
>>>>
>>>> regarding the "further issue " above
>>>>
>>>> you could go one step further and for any HDF5 files  not written by
>>>> netcdf , you could make netCDF reject the file reading,
>>>> because it's not "netCDF compliant".
>>>> Since having netCDF read pure HDF5 files is not a problem (at least for
>>>> me), I don't know if you would want to do this, just an idea.
>>>> In my mind taking complexity and ambiguities out of problems is always a
>>>> good thing
>>>>
>>>>
>>>> ah, I forgot one thing, related to this
>>>>
>>>>
>>>> In the past I have found several pure HDF5 files that netCDF failed in
>>>> reading.
>>>> Since netCDF is HDF5 binary compatible, one would expect that all HDF5
>>>> files will be read by netCDF.
>>>> Except if you specifically wrote something in the code that makes it
>>>> fail if some condition is not met.
>>>> This was a while ago, I'll try to find those cases and I'll send a bug
>>>> report to the bug report email
>>>>
>>>> ----------------------
>>>> Pedro Vicente
>>>> pedro.vicente@xxxxxxxxxxxxxxxxxx
>>>> https://twitter.com/_pedro__vicente
>>>> http://www.space-research.org/
>>>>
>>>> ----- Original Message ----- From: <dmh@xxxxxxxx>
>>>> To: "Pedro Vicente" <pedro.vicente@xxxxxxxxxxxxxxxxxx>; "HDF Users
>>>> Discussion List" <hdf-forum@xxxxxxxxxxxxxxxxxx>; <
>>>> cf-metadata@xxxxxxxxxxxx>; "Discussion forum for the NeXus data
>>>> format" <nexus@xxxxxxxxxxxxxxx>; <netcdfgroup@xxxxxxxxxxxxxxxx>
>>>> Cc: "John Shalf" <jshalf@xxxxxxx>; <Richard.E.Ullman@xxxxxxxx>;
>>>> "Marinelli, Daniel J. (GSFC-5810)" <daniel.j.marinelli@xxxxxxxx>;
>>>> "Miller, Mark C." <miller86@xxxxxxxx>
>>>> Sent: Thursday, April 21, 2016 2:30 PM
>>>> Subject: Re: [netcdfgroup] [Hdf-forum] Detecting netCDF versus HDF5 --
>>>> PROPOSED SOLUTIONS --REQUEST FOR COMMENTS
>>>>
>>>>
>>>>> I am in the process of adding a global attribute in the root group
>>>>> that captures both the netcdf library version and the hdf5 library
>>>>> version whenever a netcdf file is created. The current form is
>>>>> _NCProperties="version=...|netcdflibversion=...|hdflibversion=..."
>>>>> Where version is the version of the _NCProperties attribute and the
>>>>> others are e.g. 1.8.18 or 4.4.1-rc1.
>>>>> Issues:
>>>>> 1. I am open to suggestions about changing the format or adding info
>>>>>    to it.
>>>>> 2. Of course this attribute will not exist in files written using
>>>>>    older versions of the netcdf library, but at least the process
>>>>>    will have begun.
>>>>> 3. This technically does not address the original issue because there
>>>>>    exist hdf5 files not written by netcdf that are still compatible
>>>>>    with and can be read by netcdf. Not sure this case is important
>>>>>    or not.
>>>>> =Dennis Heimbigner
>>>>>    Unidata
>>>>>
>>>>>
>>>>> On 4/21/2016 9:33 AM, Pedro Vicente wrote:
>>>>>
>>>>>> DETECTING HDF5 VERSUS NETCDF GENERATED FILES
>>>>>> REQUEST FOR COMMENTS
>>>>>> AUTHOR: Pedro Vicente
>>>>>>
>>>>>> AUDIENCE:
>>>>>> 1) HDF, netcdf developers,
>>>>>> Ed Hartnett
>>>>>> Kent Yang
>>>>>> 2) HDF, netcdf users, that replied to this thread
>>>>>> Miller, Mark C.
>>>>>> John Shalf
>>>>>> 3 ) netcdf tools developers
>>>>>> Mary Haley  , NCL
>>>>>> 4) HDF, netcdf managers and sponsors
>>>>>> David Pearah  , CEO HDF Group
>>>>>> Ward Fisher, UCAR
>>>>>> Marinelli, Daniel J. , Richard Ullman, Christopher Lynnes, NASA
>>>>>> 5)
>>>>>> [CF-metadata] list
>>>>>> After this thread started 2 months ago, there was an announcement on
>>>>>> the [CF-metadata] mail list
>>>>>> about
>>>>>> "a meeting to discuss current and future netCDF-CF efforts and
>>>>>> directions.
>>>>>> The meeting will be held on 24-26 May 2016 in Boulder, CO, USA at the
>>>>>> UCAR Center Green facility."
>>>>>> This would be a good topic to put on the agenda, maybe?
>>>>>> THE PROBLEM:
>>>>>> Currently it is impossible to detect if an HDF5 file was generated by
>>>>>> the HDF5 API or by the netCDF API.
>>>>>> See previous email about the reasons why.
>>>>>> WHY THIS MATTERS:
>>>>>> Software applications that need to handle both netCDF and HDF5 files
>>>>>> cannot decide which API to use.
>>>>>> This includes popular visualization tools like IDL, Matlab, NCL, HDF
>>>>>> Explorer.
>>>>>> SOLUTIONS PROPOSED: 2
>>>>>> SOLUTION 1: Add a flag to HDF5 source
>>>>>> The hdf5 format specification, listed here
>>>>>> https://www.hdfgroup.org/HDF5/doc/H5.format.html
>>>>>> describes a sequence of bytes in the file layout that have special
>>>>>> meaning for the HDF5 API. It is common practice, when designing a data
>>>>>> format,
>>>>>> to leave some fields "reserved for future use".
>>>>>> This solution makes use of one of these empty  "reserved for future
>>>>>> use" spaces to save a byte (for example) that describes an enumerator
>>>>>> of "HDF5 compatible formats".
>>>>>> An "HDF5 compatible format" is a data format that uses the HDF5 API
>>>>>> at a lower level (usually hidden from the user of the upper API),
>>>>>> and providing its own API.
>>>>>> This category can still be divided into 2 formats:
>>>>>> 1) A "pure HDF5 compatible format". Example, NeXus
>>>>>> http://www.nexusformat.org/
>>>>>> NeXus just writes some metadata (attributes) on top of the HDF5 API,
>>>>>> that has some special meaning for the NeXus community
>>>>>> 2) A "non pure HDF5 compatible format". Example, netCDF
>>>>>> Here, the format adds some extra feature besides HDF5. In the case of
>>>>>> netCDF, these are shared dimensions between variables.
>>>>>> This sub-division between 1) and 2) is irrelevant for the problem and
>>>>>> solution in question
>>>>>> The solution consists of writing a different enumerator value on the
>>>>>> "reserved for future use" space. For example
>>>>>> Value decimal 0 (current value): This file was generated by the HDF5
>>>>>> API (meaning the HDF5 only API)
>>>>>> Value decimal 1: This file was generated by the netCDF API (using
>>>>>> HDF5)
>>>>>> Value decimal 2: This file was generated by <put here another HDF5
>>>>>> based format>
>>>>>> and so on
>>>>>> The advantage of this solution is that this process involves 2
>>>>>> parties: the HDF Group and the other format's organization.
>>>>>> This allows the HDF Group to "keep track" of new HDF5 based formats.
>>>>>> It also allows the other format to be made "HDF5 certified".
>>>>>> SOLUTION 2: Add some metadata to the other API on top of HDF5
>>>>>> This is what Nexus uses.
>>>>>> A Nexus file on creation writes several attributes on the root group,
>>>>>> like "NeXus_version" and other numeric data.
>>>>>> This is done using the public HDF5 API calls.
>>>>>> The solution for netCDF consists of the same approach, just write
>>>>>> some specific attributes, and a special netCDF API to write/read them.
>>>>>> This solution just requires the work of one party (the netCDF group)
>>>>>> END OF RFC
>>>>>> In reply to people that commented in the thread
>>>>>> @John Shalf
>>>>>> >>Perhaps NetCDF (and other higher-level APIs that are built on top of
>>>>>> HDF5) should include an attribute attached
>>>>>> >>to the root group that identifies the name and version of the API
>>>>>> that created the file?  (adopt this as a convention)
>>>>>> yes, that's one way to do it, Solution 2 above
>>>>>> @Mark Miller
>>>>>> >>>Hmmm. Is there any big reason NOT to try to read a netCDF produced
>>>>>> HDF5 file with the native HDF5 library if someone so chooses?
>>>>>> It's possible to read a netCDF file using HDF5, yes.
>>>>>> There are 2 things that you will miss doing this:
>>>>>> 1) the ability to inquire about shared netCDF dimensions.
>>>>>> 2) the ability to read remotely with openDAP.
>>>>>> Reading with HDF5 also exposes metadata that is supposed to be
>>>>>> private to netCDF. See below
>>>>>> >>>> And, attempting  to read an HDF5 file produced by Silo using just
>>>>>> the HDF5 library (e.g. w/o Silo) is a major pain.
>>>>>> This I don't understand. Why not read the Silo file with the Silo API?
>>>>>> That's the whole purpose of this issue, each higher level API on top of
>>>>>> HDF5 should be able to detect "itself".
>>>>>> I am not familiar with Silo, but if Silo cannot do this, then you
>>>>>> have the same design flaw that netCDF has.
>>>>>>
>>>>>> >>> In a cursory look over the libsrc4 sources in netCDF distro, I see
>>>>>> a few things that might give a hint a file was created with netCDF. . .
>>>>>> >>>> First, in NC_CLASSIC_MODEL, an attribute gets attached to the
>>>>>> root group named "_nc3_strict". So, the existence of an attribute on
>>>>>> the root group by that name would suggest the HDF5 file was generated by
>>>>>> netCDF.
>>>>>> I think this is done only by the "old" netCDF3 format.
>>>>>> >>>>> Also, I tested a simple case of nc_open, nc_def_dim, etc.
>>>>>> nc_close to see what it produced.
>>>>>> >>>> It appears to produce datasets for each 'dimension' defined with
>>>>>> two attributes named "CLASS" and "NAME".
>>>>>> This is because netCDF uses the HDF5 Dimension Scales API internally
>>>>>> to keep track of shared dimensions. These are internal attributes
>>>>>> of Dimension Scales. This approach would not work because an HDF5
>>>>>> only file with Dimension Scales would have the same attributes.
>>>>>>
>>>>>> >>>> I like John's suggestion here.
>>>>>> >>>>>But, any code you add to any applications now will work *only*
>>>>>> for files that were produced post-adoption of this convention.
>>>>>> yes. there are 2 actions to take here.
>>>>>> 1) fix the issue for the future
>>>>>> 2) try to retroactively have some workaround that makes it possible now
>>>>>> to differentiate HDF5/netCDF files made before the adopted convention
>>>>>> see below
>>>>>>
>>>>>> >>>> In VisIt, we support >140 format readers. Over 20 of those are
>>>>>> different variants of HDF5 files (H5part, Xdmf, Pixie, Silo, Samrai,
>>>>>> netCDF, Flash, Enzo, Chombo, etc., etc.)
>>>>>> >>>>When opening a file, how does VisIt figure out which plugin to
>>>>>> use? In particular, how do we avoid one poorly written reader plugin
>>>>>> (which may be the wrong one for a given file) from preventing the correct
>>>>>> one from being found. Its kinda a hard problem.
>>>>>>
>>>>>> Yes, that's the problem we are trying to solve. I have to say, that
>>>>>> is quite a list of HDF5 based formats there.
>>>>>> >>>> Some of our discussion is captured here. . .
>>>>>> http://www.visitusers.org/index.php?title=Database_Format_Detection
>>>>>> I'll check it out, thank you for the suggestions
>>>>>> @Ed Hartnett
>>>>>> >>>I must admit that when putting netCDF-4 together I never considered
>>>>>> that someone might want to tell the difference between a "native"
>>>>>> HDF5 file and a netCDF-4/HDF5 file.
>>>>>> >>>>>Well, you can't think of everything.
>>>>>> This is a major design flaw.
>>>>>> If you are in the business of designing data file formats, one of the
>>>>>> things you have to do is make it possible to distinguish it from
>>>>>> other formats.
>>>>>>
>>>>>> >>> I agree that it is not possible to canonically tell the
>>>>>> difference. The netCDF-4 API does use some special attributes to
>>>>>> track named dimensions,
>>>>>> >>>>and to tell whether classic mode should be enforced. But it can
>>>>>> easily produce files without any named dimensions, etc.
>>>>>> >>>So I don't think there is any easy way to tell.
>>>>>> I remember you wrote that code together with Kent Yang from the HDF
>>>>>> Group.
>>>>>> At the time I was with the HDF Group but unfortunately I did not follow
>>>>>> closely what you were doing.
>>>>>> I don't remember any design document being circulated that explains
>>>>>> the internals of the "how to" make the netCDF (classic) model of shared
>>>>>> dimensions
>>>>>> use the hierarchical group model of HDF5.
>>>>>> I know this was done using the HDF5 Dimension Scales (that I wrote),
>>>>>> but is there any design document that explains it?
>>>>>> Maybe just some internal email exchange between you and Kent Yang?
>>>>>> Kent, how are you?
>>>>>> Do you remember having any design document that explains this?
>>>>>> Maybe something like a unique private attribute that is written
>>>>>> somewhere in the netCDF file?
>>>>>>
>>>>>> @Mary Haley, NCL
>>>>>> NCL is a widely used tool that handles both netCDF and HDF5
>>>>>> Mary, how are you?
>>>>>> How does NCL deal with the case of reading both pure HDF5 files and
>>>>>> netCDF files that use HDF5?
>>>>>> Would you be interested in joining a community based effort to deal
>>>>>> with this, in case this is an issue for you?
>>>>>>
>>>>>> @David Pearah  , CEO HDF Group
>>>>>> I volunteer to participate in the effort of this RFC together with
>>>>>> the HDF Group (and netCDF Group).
>>>>>> Maybe we could make a "task force" between HDF Group, netCDF Group
>>>>>> and any volunteer (such as tools developers that happen to be in these
>>>>>> mail lists)?
>>>>>> The "task force" would have 2 tasks:
>>>>>> 1) make a HDF5 based convention for the future and
>>>>>> 2) try to retroactively salvage the current design issue of netCDF
>>>>>> My phone is 217-898-9356, you are welcome to call in anytime.
>>>>>> ----------------------
>>>>>> Pedro Vicente
>>>>>> pedro.vicente@xxxxxxxxxxxxxxxxxx <mailto:
>>>>>> pedro.vicente@xxxxxxxxxxxxxxxxxx>
>>>>>> https://twitter.com/_pedro__vicente
>>>>>> http://www.space-research.org/
>>>>>>
>>>>>>     ----- Original Message -----
>>>>>>     *From:* Miller, Mark C. <mailto:miller86@xxxxxxxx>
>>>>>>     *To:* HDF Users Discussion List <mailto:
>>>>>> hdf-forum@xxxxxxxxxxxxxxxxxx>
>>>>>>     *Cc:* netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>>     <mailto:netcdfgroup@xxxxxxxxxxxxxxxx> ; Ward Fisher
>>>>>>     <mailto:wfisher@xxxxxxxx>
>>>>>>     *Sent:* Wednesday, March 02, 2016 7:07 PM
>>>>>>     *Subject:* Re: [Hdf-forum] Detecting netCDF versus HDF5
>>>>>>
>>>>>>     I like John's suggestion here.
>>>>>>
>>>>>>     But, any code you add to any applications now will work *only* for
>>>>>>     files that were produced post-adoption of this convention.
>>>>>>
>>>>>>     There are probably a bazillion files out there at this point that
>>>>>>     don't follow that convention and you probably still want your
>>>>>>     applications to be able to read them.
>>>>>>
>>>>>>     In VisIt, we support >140 format readers. Over 20 of those are
>>>>>>     different variants of HDF5 files (H5part, Xdmf, Pixie, Silo,
>>>>>>     Samrai, netCDF, Flash, Enzo, Chombo, etc., etc.) When opening a
>>>>>>     file, how does VisIt figure out which plugin to use? In
>>>>>>     particular, how do we avoid one poorly written reader plugin
>>>>>>     (which may be the wrong one for a given file) from preventing the
>>>>>>     correct one from being found. Its kinda a hard problem.
>>>>>>
>>>>>>     Some of our discussion is captured here. . .
>>>>>>
>>>>>> http://www.visitusers.org/index.php?title=Database_Format_Detection
>>>>>>
>>>>>>     Mark
>>>>>>
>>>>>>
>>>>>>     From: Hdf-forum <hdf-forum-bounces@xxxxxxxxxxxxxxxxxx
>>>>>>     <mailto:hdf-forum-bounces@xxxxxxxxxxxxxxxxxx>> on behalf of John
>>>>>>     Shalf <jshalf@xxxxxxx <mailto:jshalf@xxxxxxx>>
>>>>>>     Reply-To: HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx
>>>>>>     <mailto:hdf-forum@xxxxxxxxxxxxxxxxxx>>
>>>>>>     Date: Wednesday, March 2, 2016 1:02 PM
>>>>>>     To: HDF Users Discussion List <hdf-forum@xxxxxxxxxxxxxxxxxx
>>>>>>     <mailto:hdf-forum@xxxxxxxxxxxxxxxxxx>>
>>>>>>     Cc: "netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>>     <mailto:netcdfgroup@xxxxxxxxxxxxxxxx>"
>>>>>>     <netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>>     <mailto:netcdfgroup@xxxxxxxxxxxxxxxx>>, Ward Fisher
>>>>>>     <wfisher@xxxxxxxx <mailto:wfisher@xxxxxxxx>>
>>>>>>     Subject: Re: [Hdf-forum] Detecting netCDF versus HDF5
>>>>>>
>>>>>>         Perhaps NetCDF (and other higher-level APIs that are built on
>>>>>>         top of HDF5) should include an attribute attached to the root
>>>>>>         group that identifies the name and version of the API that
>>>>>>         created the file?  (adopt this as a convention)
>>>>>>
>>>>>>         -john
>>>>>>
>>>>>>             On Mar 2, 2016, at 12:55 PM, Pedro Vicente
>>>>>>             <pedro.vicente@xxxxxxxxxxxxxxxxxx
>>>>>> <mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx>> wrote:
>>>>>>             Hi Ward
>>>>>>             As you know, Data Explorer is going to be a general
>>>>>>             purpose data reader for many formats, including HDF5 and
>>>>>>             netCDF.
>>>>>>             Here
>>>>>>             http://www.space-research.org/
>>>>>>             Regarding the handling of both HDF5 and netCDF, it seems
>>>>>>             there is a potential issue, which is, how to tell if any
>>>>>>             HDF5 file was saved by the HDF5 API or by the netCDF API?
>>>>>>             It seems to me that this is not possible. Is this correct?
>>>>>>             netCDF uses an internal function NC_check_file_type to
>>>>>>             examine the first few bytes of a file, and for example for
>>>>>>             any HDF5 file the test is
>>>>>>             /* Look at the magic number */
>>>>>>                /* Ignore the first byte for HDF */
>>>>>>                if(magic[1] == 'H' && magic[2] == 'D' && magic[3] == 'F') {
>>>>>>                  *filetype = FT_HDF;
>>>>>>                  *version = 5;
>>>>>>             The problem is that this test works for any HDF5 file and
>>>>>>             for any netCDF file, which makes it impossible to tell
>>>>>>             which is which.
>>>>>>             Which makes it impossible for any general purpose data
>>>>>>             reader to decide to use the netCDF API or the HDF5 API.
>>>>>>             I have a possible solution for this , but before going any
>>>>>>             further, I would just like to confirm that
>>>>>>             1)      Is indeed not possible
>>>>>>             2)      See if you have a solid workaround for this,
>>>>>>             excluding the dumb ones, for example deciding on a
>>>>>>             extension .nc or .h5, or traversing the HDF5 file to see
>>>>>>             if it's non netCDF conforming one. Yes, to further
>>>>>>             complicate things, it is possible that the above test says
>>>>>>             OK for a HDF5 file, but then the read by the netCDF API
>>>>>>             fails because the file is a HDF5 non netCDF conformant
>>>>>>             Thanks
>>>>>>             ----------------------
>>>>>>             Pedro Vicente
>>>>>>             pedro.vicente@xxxxxxxxxxxxxxxxxx
>>>>>>             <mailto:pedro.vicente@xxxxxxxxxxxxxxxxxx>
>>>>>>             http://www.space-research.org/
>>>>>>             _______________________________________________
>>>>>>             Hdf-forum is for HDF software users discussion.
>>>>>>             Hdf-forum@xxxxxxxxxxxxxxxxxx
>>>>>>             <mailto:Hdf-forum@xxxxxxxxxxxxxxxxxx>
>>>>>>
>>>>>>
>>>>>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>>>>>>             Twitter: https://twitter.com/hdf5
>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> netcdfgroup mailing list
>>>>>> netcdfgroup@xxxxxxxxxxxxxxxx
>>>>>> For list information or to unsubscribe,  visit:
>>>>>> http://www.unidata.ucar.edu/mailing_lists/
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>> _______________________________________________
>> CF-metadata mailing list
>> CF-metadata@xxxxxxxxxxxx
>> http://mailman.cgd.ucar.edu/mailman/listinfo/cf-metadata
>>
>
>
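
To tie the quoted discussion together, here is a minimal Python sketch of how a general-purpose reader might choose between the netCDF and HDF5 APIs once the `_NCProperties` attribute exists. Everything named here is illustrative: `parse_ncproperties` and `choose_api` are hypothetical helpers, the root-group attributes are assumed to have been read already by some HDF5 binding, and the `version=1` in the example value is an assumption (the thread only gives `1.8.18` and `4.4.1-rc1` as sample library versions). The signature check looks at offset 0 only, although the HDF5 format also permits the signature at later offsets when a user block is present.

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"                   # 8-byte HDF5 signature
CLASSIC_MAGICS = (b"CDF\x01", b"CDF\x02", b"CDF\x05")   # netCDF classic variants


def parse_ncproperties(value):
    """Split 'key=value|key=value|...' into a dict of strings."""
    props = {}
    for field in value.split("|"):
        if not field:
            continue
        key, sep, val = field.partition("=")
        if not sep:
            raise ValueError(f"malformed field: {field!r}")
        props[key.strip()] = val.strip()
    return props


def choose_api(first_bytes, root_attrs):
    """Return 'netcdf', 'hdf5' or 'unknown' for a file.

    first_bytes: the first 8 (or more) bytes of the file.
    root_attrs:  dict of root-group attribute name -> value
                 (empty for non-HDF5 files).
    """
    if first_bytes.startswith(CLASSIC_MAGICS):
        return "netcdf"
    if not first_bytes.startswith(HDF5_SIGNATURE):
        return "unknown"
    raw = root_attrs.get("_NCProperties")
    if raw is None:
        # No marker: plain HDF5, or a netCDF-4 file written before the
        # attribute was introduced. This sketch cannot tell them apart.
        return "hdf5"
    if isinstance(raw, bytes):
        raw = raw.decode("ascii", errors="replace")
    props = parse_ncproperties(raw)
    return "netcdf" if "netcdflibversion" in props else "hdf5"


# Illustrative attribute value only.
example = "version=1|netcdflibversion=4.4.1-rc1|hdflibversion=1.8.18"
result = choose_api(HDF5_SIGNATURE, {"_NCProperties": example})
```

Files written by older netCDF-4 versions fall through to the "hdf5" branch, which is exactly the retroactive-detection gap the thread discusses.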