
Re: THREDDS repo and HDF5 datasets with huge attributes



OK, maybe send Dana an email and see if we can get some clarity; they can also clarify their docs if needed.


On Thu, Apr 30, 2015 at 1:41 PM, Schmunk, Robert B. (GISS-611.0)[TRINNOVIM, LLC] <address@hidden> wrote:

John,

One SMAP dataset I have been using that has a huge attribute of length > 200 kB is at
https://www.dropbox.com/s/4q0iwbw8iv3puie/SMAP_L4_SM_aup_20140115T030000_V05007_001.h5?dl=0
The file is about 90 MB.

One thing about my fix: You'll see that I have written the calculation as

      offset = h5.makeIntFromBytes(heapId, 1, (heapId.length-1));

but if I read Dana's e-mail correctly, it should say

      offset = h5.makeIntFromBytes(heapId, 1, h5.sizeLengths);

With the sample dataset above, heapId.length-1 = 7 while h5.sizeLengths = 8. My experience is that with the latter, an array-out-of-bounds exception will get thrown, while my way works. So maybe I'm not interpreting Dana's e-mail right. :)
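
To illustrate the bounds problem described above, here is a minimal stand-alone sketch. It is not the netCDF-Java source: the class name, the readLittleEndian helper, and the byte values in the heap ID are all made up for illustration, and the helper only assumes that makeIntFromBytes combines n bytes starting at a given index in little-endian order. With an 8-byte heap ID, reading 7 bytes from index 1 stays inside the array, while reading 8 bytes from index 1 reaches index 8 and fails.

```java
// Hypothetical stand-alone sketch; not the netCDF-Java implementation.
class HeapIdOffsetSketch {

  // Assumed behavior: combine n bytes starting at index start, little-endian.
  static long readLittleEndian(byte[] bytes, int start, int n) {
    long result = 0;
    // Walk from the most significant byte (highest index) down to start.
    for (int i = start + n - 1; i >= start; i--) {
      result = (result << 8) | (bytes[i] & 0xFF);
    }
    return result;
  }

  public static void main(String[] args) {
    // Made-up 8-byte heap ID: one leading byte followed by 7 payload bytes.
    byte[] heapId = {0x10, 0x2A, 0x01, 0, 0, 0, 0, 0};
    int sizeLengths = 8; // value reported for the sample dataset

    // Reading heapId.length - 1 = 7 bytes from index 1 stays in bounds.
    long offset = readLittleEndian(heapId, 1, heapId.length - 1);
    System.out.println("offset = " + offset);

    // Reading sizeLengths = 8 bytes from index 1 would touch index 8.
    try {
      readLittleEndian(heapId, 1, sizeLengths);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("out of bounds: " + e.getMessage());
    }
  }
}
```

The sketch only shows why the two byte counts behave differently on an 8-byte ID; it does not settle which count the file format specification intends.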

BTW: I'll be filing another pull request related to the codesigning issue. The toolsUI-4.6.1-SNAPSHOT.jar is having the same problem that I reported about the netCDFAll-4.6.1-SNAPSHOT.jar a week or two back.

rbs





On Apr 30, 2015, at 15:27, John Caron <address@hidden> wrote:

> thanks a million. i think i have a sample file for unit tests, but if you have multiple variants, can you send them?
>
> On Thu, Apr 30, 2015 at 12:45 PM, Schmunk, Robert B. (GISS-611.0)[TRINNOVIM, LLC] <address@hidden> wrote:
>
> John,
>
> See pull request #131.
>
> Basically, the code for getting the Fractal Heap ID for a huge object was bad.
>
> Note that the code was also only trying to figure out the heap ID for a huge object of subtype 1, but my reading of the file format specification (https://www.hdfgroup.org/HDF5/doc/H5.format.html) indicates that the same method should be applied to subtype 2. The sample SMAP files I was working with used subtype 1.
>
> rbs
>
>
>
> On Apr 30, 2015, at 11:26, John Caron <address@hidden> wrote:
>
> > Hi Robert:
> >
> > Just sent a note about the branch changes. Sorry for the abrupt switch.
> >
> > As for the damn H5 bug, I just got a note from THG with an updated explanation of that, but I haven't had a chance to look at it yet. I will forward it to you. It would be truly awesome if you could find the problem and give us a pull request against master.
> >
> > John
> >
> > On Wed, Apr 29, 2015 at 8:01 PM, Schmunk, Robert B. (GISS-611.0)[TRINNOVIM, LLC] <address@hidden> wrote:
> >
> > John,
> >
> > Having not seen any pertinent activity on the THREDDS repository on GitHub regarding the bug in NJ's ability to read HDF5 datasets with huge attributes (using dense attribute storage), I have been trying to track down the bug myself. It's been a maze, comparing the NJ source code to the H5 format documentation, and I'd rather not think about how long I've been doing so. But nevertheless, I think I found where the problem is and how to fix it! Certainly I have managed to open some sample SMAP datasets with attributes of length > 100 kB.
> >
> > But on checking back on the THREDDS repo today, I found the 4.6.0 and 4.6.1 branches have both disappeared, which is very odd because they were there late last night. In fact, it seems like all the numbered branches have disappeared except for 4.5.6 and 5.0.0 (!?). Is something flaky going on with GitHub or is there something else happening behind the curtain with managing the NJ/THREDDS source code?
> >
> > rbs
> >
> >
> > --
> > Robert B. Schmunk
> > Webmaster / Senior Systems Programmer
> > NASA Goddard Institute for Space Studies
> > 2880 Broadway, New York, NY 10025
> >
> >
> >
>
> --
> Robert B. Schmunk
> Webmaster / Senior Systems Programmer
> NASA Goddard Institute for Space Studies
> 2880 Broadway, New York, NY 10025
>
>
>

--
Robert B. Schmunk
Webmaster / Senior Systems Programmer
NASA Goddard Institute for Space Studies
2880 Broadway, New York, NY 10025




NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.