
Re: THREDDS repo and HDF5 datasets with huge attributes



John,

One SMAP dataset I have been using that has a huge attribute of length > 200 kB 
is at
https://www.dropbox.com/s/4q0iwbw8iv3puie/SMAP_L4_SM_aup_20140115T030000_V05007_001.h5?dl=0
The file is about 90 MB.

One thing about my fix: You’ll see that I have written the calculation as

            offset = h5.makeIntFromBytes(heapId, 1, (heapId.length-1));

but if I read Dana’s e-mail correctly, it should say

            offset = h5.makeIntFromBytes(heapId, 1, h5.sizeLengths);

With the sample dataset above, heapId.length-1 = 7 while h5.sizeLengths = 8. My 
experience is that with the latter, an ArrayIndexOutOfBoundsException gets 
thrown, while my way works. So maybe I’m not interpreting Dana’s e-mail right. 
:)
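
For what it’s worth, here is a minimal, self-contained sketch of why the two 
calculations behave differently. The heap ID layout (one version/type byte 
followed by a little-endian offset) matches my reading of the spec, but the 
helper below is only a stand-in for h5.makeIntFromBytes, and the 8-byte heap ID 
is just the case I saw in the SMAP file, so treat it as an illustration rather 
than the actual NJ code:

    // Illustration only: an 8-byte fractal heap ID for a huge object,
    // i.e. one version/type byte followed by the offset of the object,
    // stored little-endian.
    public class HeapIdOffsetSketch {

        // Stand-in for h5.makeIntFromBytes(bytes, start, n): assemble n
        // bytes starting at 'start' into a little-endian integer.
        static long intFromBytesLE(byte[] bytes, int start, int n) {
            long value = 0;
            for (int i = 0; i < n; i++) {
                value |= (bytes[start + i] & 0xFFL) << (8 * i);
            }
            return value;
        }

        public static void main(String[] args) {
            byte[] heapId = new byte[8];   // heapId.length = 8, as in the SMAP file
            heapId[0] = (byte) 0x30;       // version/type byte, not part of the offset

            // My fix: read the remaining heapId.length - 1 = 7 bytes.
            long offset = intFromBytesLE(heapId, 1, heapId.length - 1);
            System.out.println("offset = " + offset);

            // Reading h5.sizeLengths = 8 bytes starting at index 1 runs one
            // byte past the end of the 8-byte array.
            try {
                intFromBytesLE(heapId, 1, 8);
            } catch (ArrayIndexOutOfBoundsException e) {
                System.out.println("reading sizeLengths bytes from index 1 overruns the heap ID");
            }
        }
    }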

BTW: I’ll be filing another pull request related to the code-signing issue. The 
toolsUI-4.6.1-SNAPSHOT.jar is having the same problem that I reported about the 
netCDFAll-4.6.1-SNAPSHOT.jar a week or two back.

rbs





On Apr 30, 2015, at 15:27, John Caron <address@hidden> wrote:

> Thanks a million. I think I have a sample file for unit tests, but if you 
> have multiple variants, can you send them?
> 
> On Thu, Apr 30, 2015 at 12:45 PM, Schmunk, Robert B. (GISS-611.0)[TRINNOVIM, 
> LLC] <address@hidden> wrote:
> 
> John,
> 
> See pull request #131.
> 
> Basically, the code for getting the Fractal Heap ID for a huge object was bad.
> 
> Note that the code was also only trying to figure out the heap ID for a huge 
> object of subtype 1, but my reading of the file format specification 
> (https://www.hdfgroup.org/HDF5/doc/H5.format.html) indicates that the same 
> method should be applied to subtype 2. The sample SMAP files I was 
> working with used subtype 1.
> 
> rbs
> 
> 
> 
> On Apr 30, 2015, at 11:26, John Caron <address@hidden> wrote:
> 
> > Hi Robert:
> >
> > Just sent a note about the branch changes. Sorry for the abrupt switch.
> >
> > As for the damn H5 bug, I just got a note from THG with an updated 
> > explanation of that, but I haven't had a chance to look at it yet. I will 
> > forward it to you. Would be truly awesome if you could find the problem and 
> > give us a pull request against master.
> >
> > John
> >
> > On Wed, Apr 29, 2015 at 8:01 PM, Schmunk, Robert B. (GISS-611.0)[TRINNOVIM, 
> > LLC] <address@hidden> wrote:
> >
> > John,
> >
> > Having not seen any pertinent activity on the THREDDS repository on GitHub 
> > regarding the bug in NJ’s ability to read HDF5 datasets with huge 
> > attributes (using dense attribute storage), I have been trying to track 
> > down the bug myself. It’s been a maze, comparing the NJ source code to the 
> > H5 format documentation, and I’d rather not think about how long I’ve been 
> > doing so. But nevertheless, I think I found where the problem is and how to 
> > fix it! Certainly I have managed to open some sample SMAP datasets with 
> > attributes of length > 100 kB.
> >
> > But on checking back on the THREDDS repo today, I found the 4.6.0 and 4.6.1 
> > branches have both disappeared, which is very odd because they were there 
> > late last night. In fact, it seems like all the numbered branches have 
> > disappeared except for 4.5.6 and 5.0.0 (!?). Is something flaky going on 
> > with GitHub or is there something else happening behind the curtain with 
> > managing the NJ/THREDDS source code?
> >
> > rbs
> >
> >
> > --
> > Robert B. Schmunk
> > Webmaster / Senior Systems Programmer
> > NASA Goddard Institute for Space Studies
> > 2880 Broadway, New York, NY 10025
> >
> >
> >
> 
> --
> Robert B. Schmunk
> Webmaster / Senior Systems Programmer
> NASA Goddard Institute for Space Studies
> 2880 Broadway, New York, NY 10025
> 
> 
> 

--
Robert B. Schmunk
Webmaster / Senior Systems Programmer
NASA Goddard Institute for Space Studies
2880 Broadway, New York, NY 10025