
Re: NJ and HDF



On 3/12/2012 2:04 PM, Schmunk, Robert B. (GISS-611.0)[SIGMA SPACE CORPORATION] wrote:
John,

A couple of issues have recently come up regarding how NJ handles HDF5
datasets as I have assisted some Panoply users trying to open and plot
HDF5 data.

First, the HDF5 spec allows for an offset superblock. A Panoply user was
trying to open an HDF file in which the superblock was offset by 1,048,576
(512*2048) bytes, and the NJ library was throwing back an exception that
the dataset did not look like valid CDM. In poking around the NJ source code,
I found that the ucar.nc2.iosp.hdf5.H5header class is coded to look for an
offset superblock, but that it is also coded to stop checking beyond
maxHeaderPos = 500,000 bytes. Is there a particular reason why this value
was chosen, or why a max offset was specified at all?

isValidFile() has to be fast because it's applied to all files, so scanning through arbitrarily large files is not a good idea. 500K is already, I think, 11 disk seeks (which is probably already too long, if you assume an average seek time of 10 ms).

500K is just a guess as to how big an offset is actually in use.

Could you add an alternative interface where the user specifies the file type? That could solve the problem with some code tweaking.
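For context, a sketch of the scan being discussed (hypothetical code, not NJ's actual implementation): per the HDF5 file format spec, the 8-byte signature may sit at byte 0 or at offsets 512, 1024, 2048, ... (doubling each time), which is how the user's 1,048,576-byte offset arises. Checking the doubling offsets up to 500,000 bytes is offset 0 plus 512*2^k for k = 0..9, i.e. the 11 seeks mentioned above. The cap maxHeaderPos bounds that cost:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.Arrays;

// Hypothetical sketch of locating an offset HDF5 superblock.
// The signature bytes and the doubling-offset rule come from the HDF5
// file format spec; the cap on the scan mirrors NJ's maxHeaderPos idea.
public class H5SuperblockScan {
    // HDF5 format signature: \211 H D F \r \n \032 \n
    private static final byte[] MAGIC =
        {(byte) 0x89, 'H', 'D', 'F', '\r', '\n', 0x1a, '\n'};

    /** Returns the superblock offset, or -1 if none found at or below maxHeaderPos. */
    static long findSuperblock(RandomAccessFile raf, long maxHeaderPos)
            throws IOException {
        byte[] buf = new byte[8];
        long limit = Math.min(raf.length() - 8, maxHeaderPos);
        long pos = 0;
        while (pos <= limit) {
            raf.seek(pos);                       // one disk seek per candidate
            raf.readFully(buf);
            if (Arrays.equals(buf, MAGIC)) return pos;
            pos = (pos == 0) ? 512 : pos * 2;    // 0, 512, 1024, 2048, ...
        }
        return -1;
    }
}
```

With maxHeaderPos = 500,000 this probes 11 positions; raising the cap to cover the user's file would cost 12 seeks.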


Second, in a different case, the NJ library was throwing back just a
NullPointerException, with stack trace

    java.lang.NullPointerException
        at ucar.nc2.iosp.hdf5.H5header.extendDimension(H5header.java:536)
        at ucar.nc2.iosp.hdf5.H5header.findDimensionLists(H5header.java:476)
        at ucar.nc2.iosp.hdf5.H5header.makeNetcdfGroup(H5header.java:378)
        at ucar.nc2.iosp.hdf5.H5header.makeNetcdfGroup(H5header.java:390)
        at ucar.nc2.iosp.hdf5.H5header.makeNetcdfGroup(H5header.java:390)
        at ucar.nc2.iosp.hdf5.H5header.read(H5header.java:182)
        at ucar.nc2.iosp.hdf5.H5iosp.open(H5iosp.java:111)
        at ucar.nc2.NetcdfFile.<init>(NetcdfFile.java:1468)
        at ucar.nc2.NetcdfFile.open(NetcdfFile.java:870)
        at ucar.nc2.NetcdfFile.open(NetcdfFile.java:503)
        at ucar.nc2.dataset.NetcdfDataset.openOrAcquireFile(NetcdfDataset.java:693)
        at ucar.nc2.dataset.NetcdfDataset.openDataset(NetcdfDataset.java:426)
        at ucar.nc2.dataset.NetcdfDataset.acquireDataset(NetcdfDataset.java:521)
        at ucar.nc2.dataset.NetcdfDataset.acquireDataset(NetcdfDataset.java:498)
        at gov.nasa.giss.netcdf.NcDataset.init(NcDataset.java:119)

As best I can tell, this has to do with the handling of DIMENSION_LIST
attributes on the dataset variables, and something is going wrong because
these attributes have numeric values rather than string. If I open the
problem dataset with the HDFView app, it reports

    DIMENSION_LIST = 110144, 111080, 112488

and it turns out that 110144 corresponds to a timestamp variable, 111080
a latitude variable, and 112488 a longitude variable.

In this latter case, do you think this is a bug in NJ's handling of the
DIMENSION_LIST attribute, or have the dataset writers done something they
shouldn't have?
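A note on what those numbers are: under the HDF5 dimension-scales convention, DIMENSION_LIST holds object references, which HDFView displays as raw file addresses; so 110144/111080/112488 are the addresses of the time, latitude, and longitude datasets. As a hedged illustration of the failure mode (hypothetical code, not NJ's actual logic): a reader resolves each address against the datasets it has already parsed, and an address that never got registered, or a reference stored in an unexpected type, makes an unguarded lookup return null:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of resolving DIMENSION_LIST object references.
public class DimensionListResolver {

    /** Maps HDF5 object addresses to names of already-parsed datasets. */
    private final Map<Long, String> objectsByAddress = new HashMap<>();

    void register(long address, String datasetName) {
        objectsByAddress.put(address, datasetName);
    }

    /**
     * Resolves one DIMENSION_LIST reference. Without the null guard, the
     * caller would dereference null and die with a bare NPE, much like
     * the extendDimension() stack trace above.
     */
    String resolve(long refAddress) {
        String name = objectsByAddress.get(refAddress);
        if (name == null) {
            throw new IllegalStateException(
                "DIMENSION_LIST reference " + refAddress
                + " does not resolve to a known dataset");
        }
        return name;
    }
}
```

Whether the file or the reader is at fault turns on whether those references are well-formed, which is why looking at the actual file matters.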

Not sure; can you send me the file?


BTW: The toolsUI 4.2 jar shows similar results when I use it to access the
problem datasets. However, I am unable to use the toolsUI 4.3 jar, as the
main application window never appears after the splash window goes away.
This appears to be a bean handling problem with the spring framework.

Yes, we are fixing that problem, thanks.


rbs


--
Robert B. Schmunk
NASA Goddard Institute for Space Studies, 2880 Broadway, New York, NY 10025
212-678-5535