
[netCDF #UNZ-282706]: NetCDF / HDF chunks with all fill values



I think Sean is correct, and such functionality does not exist.
It is not a bad idea, but implementing it might be difficult.
For HDF5, since it indexes chunks using B-trees, I think it allows
holes in the tree to indicate unwritten (or all-fill?) chunks.
But as far as I know, that information is not available through
the API.
One possibility is to use a special kind of compressor: when a
chunk being written is all-fill, it writes a very short special
value instead of the compressed data. On reading, the returned
uncompressed chunk carries a special tag to indicate that it should
be ignored. This might give an approximation to what you want.
But as with your attribute example, it might confuse software
that was not aware of the trick being used.
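
To make the compressor idea concrete, here is a minimal sketch in
plain Python. It is not an HDF5 or netCDF filter and uses neither
library's API; the one-byte tags and the function names
encode_chunk / decode_chunk are hypothetical, chosen only for
illustration. It assumes float64 chunk values and a fill value that
is not NaN.

```python
import zlib
from array import array

# Hypothetical one-byte tags; not part of any HDF5 or netCDF filter.
TAG_ALL_FILL = b"F"
TAG_DATA = b"D"


def encode_chunk(values, fill_value):
    """Return the bytes to store for one chunk of float64 values.

    Assumes fill_value is not NaN (NaN never compares equal to itself).
    """
    if all(v == fill_value for v in values):
        # All-fill chunk: store only the tag, no payload.
        return TAG_ALL_FILL
    payload = zlib.compress(array("d", values).tobytes())
    return TAG_DATA + payload


def decode_chunk(blob):
    """Return the chunk values, or None if the chunk was all-fill."""
    if blob[:1] == TAG_ALL_FILL:
        return None
    values = array("d")
    values.frombytes(zlib.decompress(blob[1:]))
    return values.tolist()


# Usage: a reader that skips all-fill chunks without decompressing them.
FILL = -999.0
stored = [encode_chunk(c, FILL) for c in ([FILL] * 4, [1.5, FILL, 2.0, 3.0])]
for blob in stored:
    chunk = decode_chunk(blob)
    if chunk is None:
        continue  # all-fill chunk: nothing to allocate or scan
    print(sum(v for v in chunk if v != FILL))
```

The caveat still applies: a reader that does not know about the
tag byte would see something other than the expected chunk data.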




> 
> I do not believe that functionality exists, but I am not certain. I've CC'd
> Ward and Dennis from the netCDF-C team to see if they know of any such
> functionality.
> 
> Cheers,
> 
> Sean
> 
> >
> > I was directed to you by Bob Simons at NOAA.  I work for NOAA CoastWatch on 
> > satellite data processing systems and visualization software, and have 
> > recently been looking at how to speed up our processing of HDF and NetCDF 
> > data files.  Some of the routines process a data file chunk by chunk in 
> > parallel, and could greatly benefit from the ability to ignore chunks in an 
> > input file if they contain all fill values (e.g., when forming a time series
> > composite of satellite images, or computing a mathematical expression on 
> > images).  This is especially the case when the chunks are compressed and 
> > the majority of the chunks in the file are all fill values, and there’s no 
> > need to read them at all, thereby bypassing the extra memory allocation and 
> > the processing to determine that all data values are fill values.
> >
> > I’ve looked at the HDF 4 user’s guide and _not_ found a routine that 
> > “previews” a given compressed chunk to check for an “all fill” condition, 
> > but I’ve not yet looked in the NetCDF 4 / HDF 5 documentation for such a 
> > routine.  I thought I would ask for help first, since I’d like that 
> > functionality for both HDF 4 and NetCDF 4.  I’ve also considered writing a 
> > special attribute to each variable on initial file creation that lists the 
> > valid chunks, but I’m wary that data processing that isn’t aware of that 
> > special attribute and updates the variable data could leave the attribute 
> > inconsistent with the data itself.  I believe that the I/O libraries 
> > themselves must keep tabs on chunks that consist of all fill values upon 
> > writing (for example, chunks that were never written to), but I don’t
> > know if that information is saved or exposed to the user upon subsequent 
> > reading.
> >
> >

=Dennis Heimbigner
  Unidata


Ticket Details
===================
Ticket ID: UNZ-282706
Department: Support netCDF
Priority: Low
Status: Open
===================