Re: reading raw (packed) data from NetCDF files and avoiding missing-value check

To: "John Caron" <caron@xxxxxxxxxxxxxxxx>
Subject: Re: reading raw (packed) data from NetCDF files and avoiding missing-value check
From: "Jon Blower" <jdb@xxxxxxxxxxxxxxxxxxxx>
Date: Fri, 27 Oct 2006 12:22:07 +0100

Hi John (cc list),

Thanks for your help - I found a solution that works well in my app.
As you suggested, I opened the dataset without enhancement, then added
the coordinate systems:

           nc = NetcdfDataset.openDataset(location, false, null);
           // Add the coordinate systems
           CoordSysBuilder.addCoordinateSystems(nc, null);
           GridDataset gd = new GridDataset(nc);
           GeoGrid geogrid = gd.findGridByName(varID);

I then create an EnhanceScaleMissingImpl:

           EnhanceScaleMissingImpl enhanced =
               new EnhanceScaleMissingImpl((VariableDS) geogrid.getVariable());

(Unfortunately this class is package-private so I made a copy from the
source code in my local directory.  Could this class be made public
please?)

This means that when I read data using geogrid.subset() it does not
check for missing values or unpack the data and is therefore quicker.
I then do enhanced.convertScaleOffsetMissing() only on the individual
values I need to work with.  Seems to work well and is pretty quick.
Is there anything dangerous in the above?
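
For reference, the per-value conversion I mean amounts to something like
the sketch below. This is not the library's EnhanceScaleMissingImpl: the
class name PackedValueConverter and its constructor parameters are made
up for illustration, and it assumes short-packed data with a single
missing value and the usual scale_factor/add_offset convention.

```java
/**
 * Minimal sketch of per-value unpacking (hypothetical helper, not part of nj22).
 * Assumes the variable is packed as shorts with one missing-value marker.
 */
class PackedValueConverter {
    private final double scaleFactor;  // value of the scale_factor attribute
    private final double addOffset;    // value of the add_offset attribute
    private final short missingValue;  // packed value that marks missing data

    PackedValueConverter(double scaleFactor, double addOffset, short missingValue) {
        this.scaleFactor = scaleFactor;
        this.addOffset = addOffset;
        this.missingValue = missingValue;
    }

    /** Returns NaN for a missing value, otherwise packed * scale_factor + add_offset. */
    double convert(short packed) {
        if (packed == missingValue) {
            return Double.NaN;
        }
        return packed * scaleFactor + addOffset;
    }
}
```

The raw array is read once and convert() is called only for the points
that end up in the image, so the cost scales with the number of pixels
drawn rather than with the size of the subset that was read.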

Thanks again,
Jon


On 26/10/06, John Caron <caron@xxxxxxxxxxxxxxxx> wrote:

Hi Jon:

Jon Blower wrote:
> Hi John,
>
> I need some of the functionality of a GridDataset to allow me to read
> coordinate system information.  Also, I might be opening an NcML
> aggregation.  Is it sensible to use NetcdfDataset.getReferencedFile()?
> In the case of an NcML aggregation, is it possible to get a handle to
> a specific NetcdfFile (given relevant information such as the
> timestep)?

You are getting into the internals, so it's a bit dangerous.

I think this will work:

 NetcdfDataset ncd = NetcdfDataset.openDataset(location, false, null); // don't enhance
 ucar.nc2.dataset.CoordSysBuilder.addCoordinateSystems(ncd, null); // add coord info
 GridDataset gds = new GridDataset(ncd); // make into a grid

BTW, you will want to switch to the new GridDataset in ucar.nc2.dt.grid when 
you start using 2.2.17. It should be compatible, let me know.


>
> On a related note, is it efficient to read data from a NetcdfFile (or
> NetcdfDataset) point-by-point?  I have been assuming that reading
> contiguous chunks of data is more efficient than reading individual
> points, even if it means reading more data than I actually need, but
> perhaps this is not the case?  Unfortunately I'm not at my usual
> computer so I can't do a quick check myself.  If reading data
> point-by-point is efficient (enough) my problem goes away.

It depends on data locality. If the points are close together on disk, then 
they will likely already be in the random access file buffer. The bigger the 
buffer, the more likely this is; you can try different buffer sizes with:

NetcdfDataset openDataset(String location, boolean enhance, int buffer_size, 
ucar.nc2.util.CancelTask cancelTask, Object spiObject);
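
A rough way to compare the two access patterns without involving the
netCDF library at all is sketched below. It uses only java.io and
java.nio on a flat file of big-endian shorts; the class and method names
are hypothetical, and a real NetCDF file has a header and variable
layout that this ignores.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

/** Sketch only: compares per-point seeks with one contiguous read on a flat file of shorts. */
class PointVsChunkRead {

    /** Seeks to each requested element index and reads one big-endian short per seek. */
    static short[] readPointByPoint(RandomAccessFile raf, long[] indices) throws IOException {
        short[] out = new short[indices.length];
        for (int i = 0; i < indices.length; i++) {
            raf.seek(indices[i] * 2L);  // 2 bytes per short
            out[i] = raf.readShort();
        }
        return out;
    }

    /** Reads the contiguous range covering all indices in one call, then picks the requested points. */
    static short[] readChunkThenPick(RandomAccessFile raf, long[] indices) throws IOException {
        if (indices.length == 0) {
            return new short[0];
        }
        long first = Long.MAX_VALUE;
        long last = Long.MIN_VALUE;
        for (long idx : indices) {
            first = Math.min(first, idx);
            last = Math.max(last, idx);
        }
        byte[] buf = new byte[(int) ((last - first + 1) * 2)];
        raf.seek(first * 2L);
        raf.readFully(buf);
        ByteBuffer bb = ByteBuffer.wrap(buf);  // big-endian by default, matching readShort()
        short[] out = new short[indices.length];
        for (int i = 0; i < indices.length; i++) {
            out[i] = bb.getShort((int) ((indices[i] - first) * 2));
        }
        return out;
    }
}
```

Timing both methods on your own index patterns should show where the
crossover is: widely scattered points make the chunk read pull in a lot
of unused data, while tightly clustered points favour it.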



>
> Thanks, Jon
>
> On 26/10/06, John Caron <caron@xxxxxxxxxxxxxxxx> wrote:
>
>> Hi Jon:
>>
>> One obvious thing would be to open it as a NetcdfFile, not a
>> GridDataset. Is that a possibility?
>>
>> Jon Blower wrote:
>> > Hi,
>> >
>> > I'm writing an application that reads data from NetCDF files and
>> > produces images.  I've noticed (through profiling) that a slow point
>> > in the data reading process is the unpacking of packed data (i.e.
>> > applying scale and offset) and checking for missing values.  I would
>> > like to minimize the use of these calls.
>> >
>> > To cut a long post short, I would like to find a low-level function
>> > that allows me to read the packed data, exactly as they appear in the
>> > file.  I can then "manually" apply the unpacking and missing-value
>> > checks only to those data points that I need to display.
>> >
>> > I'm using nj22, version 2.2.16.  I've tried reading data from
>> > GeoGrid.subset() but this (of course) performs the unpacking.  I then
>> > tried getting the "unenhanced" variable object through
>> > GeoGrid.getVariable().getOriginalVariable(), but (unexpectedly) this
>> > also seems to perform unpacking and missing-value checks - I expected
>> > it to give raw data.
>> >
>> > Can anyone help me to find a function for reading raw (packed) data
>> > without performing missing-value checks?
>> >
>> > Thanks in advance,
>> > Jon
>> >
>>
>> ==============================================================================
>>
>> To unsubscribe netcdf-java, visit:
>> http://www.unidata.ucar.edu/mailing-list-delete-form.html
>> ==============================================================================
>>
>>
>>
>
>



--
--------------------------------------------------------------
Dr Jon Blower              Tel: +44 118 378 5213 (direct line)
Technical Director         Tel: +44 118 378 8741 (ESSC)
Reading e-Science Centre   Fax: +44 118 378 6413
ESSC                       Email: jdb@xxxxxxxxxxxxxxxxxxxx
University of Reading
3 Earley Gate
Reading RG6 6AL, UK
--------------------------------------------------------------


Follow-Ups:
- Re: reading raw (packed) data from NetCDF files and avoiding missing-value check
  - From: Don Murray

References:
- reading raw (packed) data from NetCDF files and avoiding missing-value check
  - From: Jon Blower
- Re: reading raw (packed) data from NetCDF files and avoiding missing-value check
  - From: John Caron
- Re: reading raw (packed) data from NetCDF files and avoiding missing-value check
  - From: John Caron
