The netCDF-4 format is an order of magnitude more complicated, with chunking, compression, and non-deterministic (perhaps order-dependent is a better term) data placement. The most useful optimisation is to try to make the commonly wanted subset fit inside a single chunk (or a small number of chunks).
Jon, have you profiled your code and are sure that disk reading is the bottleneck?
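The chunk-alignment advice above can be made concrete with a little index arithmetic. The following is only a sketch, not part of netCDF-Java or any netCDF API: the function name `chunks_touched` and the 256 x 256 chunk shape are assumptions chosen for illustration. It counts how many chunks a rectangular subset overlaps, which is roughly the number of chunks that would have to be read (and decompressed) to serve that subset.

```python
from math import prod

def chunks_touched(start, count, chunk_shape):
    """Count the chunks overlapped by a rectangular subset.

    start, count and chunk_shape each hold one integer per dimension.
    Hypothetical helper for illustration; not a netCDF API call.
    """
    per_dim = []
    for s, c, k in zip(start, count, chunk_shape):
        first = s // k              # index of the first chunk touched
        last = (s + c - 1) // k     # index of the last chunk touched
        per_dim.append(last - first + 1)
    return prod(per_dim)

# Assumed layout: a 2D variable stored in 256 x 256 chunks.
chunk_shape = (256, 256)

# A 256 x 256 subset aligned to chunk boundaries.
aligned = chunks_touched((256, 512), (256, 256), chunk_shape)

# The same-sized subset, shifted so it straddles chunk boundaries.
straddling = chunks_touched((200, 500), (256, 256), chunk_shape)

print(aligned, straddling)  # prints: 1 4
```

Under these assumptions, the same amount of requested data costs one chunk read when aligned and four when it straddles boundaries, which is why matching the chunk shape to the commonly requested subset matters.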
On 7/15/2010 11:39 AM, Joe Sirott wrote:
Hi Jon,

Benchmarks like these can be quite tricky, due to the interaction of the application with the OS. Unless you purge the OS page cache each time you run your benchmark, your application (after the first test) isn't reading data from disk but is instead copying data from the disk page cache into local buffers. The benchmark will then likely be CPU bound, and execution time will be dominated by type conversion from raw buffered data arrays into Java types. That would account for the strange results you are seeing when reading 4K rather than 8K data chunks.

Also, for more info on netcdf-4 chunking/compression, Unidata has a nice introduction at http://hdfeos.org/workshops/ws13/presentations/day1/HDF5-EOSXIII-Advanced-Chunking.ppt

Cheers, Joe

Jon Blower wrote:

Hi John,

Thanks for this.

> netcdf-3 IOSP uses a buffered RandomAccessFile implementation, default 8096 byte buffer, which always reads 8096 bytes at a time. the only useful optimisation is to change the buffer size.

Good to know, thanks. I would have thought that this would mean that there's no point reading data of less than 8096 bytes. But in my tests I see that even below this value there's a linear relationship between the size of data being read and the time to read the data (i.e. it's quicker to read 4K than 8K). I don't quite understand this.

Are there any specs for the NetCDF-4 format that I could read? I'd like to know more about how the data are compressed, and how much data actually need to be read from disk to get a subset.
Cheers, Jon

-----Original Message-----
From: netcdf-java-bounces@xxxxxxxxxxxxxxxx [mailto:netcdf-java-bounces@xxxxxxxxxxxxxxxx] On Behalf Of John Caron
Sent: 15 July 2010 00:26
To: netcdf-java@xxxxxxxxxxxxxxxx
Subject: Re: [netcdf-java] Reading contiguous data in NetCDF files

Hi Jon:

On 7/14/2010 2:51 PM, Jon Blower wrote:

> Hi, I don't know anything about how data in NetCDF files are organized, but intuitively, I would think that, for a general 2D array, the data at points [j,i] and [j,i+1] would be contiguous on disk. Is this right? (i is the fastest-varying dimension)

yes, for variables in netcdf-3 files

> I might also suppose that, for an array of size [nj,ni], that the data at points [j,ni-1] and [j+1,0] would also be contiguous. Is this true?

yes, for variables in netcdf-3 files that don't use the unlimited dimension

> If so, is there a method in Java-NetCDF that would allow me to read these two points (and only these two points) in a single operation?

netcdf-3 IOSP uses a buffered RandomAccessFile implementation, default 8096 byte buffer, which always reads 8096 bytes at a time. the only useful optimisation is to change the buffer size.

> (Background: I'm trying to improve the performance of ncWMS by optimising how data is read from disk. This seems to involve striking a balance between the number of individual read operations and the size of each read operation.)
>
> Thanks, Jon
>
> --
> Dr Jon Blower
> Technical Director, Reading e-Science Centre
> Environmental Systems Science Centre
> University of Reading
> Harry Pitt Building, 3 Earley Gate
> Reading RG6 6AL. UK
> Tel: +44 (0)118 378 5213
> Fax: +44 (0)118 378 6413
> j.d.blower@xxxxxxxxxxxxx
> http://www.nerc-essc.ac.uk/People/Staff/Blower_J.htm

_______________________________________________
netcdf-java mailing list
netcdf-java@xxxxxxxxxxxxxxxx
For list information or to unsubscribe, visit: http://www.unidata.ucar.edu/mailing_lists/
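The two layout questions in this thread, and the point about the fixed-size read buffer, can be sketched with simple arithmetic. This is an illustration under stated assumptions, not netCDF-Java code: `element_offset` and `buffer_fills` are hypothetical helpers, the 100-column array of 4-byte values is made up, and the buffer model is deliberately simplified (each refill is assumed to fetch a full buffer starting at the current position). The 8096-byte default is the figure quoted in the thread.

```python
from math import ceil

def element_offset(j, i, ni, elem_size):
    """Byte offset of element [j, i] from the start of the variable's data.

    Assumes a row-major [nj, ni] array with i varying fastest, stored
    contiguously (a netcdf-3 variable without the unlimited dimension).
    """
    return (j * ni + i) * elem_size

def buffer_fills(length, buffer_size=8096):
    """Buffer refills needed to read `length` consecutive bytes.

    Simplified model: every refill fetches buffer_size bytes.
    """
    return max(1, ceil(length / buffer_size))

ni, elem_size = 100, 4  # hypothetical: 100 columns of 4-byte values

# [j, i] and [j, i+1] sit next to each other on disk.
gap_same_row = element_offset(3, 8, ni, elem_size) - element_offset(3, 7, ni, elem_size)

# So do the end of one row, [j, ni-1], and the start of the next, [j+1, 0].
gap_row_wrap = element_offset(4, 0, ni, elem_size) - element_offset(3, ni - 1, ni, elem_size)

print(gap_same_row, gap_row_wrap)  # prints: 4 4

# Reads of 4096 and 8096 bytes both need one refill; 8097 bytes needs two.
print(buffer_fills(4096), buffer_fills(8096), buffer_fills(8097))  # prints: 1 1 2
```

In this model a 4K read and an 8K read each cost a single buffer refill, so a timing difference between them would have to come from somewhere other than the disk fetch, which is consistent with the page-cache and type-conversion explanation given above.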