Hi John

>> Yes we extract 10-year plus time series of hourly hindcast data from
>> six-hourly model run files, either one point or more usually a
>> handful of them. These are then processed for climatology statistics,
>> to look at specific events, or supplied to a user. Converting them to
>> e.g. monthly files does improve extraction time somewhat, but not
>> nearly as much as permuting them so time is the inner dimension. To
>> lowest order, permuting reduces extraction time by
>> 1/(length_of_time_dimension).
>>
>
> If im thinking about this correctly, this will be true when you only want 1
> point, since each time point will cost you a disk read.
>
> If you want a "handful", for example 6 points along the lon (the inner
> dimension (time, lat, lon) for current implementation), then im thinking you
> would get the same performance.

Yes, I think you're right for this dataset in its current form, but an
inner aggregation would open up the possibility of dramatically
increasing performance by concatenating the files together into, say,
daily, weekly, or monthly files. I've tried this with time as the outer
dimension and it increases performance a bit, I think because it reduces
the time it takes to build the FMRC and the overhead of opening extra
files. But if disk seek time is around 5ms and read time for a double is
around 0.002ms, then if all the data are contiguous on disk you can read
in thousands of data points in the time it takes to do a single seek.
I'm anything but an expert on IO so I don't know how far this approach
would scale, but we've tested it up to daily files of hourly data (for a
different dataset, accessing the individual files in a loop using
Java-netCDF) and did get very close to a factor of 24 speedup.

> We are working on an experimental "ncstream" protocol that allows a writer to
> write data in any order, and the reader rearranges as needed, but its not
> ready for use yet.

Do you know when this might be available?
I need to decide whether it's worth writing something bespoke for
this... and if there's a solution in the pipeline in the libraries it's
probably not.

> BTW, who is consuming the output? An internal process that you control, or
> ???

They're used by an analyst. Most often some statistical analysis is done
and the results used to create a report for an external user at their
site.

Thanks for your help, I'd got about as far as I could from looking at
the APIs.

kind regards,
Martin

--
Martin Price
Ocean Forecasting Research and Development
Met Office, FitzRoy Road, Exeter, EX1 3PB, United Kingdom
Tel: +44 (0)1392 886982 Fax: +44 (0)1392 885681
email: mpricemetoffice at googlemail dot com http://www.metoffice.gov.uk
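
The seek-versus-read estimate in the message above can be written out as a small cost model. This is only a rough sketch under stated assumptions: the 5ms seek and 0.002ms per-double read figures are the ones quoted in the message, the function names are made up for illustration, and the model ignores caching, chunking, compression, file-open overhead, and the case where the requested points happen to be adjacent on disk.

```python
# Rough cost model for pulling a time series out of gridded files.
# Assumptions (hypothetical, for illustration only):
#   - every contiguous run of values on disk costs one seek
#   - every value read costs a fixed read time
#   - requested grid points are not adjacent to each other on disk

SEEK_MS = 5.0     # disk seek time quoted in the message
READ_MS = 0.002   # read time for one double quoted in the message


def extraction_time_ms(n_times, n_points, time_is_inner):
    """Estimate the time to read n_times values at each of n_points."""
    if time_is_inner:
        # All time steps for one point are contiguous: one seek per point.
        seeks = n_points
    else:
        # Time is the outer dimension: one seek per time step per point.
        seeks = n_times * n_points
    reads = n_times * n_points
    return seeks * SEEK_MS + reads * READ_MS


def speedup(n_times, n_points=1):
    """Ratio of time-outer to time-inner extraction time."""
    outer = extraction_time_ms(n_times, n_points, time_is_inner=False)
    inner = extraction_time_ms(n_times, n_points, time_is_inner=True)
    return outer / inner


def values_per_seek():
    """How many doubles can be read in the time of a single seek."""
    return SEEK_MS / READ_MS


if __name__ == "__main__":
    print("doubles readable per seek:", values_per_seek())
    print("speedup for 24 hourly steps per file:", speedup(24))
```

With 24 time steps per file the modelled speedup comes out a little under 24, in line with the daily-file test described in the message. For much longer time dimensions the read term starts to matter, so in this model the gain levels off at about the number of doubles readable per seek.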