Re: [thredds] THREDDS Cache

Hi,

I've thrown together a preliminary web page providing links to some
documents with guidance for netCDF-4 compression and chunking:

  http://www.unidata.ucar.edu/software/netcdf/docs/compression.html

In addition to NCO's ncks, you can also use the nccopy utility that
comes with netCDF-4.1.2 or later to try out various compression and
chunking schemes.  For example, to convert netCDF-3 data to netCDF-4
data compressed at deflation level 1 and using 10x20x30 chunks for
variables that use (time,lon,lat) dimensions:

  nccopy -d1 -c time/10,lon/20,lat/30 netcdf3.nc netcdf4.nc
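To get a feel for what a chunk shape implies, it helps to work out the chunk size in bytes and the number of chunks per variable. Here's a quick sketch of that arithmetic in Python; the variable shape 100 x 200 x 300 is hypothetical, chosen only so the 10x20x30 chunks from the nccopy example divide it evenly:

```python
# Hypothetical (time, lon, lat) variable, chunked as in the nccopy example.
shape  = (100, 200, 300)   # assumed dimension sizes, for illustration only
chunks = (10, 20, 30)      # chunk shape passed via nccopy -c
FLOAT  = 4                 # bytes per 32-bit float

# Size of one chunk on disk before compression.
chunk_bytes = chunks[0] * chunks[1] * chunks[2] * FLOAT

# Total number of chunks, using ceiling division per dimension in case
# a dimension is not an exact multiple of the chunk length.
nchunks = 1
for dim, c in zip(shape, chunks):
    nchunks *= -(-dim // c)

print(chunk_bytes)   # 24000 bytes, about 23 KB per chunk
print(nchunks)       # 1000 chunks in the variable
```

Chunks in roughly this size range are small enough to read quickly but large enough that per-chunk overhead stays negligible; the right shape still depends on your dominant access pattern.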

As an example of the kind of performance differences seen when accessing
data stored contiguously versus with chunking, here's what we saw in one
benchmark, accessing all the data in a 3D float array of about 81
million values using cross sections in different orientations:

 432 x 432 x 432 array of floats with chunk sizes of 36 x 36 x 36

        Access                Contiguous  Chunking    Slowdown
                              (seconds)   (seconds)  or speedup

 2D x,y cross-section write      0.559      1.97    3.5 x slower
 2D x,z cross-section write     18.1        1.5      12 x faster
 2D y,z cross-section write    223          9.55     23 x faster
 2D x,y cross-section read       0.353      1.06      3 x slower
 2D x,z cross-section read       6.22       1.45    4.3 x faster
 2D y,z cross-section read      77.1        7.68     10 x faster

The moral is that with chunking, accesses that are already fast may slow
down a little while accesses that are very slow speed up a lot ...
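A back-of-envelope model (mine, not part of the benchmark) shows why the chunked timings are nearly uniform across orientations while the contiguous ones vary so widely. Assuming C-order storage with x varying fastest:

```python
# Idealized I/O model for the 432^3 array with 36^3 chunks from the
# table above; the disk-access counts are a model, not measurements.
N, C, FLOAT = 432, 36, 4
nchunks = N // C                  # 12 chunks along each dimension

# Contiguous layout (C order, x fastest): an x,y cross-section is one
# sequential run, but a y,z cross-section needs one tiny read per element.
xy_reads_contig = 1               # 432*432 values in a single run
yz_reads_contig = N * N           # separate 4-byte reads, one per element

# Chunked layout: any 2D cross-section cuts through the same number of
# chunks, so every orientation costs about the same I/O.
slice_chunks = nchunks * nchunks  # chunks touched, whatever the orientation
bytes_per_chunk = C**3 * FLOAT

print(yz_reads_contig)                 # 186624 scattered reads, contiguous case
print(slice_chunks)                    # 144 chunks per cross-section
print(slice_chunks * bytes_per_chunk)  # 26873856 bytes (~27 MB) per slice
```

So chunking trades a fixed, moderate cost (read 144 chunks, about 27 MB, for any slice) against a layout where one orientation is nearly free and another requires 186,624 scattered reads, which matches the "fast gets a little slower, slow gets much faster" pattern in the table.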

--Russ

> Oops.  Typing too fast:  I meant
> "  If they want NetCDF3 files they can use the NetCDF subset service
> or use tools like NCO that can read opendap and
>    generate NetCDF3..."
> 
> On Mon, May 2, 2011 at 4:57 PM, Rich Signell <rsignell@xxxxxxxx> wrote:
> > Jerry,
> >
> > You might also check into using NetCDF4 files with deflation instead
> > of .nc.gz.  Your users can still download as opendap or any of the
> > other services.  If they want netcdf4 files they can use the NetCDF
> > subset service or use tools like NCO that can read opendap and
> > generate NetCDF 3 files.  You can convert NetCDF3 to NetCDF4 using
> > level 1 deflation using NCO (ncks -4 -L 1 netcdf3.nc netcdf4.nc).
> > They should be about the same size as the nc.gz files, and will be
> > much faster to read since you don't have to uncompress the whole file.
> >
> > -Rich
> >
> >
> > On Mon, May 2, 2011 at 2:27 PM, jerry y pan <jerry.ypan@xxxxxxxxx> wrote:
> >> Hi John,
> >> Our TDS (4.2) uses some compressed netcdf files (*.nc.gz) and it works fine,
> >> except that the very first access to them is slow (relatively large files,
> >> about 400 MB each). Subsequent accesses are much faster, but access becomes
> >> slow again after a period of inactivity. I can see that TDS uncompresses
> >> these files to the temp data location; my question is whether TDS cleans up
> >> these temp files, forcing them to be decompressed again next time and
> >> causing the recurring slowness? If so, is there a way to keep the
> >> cache there permanently? Or, perhaps the faster response right after the
> >> first access is due to an in-memory cache? Is there any configuration
> >> I could tweak for the cache?
> >>
> >>
> >> Thanks,
> >> -Jerry Pan
> >>
> >> _______________________________________________
> >> thredds mailing list
> >> thredds@xxxxxxxxxxxxxxxx
> >> For list information or to unsubscribe, visit:
> >> http://www.unidata.ucar.edu/mailing_lists/
> >>
> >
> >
> >
> > --
> > Dr. Richard P. Signell  (508) 457-2229
> > USGS, 384 Woods Hole Rd.
> > Woods Hole, MA 02543-1598
> >
> 
> 
> 
> --
> 
> Dr. Richard P. Signell  (508) 457-2229
> USGS, 384 Woods Hole Rd.
> Woods Hole, MA 02543-1598
> 