
[netCDF #UAU-670796]: Rechunking of a huge NetCDF file

Hi Henri,

> >> The "-r" flag didn't make a difference for a small test file, but
> >> I'll have to try it with a bigger one.
> >
> > I never found a case in which the "-r" flag saved time, but was hoping
> > it might work for your extreme rechunking case.
> I would actually expect it to be very useful. It also seems that I don't
> have that flag on all platforms, maybe it depends on some flags during 
> compilation?

It just depends on the version of netCDF from which you built nccopy.
It must be netCDF C version 4.2.1 (June 2012) or later to have nccopy 
support for diskless access with "-w" and "-r".
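
A minimal sketch of that version check, assuming the version string has
already been obtained somehow (for example from `nc-config --version`, which
prints something like `netCDF 4.2.1`). The function name `supports_diskless`
is hypothetical and not part of any netCDF tool:

```python
import re

# First netCDF C release whose nccopy has "-w" and "-r", per the reply above.
MIN_DISKLESS = (4, 2, 1)

def supports_diskless(version_text):
    """Return True if version_text names netCDF C 4.2.1 or later."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    # A missing patch level (e.g. "4.2") is treated as 0.
    version = tuple(int(part or 0) for part in match.groups())
    return version >= MIN_DISKLESS
```

Comparing tuples of integers rather than strings keeps a version such as
4.10.0 ordered after 4.2.1.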

> > Thanks, I downloaded the file and see the problem.  In small.nc, you
> > have lon a dimension of size 3, but you are specifying rechunking
> > along that dimension with lon/4, specifying a chunk size larger than
> > the fixed dimension size.  That's apparently something we should check
> > in nccopy.
> >
> > As a workaround, if you change this to lon/3, the nccopy completes
> > without error.  This should be an easy fix, which will be in the next
> > release.
> Ok, that works. It seems then that I have misunderstood chunking. Like if 
> it's directly dependent on the lengths of dimensions, what does it then 
> mean to chunk unevenly (like lon/2 in this case)? (No need to explain if 
> it's just a technical detail.)

Chunking unevenly works fine, and just results in some remainder in the 
last chunk that has no data in it (yet).
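
To make the arithmetic concrete, here is a rough sketch (not nccopy's actual
code) that parses a chunk spec written the way "-c" takes it, flags chunk
lengths larger than a fixed dimension, and counts the unused slots in the last
chunk. All function names are hypothetical; the only size taken from this
thread is lon = 3.

```python
def parse_chunkspec(spec):
    """Parse a chunk spec such as 'time/99351,lat/1,lon/1' into a dict."""
    chunks = {}
    for item in spec.split(","):
        name, length = item.split("/")
        chunks[name.strip()] = int(length)
    return chunks

def oversized_chunks(chunks, fixed_dim_sizes):
    """Return names of dimensions whose chunk length exceeds the fixed size."""
    return [name for name, length in chunks.items()
            if name in fixed_dim_sizes and length > fixed_dim_sizes[name]]

def chunk_layout(dim_size, chunk_len):
    """Return (number of chunks, unused slots in the last chunk)."""
    n_chunks = (dim_size + chunk_len - 1) // chunk_len  # ceiling division
    return n_chunks, n_chunks * chunk_len - dim_size
```

With lon of size 3, `oversized_chunks(parse_chunkspec("lon/4"), {"lon": 3})`
returns `["lon"]`, the case that triggered the error, while
`chunk_layout(3, 2)` returns `(2, 1)`: two chunks, with one empty slot in the
second.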

> > Small chunks cause lots of overhead in HDF5, but I'm not sure whether
> > that's the problem.  I'll have to look at this more closely and
> > respond when I've had a chance to see what's going on.
> I did make some interesting observations. I had previously overlooked the 
> "-u" flag (its documentation is somewhat confusing...?). The time 
> coordinate has been unlimited in my files. On my Macbook Air:
> nccopy -w -c time/99351,lat/1,lon/1 small.nc test1.nc  11.59s user 0.07s 
> system 99% cpu 11.723 total
> nccopy -u small.nc small_u.nc
> nccopy -w -c time/99351,lon/1,lat/1 small_u.nc test2.nc  0.07s user 0.04s 
> system 84% cpu 0.127 total
> That's amazing!

It's because we use the same default chunk length of 1 as HDF5
does for unlimited dimensions.  But when you use -u, it makes 
all dimensions fixed, and then the default chunk length is larger.
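
One way to picture why a source chunk length of 1 along time could be so
costly: each output chunk that is long in time has to be assembled from many
source chunks. The sketch below only counts chunks and says nothing about how
nccopy really schedules its reads. The function name is hypothetical, and the
"larger" chunk length of 1000 is an assumed value for illustration, not the
actual default.

```python
from math import prod

def source_chunks_touched(target_chunk, source_chunk):
    """Count source chunks overlapping the target chunk at the array origin."""
    if len(target_chunk) != len(source_chunk):
        raise ValueError("chunk shapes must have the same number of dimensions")
    # Ceiling division per dimension, multiplied across dimensions.
    return prod(-(-t // s) for t, s in zip(target_chunk, source_chunk))

# Target chunk time/99351,lat/1,lon/1 built from a source chunked 1 along time:
slow = source_chunks_touched((99351, 1, 1), (1, 1, 1))
# Same target, assuming a source chunk length of 1000 along time:
fast = source_chunks_touched((99351, 1, 1), (1000, 1, 1))
```

Under these assumptions `slow` is 99351 and `fast` is 100, so every output
chunk needs roughly a thousand times fewer source chunks.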

> However, when I ran a similar test with a bigger (11GB) subset of my actual 
> data, this time on a cluster (under SLURM), there was no difference between 
> the two files. Maybe my small.nc is simply too small to reveal actual 
> differences and everything is hidden behind overheads?

That's possible, but you also need to take cache effects into account.
Sometimes when you run a timing test, a small file is read into memory
buffers, and subsequent timings are faster because the data is just
read from memory instead of disk, and similarly for writing.  With 11GB
files, you might not see any in-memory caching, because the system disk
caches aren't large enough to hold the file, or even consecutive chunks
of a variable.
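
A simple way to see this effect is to time the same read more than once. The
helper below is only a sketch with a hypothetical name; it reads a file
sequentially and reports the elapsed time of each pass. On a small file the
second pass will often be faster because the data is already in the system's
buffers, though that depends on the machine.

```python
import time

def time_reads(path, repeats=2, block_size=1 << 20):
    """Read the whole file `repeats` times; return elapsed seconds per pass."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, "rb") as f:
            # Read in fixed-size blocks until end of file.
            while f.read(block_size):
                pass
        timings.append(time.perf_counter() - start)
    return timings
```

For example, `time_reads("small.nc")` returns two timings; comparing only the
later passes across files gives a fairer test than comparing a cold first run
of one file with a warm run of another.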

> Anyways, I was able to rechunk the bigger test file to 
> time/10000,lon/10,lat/10 in 5 hours, which is still quite long but doable if 
> I go variable at a time. And you were right: reading in data chunked like 
> this is definitely fast enough. Maybe I will still try with bigger lengths 
> for lon/lat to see if I can do this in less than an hour.

That's good to hear, thanks for reporting back.  I imagine if we looked
carefully at where the time is being spent, that 5 hour rechunking could 
be reduced significantly, but it might require a smarter nccopy.


Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu

Ticket Details
Ticket ID: UAU-670796
Department: Support netCDF
Priority: High
Status: Closed