
[netCDF #UAU-670796]: Rechunking of a huge NetCDF file

> Hi Russ,
> >> I did make some interesting observations. I had previously overlooked the 
> >> "-u" flag (its documentation is somewhat confusing...?). The time 
> >> coordinate has been unlimited in my files. On my Macbook Air:
> >>
> >> nccopy -w -c time/99351,lat/1,lon/1 small.nc test1.nc  11.59s user 0.07s 
> >> system 99% cpu 11.723 total
> >>
> >> nccopy -u small.nc small_u.nc
> >>
> >> nccopy -w -c time/99351,lon/1,lat/1 small_u.nc test2.nc  0.07s user 0.04s 
> >> system 84% cpu 0.127 total
> >>
> >> That's amazing!
> >
> > It's because we use the same default chunk length of 1 as HDF5
> > does for unlimited dimensions.  But when you use -u, it makes
> > all dimensions fixed, and then the default chunk length is larger.
> But both small.nc and small_u.nc are classic netCDF files. So no HDF5 
> relations at all...?

Oops, you're right, it has nothing to do with HDF5, but everything to do 
with the format for record variables in netCDF classic format files:


Accessing all the data from a record variable along the unlimited 
dimension can require one disk access per value, whereas using the
contiguous storage of fixed-size variables accesses data very
efficiently.  On the other hand, if you need to access data from 
multiple variables one record-at-a-time, record variables can be
the best layout for data.
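
As a rough illustration of the layout difference described above, here is a
minimal Python sketch of where consecutive values of a 1-D variable such as
time(time) would land in a classic-format file. It ignores the file header and
assumes every per-record slab is already a multiple of 4 bytes, so no padding
is needed. The variable sizes and the function names are hypothetical, chosen
only for this example.

```python
def record_offsets(slab_bytes, var_index, n_records):
    """Offsets (relative to the start of the record data) of one record
    variable's slab in each record. slab_bytes lists the per-record size
    in bytes of every record variable, in file order."""
    recsize = sum(slab_bytes)                # all record variables are interleaved
    start = sum(slab_bytes[:var_index])      # position of this variable inside a record
    return [start + r * recsize for r in range(n_records)]


def fixed_offsets(elem_bytes, n_values):
    """Offsets (relative to the variable's start) of consecutive values of a
    1-D fixed-size variable, which is stored contiguously."""
    return [k * elem_bytes for k in range(n_values)]


# Hypothetical file: time(time) stored as 8-byte doubles, plus one 4-byte
# float variable on a 10 x 20 lat/lon grid.
slabs = [8, 10 * 20 * 4]

rec = record_offsets(slabs, 0, 4)   # time values when time is unlimited
fix = fixed_offsets(8, 4)           # time values when time is fixed-size

print(rec)  # [0, 808, 1616, 2424]
print(fix)  # [0, 8, 16, 24]
```

With the unlimited dimension, successive time values sit a whole record apart,
so reading them all touches the file once per record. With a fixed-size
dimension they are adjacent and can be read in one pass.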

> >> However, when I ran a similar test with a bigger (11GB) subset of my 
> >> actual data, this time on a cluster (under SLURM), there was no difference 
> >> between the two files. Maybe my small.nc is simply too small to reveal 
> >> actual differences and everything is hidden behind overheads?
> >
> > That's possible, but you also need to take cache effects into account.
> > Sometimes when you run a timing test, a small file is read into memory
> > buffers, and subsequent timings are faster because the data is just
> > read from memory instead of disk, and similarly for writing.  With 11GB
> > files, you might not see any in-memory caching, because the system disk
> > caches aren't large enough to hold the file, or even consecutive chunks
> > of a variable.
> My non-Python timings were naively from the "time" command, which 
> performs the command just once. So I don't think there can be any cache 
> effects here.

Hmmm, not sure how to explain that.
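
One way to check whether caching plays a role is to repeat the same command
several times and compare the first run with the later ones. The following is
only a sketch: time_runs is a hypothetical helper, and the command it times
here is a stand-in so the example runs anywhere.

```python
import subprocess
import sys
import time


def time_runs(cmd, repeats=3):
    """Run cmd `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        timings.append(time.perf_counter() - start)
    return timings


# Stand-in command; replace it with the command under test, for example
# ["nccopy", "-w", "-c", "time/99351,lat/1,lon/1", "small.nc", "test1.nc"]
timings = time_runs([sys.executable, "-c", "pass"])
for i, t in enumerate(timings, start=1):
    print(f"run {i}: {t:.3f} s")
```

If the first run is much slower than the rest, the later runs were probably
served from memory buffers rather than from disk.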

> I'm not sure what I did differently previously with the 11GB test file 
> (maybe a cluster with hundreds of users is not the best for performance 
> comparison). Anyways, I do think that the -u flag solved my problem. I got 
> fed up with queuing for resources on the cluster and decided to go with a 
> normal desktop machine with 16GB of memory. So I stripped a single variable 
> from the huge file and did the -u operation on the resulting 43GB file, and 
> then ran this:
> nccopy -m 10G -c time/10000,lat/10,lon/10 shortwave_u.nc shortwave_u_T10.nc
> It took only 15 minutes! Without the -u operation the command processed only 
> a few GB in 1 hour (after which I cancelled it).
> 2/5 variables done now. If no further technical problems arise, I should have 
> the data ready for their actual purpose tomorrow. :)

Excellent, and your experience may be useful to other users.  I'll add use
of "-u" to my future performance advice.
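
For anyone who wants to script the two-step approach described above, a
minimal Python sketch follows. It only builds the two nccopy command lines
(first "-u" to make the unlimited dimension fixed-size, then "-m" and "-c" to
rechunk) and runs them on request. The helper names and the input file name
shortwave.nc are hypothetical; the other file names, the chunk sizes and the
10G copy buffer are taken from the message above.

```python
import shutil
import subprocess


def rechunk_commands(src, fixed, dst, chunks, copy_buffer="10G"):
    """Return the two nccopy invocations: convert the unlimited dimension to
    fixed size, then rechunk the result."""
    chunkspec = ",".join(f"{dim}/{n}" for dim, n in chunks.items())
    return [
        ["nccopy", "-u", src, fixed],
        ["nccopy", "-m", copy_buffer, "-c", chunkspec, fixed, dst],
    ]


def run_all(commands):
    """Run the commands in order, stopping at the first failure."""
    if shutil.which("nccopy") is None:
        raise RuntimeError("nccopy not found on PATH")
    for cmd in commands:
        subprocess.run(cmd, check=True)


cmds = rechunk_commands(
    "shortwave.nc",            # hypothetical name of the single-variable file
    "shortwave_u.nc",
    "shortwave_u_T10.nc",
    {"time": 10000, "lat": 10, "lon": 10},
)
for cmd in cmds:
    print(" ".join(cmd))
# run_all(cmds)  # uncomment to execute; requires nccopy and the input file
```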

> Thank you for your help! I will acknowledge you/Unidata in my paper (any 
> preference?).

Feel free to acknowledge me.  Thanks!


> - Henri
Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu

Ticket Details
Ticket ID: UAU-670796
Department: Support netCDF
Priority: High
Status: Closed