
[Support #CUV-251255]: Nccopy extremely slow / hangs



Hi Mark,

> Organization: DTU Aqua
> Package Version: 4.1.2
> Operating System: Mandriva
> Hardware: 4 Core, 8GB RAM
> 
> Hi,
> 
> I have a netCDF-4 file with the following structure:
> 
> dimensions:
> latitude = 1617 ;
> longitude = 1596 ;
> time = UNLIMITED ; // (1698 currently)
> variables:
> float chl_oc5(time, latitude, longitude) ;
> <....snip...>
> chl_oc5:_ChunkSizes = 1, 1031, 1017 ;
> chl_oc5:_DeflateLevel = 6 ;
> chl_oc5:_Shuffle = "true" ;
> 
> I am mainly interested in accessing the data along the time dimension, i.e.
> a time series at a given lon, lat pixel. In the current configuration,
> extracting a single time series is extremely slow, so I would like
> to rearrange the internal chunking so that the file is optimised for
> reading in this dimension, i.e. chl_oc5:_ChunkSizes = 1698, 1, 1 ; or something
> similar. I am trying to use nccopy to do this, as follows:
> 
> nccopy -u -k3 -d1 -m4g -c time/1698,longitude/6,latitude/7 foo.nc bar.nc
> 
> Nccopy runs, creates the file bar.nc, and I can see the size of it
> increase, rapidly at first, as data is filled in. However, the rate of
> filling slows continuously, and ultimately it stalls and the file size
> remains static. Nccopy, however, is still present in memory, and still
> working (100% CPU, 41% MEM) - it also still has the file open -
> the time stamp is updated continuously. This remains the case if I leave
> it to run overnight - foo.nc is about 600MB, and after nearly 24 hours
> of running, bar.nc is still only 200MB. The level at which the filesize plateaus
> is a function of the chunking (and maybe compression?) settings that I
> choose, but I'm not sure what the relationship is. Most other operations
> on this file are complete in 15 mins (e.g. nccopy -u, nccopy -d0 etc)
> 
> Do you have any suggestions as to what is going wrong?

This sounds like a bug that we'd like to duplicate and fix.  It may have
something to do with a need to set chunk cache sizes appropriately for
reshaping the data with nccopy.  Could you please either provide access
to the original file (about 600 MB) so we can reproduce the problem, or
alternatively send the complete output of "ncdump -c foo.nc" so we can
create a test file that will accurately reproduce the performance bug
you have reported?

In the meantime, I'll study how setting chunk cache sizes to something
other than the defaults might help, and report back.  If it's something
nccopy can do automatically, then I'll fix the code to do that.
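
To give a rough sense of why the chunk cache might matter here, the
following back-of-the-envelope sketch in Python works only from the
numbers in the report above. It assumes (as an illustration, not as a
description of what nccopy actually does internally) that the copy
fills one output chunk at a time and that any input chunk evicted from
the cache has to be read and decompressed again. The function names are
made up for this example.

```python
import math

# Shapes copied from the report above; a float occupies 4 bytes.
dims = {"time": 1698, "latitude": 1617, "longitude": 1596}
in_chunk = {"time": 1, "latitude": 1031, "longitude": 1017}   # _ChunkSizes of foo.nc
out_chunk = {"time": 1698, "latitude": 7, "longitude": 6}     # requested with -c
BYTES_PER_VALUE = 4


def chunk_bytes(chunk):
    """Uncompressed size of one chunk, in bytes."""
    return math.prod(chunk.values()) * BYTES_PER_VALUE


def chunk_count(chunk):
    """Number of chunks needed to tile the whole variable."""
    return math.prod(math.ceil(dims[d] / chunk[d]) for d in dims)


def min_input_chunks_per_output_chunk():
    """Lower bound on the input chunks overlapped by one output chunk.

    Along each dimension an output chunk of length n covers at least
    ceil(n / input_length) input chunks (more if it straddles a boundary).
    """
    return math.prod(math.ceil(out_chunk[d] / in_chunk[d]) for d in dims)


def working_set_bytes():
    """Uncompressed bytes needed to hold every input chunk that a single
    output chunk touches (using the lower bound above)."""
    return min_input_chunks_per_output_chunk() * chunk_bytes(in_chunk)


print("input chunk size (bytes):      ", chunk_bytes(in_chunk))
print("input chunks in file:          ", chunk_count(in_chunk))
print("output chunks in file:         ", chunk_count(out_chunk))
print("input chunks per output chunk: ", min_input_chunks_per_output_chunk())
print("working set (GB):              ", round(working_set_bytes() / 1e9, 2))

# Assumed worst case: nothing is ever reused from the cache, so every
# output chunk triggers a fresh read of each input chunk it overlaps.
worst_case_reads = chunk_count(out_chunk) * min_input_chunks_per_output_chunk()
print("worst-case input chunk reads:  ", worst_case_reads)
```

Under those assumptions, each output chunk spans all 1698 time steps and
so overlaps at least 1698 input chunks of about 4 MB each, roughly 7 GB
uncompressed in total. That is close to the 8 GB of RAM on the machine,
so a cache that cannot hold this working set could force the same input
chunks to be decompressed over and over. The same arithmetic applies to
reading a single time series from foo.nc as it is chunked now.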

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: CUV-251255
Department: Support netCDF
Priority: Normal
Status: Closed