
Re: Performance problem with large files

Martin Dix wrote:
> 
> hinsen@xxxxxxxxxxxxxxxxxxxxx writes:
> 
>  > ... The data in the files is essentially
>  > one single-precision float array of dimensions 8000 x 3 x 16000, the
>  > last dimension being declared as "unlimited". I read and write
>  > subarrays of shape 1 x 3 x 16000. ...
> 
> For simplicity call the unlimited dimension t. A netcdf file stores
> all the data for t=1, then for t=2 etc. Your description of the
> array indices means that each subarray is scattered through the
> entire file and requires accessing almost every file block. Things
> should be a lot better if you write subarrays of 8000 x 3 x 1 or if
> you can't do this, rearrange the file so that the 8000 dimension is
> unlimited rather than the 16000 dimension.
> 

Whenever you write data that uses the unlimited dimension, the data is not
written as one contiguous block. It is interleaved record by record.

e.g.
DATA1 1,2,3,4,5
DATA2 10,20,30,40,50

result in this layout in the file:

1,10,2,20,3,30,4,40,5,50


If you now read DATA1, the whole file must be read.
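
As a rough illustration only (plain Python lists, no netCDF library; the
function name record_layout is made up for this sketch), the interleaving
above can be modelled like this:

```python
def record_layout(variables):
    """Interleave equally long variables one record at a time.

    Toy model of how values along the unlimited dimension end up
    in the file; it ignores the file header and any padding.
    """
    layout = []
    for record in zip(*variables):
        layout.extend(record)
    return layout


data1 = [1, 2, 3, 4, 5]
data2 = [10, 20, 30, 40, 50]

interleaved = record_layout([data1, data2])
# Where the values of DATA1 sit in the modelled file
positions = [interleaved.index(v) for v in data1]

print(interleaved)  # [1, 10, 2, 20, 3, 30, 4, 40, 5, 50]
print(positions)    # [0, 2, 4, 6, 8]
```

The values of DATA1 are spread over the full length of the layout, which
is why reading it touches the whole file.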

In some cases this makes reading much slower than it would be with
limited (fixed-size) dimensions.

If you use limited dimensions, the layout in the file is
1,2,3,4,5,10,20,30,40,50


Then only a small part of the file has to be read to get DATA1.
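
To put numbers on this for the array from the original question
(8000 x 3 x 16000 single-precision floats, the 16000 dimension unlimited),
here is a small sketch that counts how many separate contiguous byte
ranges a read has to touch. It is only a model: the file header is
ignored, and the ordering inside one record (8000 index varying slowest,
3 index fastest) is an assumption. The helper names are made up for
this example.

```python
FLOAT_BYTES = 4          # single-precision float
N_A, N_B, N_T = 8000, 3, 16000   # N_T is the unlimited dimension


def offset(t, a, b):
    """Byte offset of element (a, b) in record t, header ignored."""
    return ((t * N_A + a) * N_B + b) * FLOAT_BYTES


def contiguous_runs(offsets):
    """Count separate contiguous byte ranges in an ascending offset list."""
    runs = 0
    prev_end = None
    for off in offsets:
        if off != prev_end:
            runs += 1
        prev_end = off + FLOAT_BYTES
    return runs


# Subarray 1 x 3 x 16000: one fixed index along the 8000 dimension,
# all 16000 records.
scattered = [offset(t, 0, b) for t in range(N_T) for b in range(N_B)]

# Subarray 8000 x 3 x 1: one complete record.
one_record = [offset(0, a, b) for a in range(N_A) for b in range(N_B)]

runs_scattered = contiguous_runs(scattered)
runs_record = contiguous_runs(one_record)

print(runs_scattered)  # 16000 pieces of 12 bytes, 96000 bytes apart
print(runs_record)     # 1 piece of 96000 bytes
```

Both reads return data, but the first one needs 16000 small, widely
separated accesses, while the second is a single sequential read.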

regards
Reimar






> Martin Dix
> 
> CSIRO Atmospheric Research                Phone: +61 3 9239 4533
> Private Bag No. 1, Aspendale                Fax: +61 3 9239 4444
> Victoria 3195  Australia                  Email: martin.dix@xxxxxxxxxxxx

-- 
Reimar Bauer 

Institut fuer Stratosphaerische Chemie (ICG-1)
Forschungszentrum Juelich
email: R.Bauer@xxxxxxxxxxxxx
http://www.fz-juelich.de/icg/icg1/
=================================================================
an IDL library at ForschungsZentrum Juelich
http://www.fz-juelich.de/icg/icg1/idl_icglib/idl_lib_intro.html

http://www.fz-juelich.de/zb/text/publikation/juel3786.html
=================================================================

read something about linux / windows
http://www.suse.de/de/news/hotnews/MS.html
