
[netCDF #YYU-156317]: list index / stride reads very slow



This is a known performance problem with index-list and strided
reads in some older versions of netCDF.
Read speed for these access patterns was improved starting with
version 4.6.2. Please upgrade to that version or later and let us
know whether it gives you adequate performance.
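
If upgrading is not possible right away, a workaround that may help is to
avoid the list index and issue one plain integer read per time step, then
combine the results in numpy. Your own timings show data[0] and data[3]
each returning in well under a second. The following is only a sketch
under assumptions: read_indices is a hypothetical helper name, not part of
netCDF4-python, and a small in-memory array stands in for
nc4file.variables["data"].

```python
import numpy as np


def read_indices(var, indices):
    """Read selected indices along the first axis, one request per index.

    `var` can be any object that supports integer indexing on its first
    axis, e.g. a netCDF4 Variable or a numpy array.
    """
    # Each var[i] is a plain integer index, so no list-index read is issued.
    # np.asarray discards any mask; this sketch assumes no missing values.
    slabs = [np.asarray(var[i]) for i in indices]
    # Stack the 2-D slabs back into one array with a leading axis.
    return np.stack(slabs, axis=0)


# Stand-in for nc4file.variables["data"] (assumption: tiny demo array).
demo = np.arange(5 * 4 * 4, dtype=np.float32).reshape(5, 4, 4)
subset = read_indices(demo, [0, 3])
mean_value = np.mean(subset)
```

With the real file, np.mean(read_indices(data, [0, 3])) would replace
np.mean(data[[0, 3]]). To confirm which C library version the Python
module is linked against, netCDF4.__netcdf4libversion__ can be printed.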

> Package Version: netcdf-4.3.3.1-5.el7.x86_64
> Operating System: centos 7
> Hardware: VM
> Description of problem: The following python program is very slow using 
> netcdf-4.3.3.1 and ok with netcdf-4.1.1-3
> 
> import numpy as np
> import datetime
> from netCDF4 import Dataset
> 
> nc4file     = Dataset('/net/satarch/Sjaak/test.hdf5','r')
> data = nc4file.variables["data"]
> 
> data.shape
> st = datetime.datetime.now()
> print np.mean(data[0])
> print datetime.datetime.now() - st
> st = datetime.datetime.now()
> print np.mean(data[3])
> print datetime.datetime.now() - st
> st = datetime.datetime.now()
> print np.mean(data[[0,3]])
> print datetime.datetime.now() - st
> 
> netcdf-4.1.1-3 (centos 6.7)
> ===============
> >>> import numpy as np
> >>> import datetime
> >>> from netCDF4 import Dataset
> >>> nc4file     = Dataset('/net/satarch/CommonSense/DataCubes/Test/MODIS500/base-NDVI-out2.hdf5','r')
> >>> data = nc4file.variables["data"]
> >>> data.shape
> (744, 4000, 4000)
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[0])
> 499999.473664
> >>> print datetime.datetime.now() - st
> 0:00:00.583850
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[3])
> 3499999.36307
> >>> print datetime.datetime.now() - st
> 0:00:00.590855
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[[0,3]])
> 2000000.5161
> >>> print datetime.datetime.now() - st
> 0:00:02.076450
> 
> netcdf-4.3.3.1 (centos 7)
> ===============
> >>> import numpy as np
> >>> import datetime
> >>> from netCDF4 import Dataset
> >>> nc4file     = Dataset('/net/satarch/Sjaak/test.hdf5','r')
> >>> data = nc4file.variables["data"]
> >>> data.shape
> (744, 4000, 4000)
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[0])
> 499999.473664
> >>> print datetime.datetime.now() - st
> 0:00:00.415814
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[3])
> 3499999.36307
> >>> print datetime.datetime.now() - st
> 0:00:00.401048
> >>> st = datetime.datetime.now()
> >>> print np.mean(data[[0,3]])
> 
> (stopped after waiting for more than 2 minutes)
> 
> ncdump
> =======
> ncdump  -sh /net/satarch/Sjaak/test.hdf5
> netcdf test {
> dimensions:
> time = 744 ;
> latitude = 4000 ;
> longitude = 4000 ;
> variables:
> int time(time) ;
> time:_Storage = "contiguous" ;
> time:_Endianness = "little" ;
> int latitude(latitude) ;
> latitude:_Storage = "contiguous" ;
> latitude:_Endianness = "little" ;
> int longitude(longitude) ;
> longitude:_Storage = "contiguous" ;
> longitude:_Endianness = "little" ;
> float data(time, latitude, longitude) ;
> data:_Storage = "chunked" ;
> data:_ChunkSizes = 1, 200, 200 ;
> 
> // global attributes:
> :_Format = "netCDF-4" ;
> }
> 
> 
> 
> 

=Dennis Heimbigner
  Unidata


Ticket Details
===================
Ticket ID: YYU-156317
Department: Support netCDF
Priority: Critical
Status: Open
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.