
[netCDF #HCQ-327445]: Saving time of big arrays with different loops on dimensions



It is hard to figure out why c and d are so much slower without
seeing at least some form of pseudo-code of exactly what is being
performed.

Let me take a shot in the dark and suggest that you call the
nc_sync function (nf_sync in the Fortran 77 API, nf90_sync in the
Fortran 90 API) to force data to be pushed to the file. This may
change the memory usage patterns.

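Also, if this is a netcdf-4 file, then changing the chunking parameters
may have an effect. A write has to touch every chunk it overlaps, so
chunk shapes that line up with the time*z block written per grid cell
may help cases c and d.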
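
As a rough illustration of why chunk shape can matter, here is a small
Python sketch that only counts how many chunks each write pattern
overlaps. It makes no netCDF calls and measures no timing. The x, y and
z sizes (20, 20, 10) are assumptions, since only the 365 time steps are
given in the report; the two chunk shapes are hypothetical and are not
the library defaults; the function names are made up for this example.

```python
from itertools import product


def chunks_touched(start, count, chunk):
    """Count the chunks that one hyperslab write (start, count) overlaps."""
    touched = 1
    for s, c, k in zip(start, count, chunk):
        first = s // k            # index of the first chunk hit on this axis
        last = (s + c - 1) // k   # index of the last chunk hit on this axis
        touched *= last - first + 1
    return touched


def total_chunk_visits(dims, chunk, count):
    """Sum chunks_touched over all writes tiling the array with `count` slabs.

    Assumes each entry of `count` divides the matching entry of `dims`.
    """
    starts = product(*(range(0, d, c) for d, c in zip(dims, count)))
    return sum(chunks_touched(s, count, chunk) for s in starts)


# Assumed sizes, ordered (time, z, y, x); only time = 365 is from the report.
DIMS = (365, 10, 20, 20)
WRITE_A = (1, 10, 20, 20)   # pattern a: one x*y*z block per time step
WRITE_C = (365, 10, 1, 1)   # pattern c: one time*z block per grid cell

# Two hypothetical chunk shapes (not what the library would pick by default).
CHUNK_PER_STEP = (1, 10, 20, 20)
CHUNK_PER_CELL = (365, 10, 1, 1)

if __name__ == "__main__":
    for name, chunk in (("per-step", CHUNK_PER_STEP),
                        ("per-cell", CHUNK_PER_CELL)):
        a = total_chunk_visits(DIMS, chunk, WRITE_A)
        c = total_chunk_visits(DIMS, chunk, WRITE_C)
        print(f"{name} chunks: pattern a visits {a}, pattern c visits {c}")
```

Under these assumed sizes, per-step chunks give 365 chunk visits for
pattern a and 146000 for pattern c, while per-cell chunks give 146000
for pattern a and 400 for pattern c. In the C API the chunk shape of a
variable is set with nc_def_var_chunking before the data is written.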

> Full Name: Tobias Zolles
> Email Address: address@hidden
> Organization: University Bergen
> Package Version:
> Operating System:
> Hardware:
> Description of problem:
> 
> I have a netCDF file with multiple variables that have 4 dimensions:
> x, y, z and time. The model calculates each horizontal grid point over a
> whole year, so it first produces a z*time matrix. I would like to save
> this for each grid cell to the netCDF file instead of first creating a
> whole x*y*z*time array that I save once. But I find that this is
> three orders of magnitude slower than the old approach of just
> creating the big array and saving it at once. The old approach won't be
> applicable in the future due to memory constraints.
> 
> I did some testing: I created an x*y*z*time array and then saved
> subsets of the data in four ways:
> 
> a) a loop over time, saving the x*y*z array 365 times
> b) a loop over time and z, saving the x*y array
> c) a loop over all horizontal grid points, saving a time*z
> array (analogous to what we really want)
> d) a loop over all points x,y,z, saving a vector
> along the time dimension
> 
> a and b are fast and c and d are about 3-4 orders of magnitude slower.
> 
> What could be the cause? I found some information
> (https://www.unidata.ucar.edu/support/help/MailArchives/netcdf/msg10905.html)
> which didn't help. I removed the NF_unlimited statement of my time
> dimension/variable; it now has a fixed length of 365. But that didn't
> change anything performance-wise. It feels as if there is a certain
> hierarchy in my variables.
> 
> I attached the Fortran code of the subroutine that initializes and writes
> the netCDF files (for simplicity I removed the "normal non-dimension"
> variables) and pseudocode of my writing function.
> 
> I am using netCDF-4. An unlimited time dimension could be beneficial,
> but at the current state with daily data it is not necessary.
> 
> 

=Dennis Heimbigner
  Unidata


Ticket Details
===================
Ticket ID: HCQ-327445
Department: Support netCDF
Priority: Normal
Status: Open
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata 
inquiry tracking system and then made publicly available through the web.  If 
you do not want to have your interactions made available in this way, you must 
let us know in each email you send to us.