
[netCDF #CPE-163785]: file size does not change



Jailin,

> Thank you so much,
> It's a classic file.
> With ncdump -h -s, I didn't see the chunking info.
> Do I have to specify the chunk sizes and shapes before writing a large file?
> And how do I do that?

Classic format files don't support chunking, or the compression for which
chunking is required.  You can read about the differences between netCDF
formats in this FAQ:

  How many netCDF formats are there, and what are the differences among them?
  http://www.unidata.ucar.edu/netcdf/docs/faq.html#formats-and-versions
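
If you do want chunking (for example, to enable compression), the file has
to be created in netCDF-4 format, and the chunk sizes and shapes are set
per variable while the file is still in define mode.  Here's a minimal
sketch using the C API; the file name, dimension sizes, and chunk shape
are just illustrative, not a recommendation for your data:

  #include <stdio.h>
  #include <stdlib.h>
  #include <netcdf.h>

  #define CHECK(s) do { int _e = (s); if (_e != NC_NOERR) { \
      fprintf(stderr, "%s\n", nc_strerror(_e)); exit(1); } } while (0)

  int main(void) {
      int ncid, varid, dimids[4];
      /* One chunk per time step: 1 x 100 x 100 x 100 float values. */
      size_t chunks[4] = {1, 100, 100, 100};

      CHECK(nc_create("chunked.nc", NC_CLOBBER | NC_NETCDF4, &ncid));
      CHECK(nc_def_dim(ncid, "time", NC_UNLIMITED, &dimids[0]));
      CHECK(nc_def_dim(ncid, "level", 100, &dimids[1]));
      CHECK(nc_def_dim(ncid, "latitude", 100, &dimids[2]));
      CHECK(nc_def_dim(ncid, "longitude", 100, &dimids[3]));
      CHECK(nc_def_var(ncid, "pressure", NC_FLOAT, 4, dimids, &varid));
      /* Chunking must be set after nc_def_var() and before nc_enddef(). */
      CHECK(nc_def_var_chunking(ncid, varid, NC_CHUNKED, chunks));
      CHECK(nc_enddef(ncid));
      /* ... write data one record at a time with nc_put_vara_float() ... */
      CHECK(nc_close(ncid));
      return 0;
  }

With a chunk length of 1 along the unlimited dimension, each appended time
step adds exactly one new chunk per variable, so the file grows steadily
as you append.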

> I created a new file and reproduced the data; it's OK now.
> Here is the info I get:

OK, that only looks like about 112 GBytes to me, from 4 bytes per float
value and 14000*100*100*100 values for each of the 2 variables.
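
Spelled out, that's

  2 variables * 14000 records * (100 * 100 * 100) values * 4 bytes/value
    = 112,000,000,000 bytes

or 112 GB, about 104 GiB in the powers-of-1024 units that "ls -h" reports
(plus a negligible amount of header and coordinate-variable data).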

Did you actually write all the data one time slice at a time, watching the
size of the file grow?  Or did you write a slice (or all) of one of the
variables and then watch the file size not grow as you wrote the other
variable?  The structure of the netCDF classic file would be 14000
"records", each containing the 1*100*100*100 values for one time of each
of the variables.  Each time you write any values for a record, all the
space for that record is allocated on disk, so you wouldn't see any growth
while writing the second variable if you had already written the first
variable.
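
Here's a small sketch in C that shows the record allocation, scaled down
so it's quick to try; the names and sizes are illustrative, not taken from
your file.  After the one nc_put_vara_float() call below, the file already
holds the whole record, so writing "temperature" for the same record
afterwards would not grow it:

  #include <stdio.h>
  #include <stdlib.h>
  #include <netcdf.h>

  #define CHECK(s) do { int _e = (s); if (_e != NC_NOERR) { \
      fprintf(stderr, "%s\n", nc_strerror(_e)); exit(1); } } while (0)

  int main(void) {
      int ncid, dims[4], pres_id, temp_id;
      static float slice[10][10][10];        /* zeros, just for the demo */
      size_t start[4] = {0, 0, 0, 0};
      size_t count[4] = {1, 10, 10, 10};

      CHECK(nc_create("records.nc", NC_CLOBBER, &ncid)); /* classic format */
      CHECK(nc_def_dim(ncid, "time", NC_UNLIMITED, &dims[0]));
      CHECK(nc_def_dim(ncid, "level", 10, &dims[1]));
      CHECK(nc_def_dim(ncid, "latitude", 10, &dims[2]));
      CHECK(nc_def_dim(ncid, "longitude", 10, &dims[3]));
      CHECK(nc_def_var(ncid, "pressure", NC_FLOAT, 4, dims, &pres_id));
      CHECK(nc_def_var(ncid, "temperature", NC_FLOAT, 4, dims, &temp_id));
      CHECK(nc_enddef(ncid));

      /* With the library's default prefill, writing record 0 of "pressure"
       * allocates record 0 for BOTH record variables.  Check the file size
       * with ls now, then write record 0 of "temperature" and check again:
       * it won't grow. */
      CHECK(nc_put_vara_float(ncid, pres_id, start, count, &slice[0][0][0]));
      CHECK(nc_close(ncid));
      return 0;
  }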

For pictures of the structure, see this from the online workshop:

  http://www.unidata.ucar.edu/netcdf/workshops/2012/performance/ClassicParts.html

--Russ
> /netcdf/bin$ ./ncdump -h -s /lustre/scratch/sds/ca1.nc
> netcdf ca1 {
> dimensions:
>     level = 100 ;
>     latitude = 100 ;
>     longitude = 100 ;
>     time = UNLIMITED ; // (14000 currently)
> variables:
>     float latitude(latitude) ;
>         latitude:units = "degrees_north" ;
>     float longitude(longitude) ;
>         longitude:units = "degrees_east" ;
>     float pressure(time, level, latitude, longitude) ;
>         pressure:units = "hPa" ;
>     float temperature(time, level, latitude, longitude) ;
>         temperature:units = "celsius" ;
> 
> // global attributes:
>     :_Format = "classic" ;
> }
> 
> Thanks,
> 
> Jailin
> 
> 
> address@hidden> wrote:
> 
> > Hi Jailin,
> >
> > > I created a large 4D netcdf file, and append the data along the time
> > > dimension (unlimited),
> > > but the file size information (by a ls -al -h) doesn't change as more
> > data
> > > are appended.
> > > it shows 155G.
> > > Anyone knows the reason?
> >
> > Is this a netCDF-4 file or a netCDF-3 classic format file?  To determine
> > the format,
> > just look at the output from running
> >
> >   $ ncdump -k filename.nc
> >
> > If it's a netCDF-4 file (or a netCDF-4 classic model file), then using an
> > unlimited dimension requires chunking, and it's possible to specify chunk
> > shapes so that only one chunk is appended to the file for a large amount
> > of data appended along the time dimension, as the chunk fills in.  To see
> > whether this is the case, it would be useful to see the chunk sizes and
> > shapes, by providing the output from running
> >
> >   $ ncdump -h -s filename.nc
> >
> > and looking at the "_ChunkSizes" attributes.  _ChunkSizes is a list of
> > chunk sizes for each dimension of a variable.
> >
> > If it's a netCDF-3 file, it's possible to write the data values out of
> > order, writing data for a large value of the time dimension first, and
> > then appending values along the time dimension starting with earlier
> > times, in which case the file size would start out large, but just fill
> > in earlier records as earlier times are written.
> >
> > Do you have a small program that demonstrates the behavior?  It would
> > be easier to reproduce if you could demonstrate it with a file smaller
> > than 155 GBytes :-).
> >
> > --Russ
> >
> >
> >
> > Russ Rew                                         UCAR Unidata Program
> > address@hidden                      http://www.unidata.ucar.edu
> >
> >
> >
> > Ticket Details
> > ===================
> > Ticket ID: CPE-163785
> > Department: Support netCDF
> > Priority: Normal
> > Status: Closed
> >
> >
> 
> 
> --
> 
> Genius only means hard-working all one's life
> 
> 
Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: CPE-163785
Department: Support netCDF
Priority: Normal
Status: Closed