
Re: 20050302: netCDF General - Large file problem



>To: address@hidden
>From: "Jim Cowie" <address@hidden>
>Subject: netCDF General - Large file problem
>Organization: RAL
>Keywords: 200503022003.j22K3ZjW025634

Hi Jim,

> I recently discovered that the large netCDF files I have been
> producing are actually not readable past a certain point in the file,
> possibly at the 2GB point. The files are about 3.1GB and were created
> with netCDF 3.5.0 using the -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
> switches. I always assumed this was working because the program
> generating these files was able to get past 2GB without core dumping.
> Plus, ncdump does not say "File too large" anymore. However, when I
> run "ncdump -v variable", where variable is in the latter 1/3 of the
> file, no data is displayed, not even fill values. Also, using either
> the perl or C++ interface to read variables out of this file, I get a
> read error (a 0 return from get()) for variables in the latter part
> of the file.
> 
> I will include a CDL of the file below. As far as I can tell, the
> structure of these variables should allow a large file version,
> according to the restrictions on large files in the 3.5
> documentation. I realize that newer large file support is available
> in 3.6, but I'm not ready to go there yet. Thanks for any help,

Could you make one of the files available via FTP or HTTP?  I realize
it will take a long time to drag 3.1 GB across the network, but I
suspect the problem you are seeing is just the use of an old ncdump
binary that was built without large file support, and that the files
are actually fine.  It may be that you are seeing the same problem
with the perl and C++ interfaces: using old installed libraries not
built with large file support.  Even with version 3.5.0, it was
necessary to use the right compile flags to make sure the data beyond
2 GiB was accessible.
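
One quick way to check whether a compile environment really has large
file support is to look at the size of off_t under the same flags.
Here's a minimal sketch (the file name and build line are just
examples):

  /* lfs_check.c: verify large file support in the compile environment.
   * Build with the same flags used for the netCDF library, e.g.
   *   cc -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE lfs_check.c -o lfs_check
   */
  #include <stdio.h>
  #include <sys/types.h>

  int main(void)
  {
      /* With large file support, off_t is 8 bytes even on 32-bit
       * platforms; without it, it is typically only 4 bytes. */
      printf("sizeof(off_t) = %d bytes: large files %s\n",
             (int) sizeof(off_t),
             sizeof(off_t) >= 8 ? "supported" : "NOT supported");
      return 0;
  }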

You could also test this yourself by using an ncdump built on a 64-bit
platform, or an ncdump on a 32-bit platform compiled with large file
support.  It doesn't have to be built from new sources; I think even
the 3.5.0 ncdump worked if the compile environment supported large
files.  But I could tell for sure with a new ncdump.
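
Another independent test is a small C program that reads a single
value from one of the affected variables through a library built with
large file support.  Something like this sketch (untested; it assumes
the six dimensions of cprob_snow from your CDL below):

  /* read_one.c: try to read one value from a variable whose data lies
   * beyond 2 GiB in the file.  Link against a netCDF library built
   * with -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE.
   */
  #include <stdio.h>
  #include <stdlib.h>
  #include <netcdf.h>

  static void check(int status)
  {
      if (status != NC_NOERR) {
          fprintf(stderr, "%s\n", nc_strerror(status));
          exit(1);
      }
  }

  int main(int argc, char **argv)
  {
      int ncid, varid;
      /* cprob_snow has 6 dimensions; index {0,...,0} reads its first
       * element, which should already lie past the 2 GiB point given
       * the variables that precede it in the file. */
      size_t index[6] = {0, 0, 0, 0, 0, 0};
      float val;

      if (argc != 2) {
          fprintf(stderr, "usage: %s file.nc\n", argv[0]);
          return 1;
      }
      check(nc_open(argv[1], NC_NOWRITE, &ncid));
      check(nc_inq_varid(ncid, "cprob_snow", &varid));
      check(nc_get_var1_float(ncid, varid, index, &val));
      printf("cprob_snow[0,0,0,0,0,0] = %g\n", val);
      check(nc_close(ncid));
      return 0;
  }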

Alternatively, if you have one of the platforms for which we've built
binaries, you could extract the ncdump executable out of one of those
and try it:

  http://my.unidata.ucar.edu/content/software/netcdf/binaries.html

If you do, use one of the binaries built after 3.5.1; I don't think
the 3.5.1 and earlier binaries were necessarily built with large file
support.

> Here is the actual ncdump output so you can see what happens when I
> try to dump a variable in the latter part of the file. Admittedly,
> these variables have lots of dimensions; could that be part of the
> problem?

No, as far as I know there is no problem with using lots of dimensions.

> scrappy:cowie:44>~/bin/ncdump -v cprob_snow /d2/dicast/lt/mod_emp/gfs06_dmos_emp/20050224/gfs06_dmos_emp.20050224.1840.nc | more
> /home/cowie/bin/ncdump: Invalid argument
> netcdf gfs06_dmos_emp.20050224.1840 {
> dimensions:
>         max_site_num = 2300 ;
>         num_eqns = 30 ;
>         var_regressors = 3 ;
>         days = 16 ;
>         fc_times_per_day = 4 ;
>         daily_time = 1 ;
>         weight_vals = 4 ;
> variables:
>         int type ;
>                 type:long_name = "cdl file type" ;
>         double forc_time ;
>                 forc_time:long_name = "time of earliest forecast" ;
>                 forc_time:units = "seconds since 1970-1-1 00:00:00" ;
>         double creation_time ;
>                 creation_time:long_name = "time at which forecast file was created" ;
>                 creation_time:units = "seconds since 1970-1-1 00:00:00" ;
>         int num_sites ;
>                 num_sites:long_name = "number of actual_sites" ;
>         int site_list(max_site_num) ;
>                 site_list:long_name = "forecast site list" ;
>         float T(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 T:long_name = "temperature" ;
>                 T:units = "Celsius" ;
>         float max_T(max_site_num, days, daily_time, num_eqns,
>                 var_regressors, weight_vals) ;
>                 max_T:long_name = "maximum temperature" ;
>                 max_T:units = "Celsius" ;
>         float min_T(max_site_num, days, daily_time, num_eqns,
>                 var_regressors, weight_vals) ;
>                 min_T:long_name = "minimum temperature" ;
>                 min_T:units = "Celsius" ;
>         float dewpt(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 dewpt:long_name = "dewpoint" ;
>                 dewpt:units = "Celsius" ;
>         float wind_u(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 wind_u:long_name = "u-component of wind" ;
>                 wind_u:units = "meters per second" ;
>         float wind_v(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 wind_v:long_name = "v-component of wind" ;
>                 wind_v:units = "meters per second" ;
>         float wind_speed(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 wind_speed:long_name = "wind speed" ;
>                 wind_speed:units = "meters per second" ;
>         float cloud_cov(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 cloud_cov:long_name = "cloud cover" ;
>                 cloud_cov:units = "percent*100" ;
>         float visibility(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 visibility:long_name = "visibility" ;
>                 visibility:units = "km" ;
>         float prob_fog(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 prob_fog:long_name = "probability of fog" ;
>                 prob_fog:units = "percent*100" ;
>         float prob_thunder(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 prob_thunder:long_name = "probability of thunder" ;
>                 prob_thunder:units = "percent*100" ;
>         float cprob_rain(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 cprob_rain:long_name = "conditional probability of rain" ;
>                 cprob_rain:units = "percent*100" ;
>         float cprob_snow(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 cprob_snow:long_name = "conditional probability of snow" ;
>                 cprob_snow:units = "percent*100" ;
>         float cprob_ice(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 cprob_ice:long_name = "conditional probability of ice" ;
>                 cprob_ice:units = "percent*100" ;
>         float prob_precip06(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 prob_precip06:long_name = "probability of precipitation, 6 hr" ;
>                 prob_precip06:units = "percent*100" ;
>         float prob_precip24(max_site_num, days, daily_time, num_eqns,
>                 var_regressors, weight_vals) ;
>                 prob_precip24:long_name = "probability of precipitation, 24 hr" ;
>                 prob_precip24:units = "percent*100" ;
>         float qpf06(max_site_num, days, fc_times_per_day, num_eqns,
>                 var_regressors, weight_vals) ;
>                 qpf06:long_name = "amount of precipitation" ;
>                 qpf06:units = "mm" ;
> data:
> 
>  cprob_snow =
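
Incidentally, a quick size check from the dimensions above (my
arithmetic, assuming 4-byte floats) is consistent with the 3.1 GB you
reported:

  14 variables of shape (2300, 16, 4, 30, 3, 4):
      2300 * 16 * 4 * 30 * 3 * 4 * 4 bytes = 211,968,000 bytes each
   3 variables of shape (2300, 16, 1, 30, 3, 4):
      2300 * 16 * 1 * 30 * 3 * 4 * 4 bytes =  52,992,000 bytes each

  14 * 211,968,000 + 3 * 52,992,000 = 3,126,528,000 bytes, about 3.1 GB

So the variables in the latter third of the file do begin beyond
2 GiB, which is just where a reader built without large file support
would start to fail.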

There have also been some ncdump bug fixes since 3.5.0, so it's
possible you are running across an old bug.  I'll look into that if it
turns out there's actually nothing wrong with the files.

--Russ