Re: [netcdfgroup] bm_file...but for netCDF-4?

I believe the netCDF Operators (NCO) have a way of trying different chunk
sizes. bm_file is hard to use, but you can look in the code to see how it
interprets the -c option (I believe it is varid : shuffle : deflate :
chunksizes).

The weird metadata you are showing is not a problem. Those are "hidden"
attributes that netCDF uses behind the scenes.

When you create a netCDF-4 file with NC_CLASSIC_MODEL in the create mode, it
creates a special hidden attribute (_nc3_strict) that tells netCDF not to let
the user use any enhanced-model features on that file. That's what bm_file is
checking for. You can just comment out the check and recompile to try a file
that was not created with NC_CLASSIC_MODEL.

Probably you have already tried setting the deflate level. Trying different
chunksizes with your data is the best way to see how it's impacting
performance. (If you have an unlimited dimension, and are accepting the
default chunking, that can be very slow. Try explicitly setting the
chunking.)

Ed Hartnett


On Thu, Oct 22, 2020 at 1:18 PM Thompson, Matt (GSFC-610.1)[SCIENCE SYSTEMS
AND APPLICATIONS INC] via netcdfgroup <netcdfgroup@xxxxxxxxxxxxxxxx> wrote:

> All,
>
> A simple question. We've recently been encountering some rather slow
> compression speeds with netCDF files and we are trying to figure out what
> might be the cause. We aren't sure if it's our disks, or maybe we are
> chunking the files incorrectly, or...whatever, so I looked around and found
> this page:
>
> https://www.unidata.ucar.edu/software/netcdf/docs/netcdf_perf_chunking.html
>
> and the bm_file program. I whipped up a netCDF-C with --enable-benchmarks
> and built it and tried a test with a "not sure how to use it" "-c" option
> (still puzzling over what it means) on a file of mine and:
>
> $ bm_file -v -h -d -f 3 -o test.nc4 -c 0:-1:0:4:4:4
> stock-gcm-2020Oct15-1day-c180.geosgcm_prog.20150415_1800z.nc4
>
> copying stock-gcm-2020Oct15-1day-c180.geosgcm_prog.20150415_1800z.nc4 to
> test.nc4 on 1 processors with endianness 0 and...
> slow_count=10, doublecheck=1
> 0: reading metadata took 122104 micro-seconds.
> Sorry! Unexpected result, bm_file.c, line: 506 - NetCDF: Attempting
> netcdf-3 operation on netcdf-4 file
>
> And, yeah, if you look at the source:
>
>     /* Only classic model files may be used as input. */
>     if ((ret = nc_inq_format(ncid_in, in_format)))
>         ERR1(ret);
>     if (*in_format == NC_FORMAT_NETCDF4)
>         ERR1(NC_ENOTNC3);
>
> but the top also says:
>
>   This program only works on classic model netCDF files. That is,
>   groups, user-defined types, and other new netCDF-4 features are not
>   handled by this program. (Input files may be in netCDF-4 format, but
>   they must conform to the classic model for this program to work.)
>
> Now the file I'm trying this on is pretty boring. We don't do much
> exciting in like 99% of our netCDF output unless maybe we have some weird
> metadata:
>
> // global attributes:
>            :_NCProperties = "version=2,netcdf=4.7.4,hdf5=1.10.6," ;
>            :_SuperblockVersion = 0 ;
>            :_IsNetcdf4 = 1 ;
>            :_Format = "netCDF-4" ;
>
> Any ideas what I might be able to do to make this file work with bm_file?
> Or, perhaps, is there a netCDF-4 equivalent of it?
>
> Matt
>
> PS: Or, I suppose, any hints/thoughts on how to make deflation faster in
> netCDF (netCDF-Fortran, technically)?
>
> --
> Matt Thompson, SSAI, Ld Scientific Programmer/Analyst
> NASA GSFC,    Global Modeling and Assimilation Office
> Code 610.1,  8800 Greenbelt Rd,  Greenbelt,  MD 20771
> Phone: 301-614-6712                 Fax: 301-614-6246
> http://science.gsfc.nasa.gov/sed/bio/matthew.thompson