Hi Logan,

We are aware of the overhead, although there is unfortunately little we can do about it. I assume you are working with either the netCDF-C or netCDF-Fortran library, using the netCDF4 data model/file format. In this case, the file I/O is handled by the underlying HDF5 library. We are looking at other compression alternatives, but the tradeoff between compression and I/O speed is fairly immutable (as I'm sure you're aware).

As for converting the data from floating point to integer, or adopting any sort of lossy compression: these would certainly benefit netCDF, but we have received a lot of pushback from our community when the topic has been broached in the past. The objection, as I recall, is that they didn't want to lose any of their data.

I'm sorry I can't provide a more immediately useful solution; we're hoping to have alternative compression techniques available in the future that will provide a better speed/storage tradeoff.

-Ward

> To whom it may concern,
>
> I am currently working on refactoring I/O code for the National Water
> Model, which is being run operationally at NCEP to support hydrologic
> prediction for the National Weather Service. Part of this refactoring
> involves converting both gridded and point values from floating point to
> integer values via the scale_factor/add_offset attributes. I am also using
> internal NetCDF compression when writing output. The scale of this
> modeling system permits output on 1 km grids across CONUS, along with a
> couple of variables on 250 meter grids. The point output is across 2.7
> million river reaches. I have been testing the model with and without
> internal compression. In my tests, I have seen that the compression adds a
> significant amount of time to I/O: up to 25% in some cases, with a minimum
> of 13% additional I/O time. While for smaller model projects, or research
> projects, this may be a value that can be neglected, it does become an
> issue in an operational environment. I am wondering if this is an outcome
> of internal compression the Unidata NetCDF team is aware of? In some work I
> did years ago, we converted output from floating point to integer, and
> wrote the output directly to a gzipped file from Fortran via a C wrapper.
> In that case, the I/O time was significantly reduced, even though we were
> compressing the data.
>
> Thanks for any input you may have.
>
> LK
>
> --
> Logan Karsten
> Associate Scientist III
> Research Applications Laboratory
> National Center for Atmospheric Research
> 303-497-2693
>
>

Ticket Details
===================
Ticket ID: HHS-891616
Department: Support netCDF
Priority: Normal
Status: Closed
===================
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.
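
For readers following up on this thread, the scale_factor/add_offset packing and the compression-level tradeoff discussed above can be sketched in a few lines. This is only an illustration using the Python standard library; it is not the National Water Model code and it does not call the netCDF or HDF5 APIs. The sample values, the 0.01 scale factor, and the helper names pack and unpack are assumptions made for the example.

```python
import zlib
from array import array


def pack(values, scale_factor, add_offset):
    """Pack floats into 16-bit integers: packed = round((v - add_offset) / scale_factor)."""
    return array("h", (round((v - add_offset) / scale_factor) for v in values))


def unpack(packed, scale_factor, add_offset):
    """Recover approximate floats: v = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]


# Hypothetical values, not data from the thread. They must fit the int16
# range after scaling, which is why the range is kept small here.
values = [0.013 * i for i in range(5000)]
scale_factor, add_offset = 0.01, 0.0

packed = pack(values, scale_factor, add_offset)
restored = unpack(packed, scale_factor, add_offset)

# Packing is lossy: the error is bounded by half the scale factor.
max_error = max(abs(a - b) for a, b in zip(values, restored))
print("max round-trip error:", max_error)

# Compare compressed sizes for 32-bit floats and packed 16-bit integers at
# several zlib levels; higher levels generally cost more CPU time.
raw_floats = array("f", values).tobytes()
packed_bytes = packed.tobytes()
for level in (1, 6, 9):
    float_size = len(zlib.compress(raw_floats, level))
    packed_size = len(zlib.compress(packed_bytes, level))
    print("level", level, "float32 bytes:", float_size, "int16 bytes:", packed_size)
```

The error bound printed above is the kind of data loss Ward's reply refers to when describing the community's objections to lossy storage. Timing each zlib.compress call on representative output would be one way to see how the compression level affects write time for a given dataset.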