# Proposal for Generated Filter Code

I am starting work on a new netcdf utility that takes a simplified filter specification and uses it to generate a complete HDF5 filter wrapper [2] plus an NCZarr filter wrapper.

The raison d'etre is that building an HDF5 filter wrapper [2,3] from scratch is complex, time-consuming, and error-prone. A code generator is likely to simplify this process. At the very least, it will produce base code that a filter builder can modify to build the desired wrapper. The program is analogous to, say, the yacc parser generator, which converts an annotated BNF grammar into a full-blown parser.

**What I need:** I have a simple prototype working, but I need some community input on this idea. Would anyone use it? Is the proposed specification (Appendix A) reasonably simple to construct? If you want to participate, use this [GitHub discussion](https://github.com/Unidata/netcdf-c/discussions/2288).

# Specification Overview

The filter specification is written in JSON, although it is highly stylized. It was derived from the NumCodecs [4] format, but with significant extensions to support the netCDF-4/HDF5 wrapper format. Two visible extensions with respect to JSON are:

1. Single-line comments are supported, beginning with '#'.
2. An alternate string delimiter is provided using the '`' character, chosen because occurrences of that delimiter in C code are very uncommon.

The basic specification is a JSON dictionary with very specific keys that are used to control code generation. A draft example specifying the zstandard filter wrapper is shown in Appendix A. The various dictionary keys provide filter information.

* **"id"** -- specifies the NumCodecs name (Zstd) and the HDF5 assigned identifier (32015); it also specifies an alternate preferred name.
* **"parameters"** -- a dictionary whose keys are the parameter names as specified by NumCodecs, and whose values are keywords indicating the type of the corresponding parameter. The allowable types are "integer", "float", or an enumeration (not described here).
* **"initialize"** -- the value is a piece of code to initialize the filter before use.
* **"finalize"** -- the value is a piece of code to shut down the filter after all use is complete.
* **"prefix"** -- arbitrary code to insert at the front of the filter wrapper; typically used to include filter-library-specific headers.
* **"suffix"** -- arbitrary code to insert at the end of the filter wrapper; typically used to include filter-library-specific utility functions.
* **"encode"** -- a function name plus the code for a user-provided function to invoke the filter's encoding/compression capability; this has a very specific signature.
* **"decode"** -- a function name plus the code for a user-provided function to invoke the filter's decoding/decompression capability; this has a very specific signature.

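To make the yacc analogy concrete, here is a minimal sketch of how the keys listed above might drive code generation. It is not the prototype's code. The `FilterSpec` struct, the `emit_wrapper` function, and the generated `<name>_filter` dispatcher are hypothetical names introduced only for illustration. The dispatcher's parameter list and the `H5Z_FLAG_REVERSE` test follow the HDF5 filter callback convention. The "parameters", "initialize", and "finalize" keys are left out to keep the sketch short, and the generated dispatcher deliberately leaves error handling and buffer hand-off as a TODO.

```c
#include <stdio.h>

/* Hypothetical in-memory form of one filter specification. */
typedef struct FilterSpec {
    const char* name;        /* preferred name, e.g. "zstandard" */
    unsigned int id;         /* HDF5 filter identifier, e.g. 32015 */
    const char* prefix;      /* code placed at the front of the wrapper */
    const char* encode_name; /* name of the user-provided encode function */
    const char* encode_code; /* full text of the encode function */
    const char* decode_name; /* name of the user-provided decode function */
    const char* decode_code; /* full text of the decode function */
    const char* suffix;      /* code placed at the end of the wrapper */
} FilterSpec;

/* Write a wrapper skeleton for spec to out.
 * Returns 0 on success, -1 if the stream reports a write error. */
int emit_wrapper(FILE* out, const FilterSpec* spec)
{
    fprintf(out, "/* Generated wrapper for filter %s (id %u) */\n",
            spec->name, spec->id);
    fprintf(out, "%s\n", spec->prefix);
    fprintf(out, "%s\n", spec->encode_code);
    fprintf(out, "%s\n", spec->decode_code);
    /* Dispatcher: choose decode or encode from the HDF5 flags. */
    fprintf(out,
        "static size_t %s_filter(unsigned int flags, size_t cd_nelmts,\n"
        "    const unsigned int cd_values[], size_t nbytes,\n"
        "    size_t* buf_size, void** buf)\n"
        "{\n"
        "    size_t dstlen = 0;\n"
        "    void* dstbuf = NULL;\n"
        "    if(flags & H5Z_FLAG_REVERSE)\n"
        "        %s(nbytes, *buf, &dstlen, &dstbuf, cd_nelmts, cd_values);\n"
        "    else\n"
        "        %s(nbytes, *buf, &dstlen, &dstbuf, cd_nelmts, cd_values);\n"
        "    /* TODO: check for errors, then hand dstbuf back via *buf */\n"
        "    return dstlen;\n"
        "}\n",
        spec->name, spec->decode_name, spec->encode_name);
    fprintf(out, "%s\n", spec->suffix);
    return ferror(out) ? -1 : 0;
}
```

Calling `emit_wrapper(stdout, &spec)` with a `FilterSpec` filled in from the Appendix A draft would print the prefix, the two user-provided functions, a dispatcher named `zstandard_filter`, and the suffix, in that order. A filter builder could then edit that base code by hand.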
# References

[1] [HDF5 Filter Specification](https://support.hdfgroup.org/HDF5/doc/Advanced/DynamicallyLoadedFilters/HDF5DynamicallyLoadedFilters.pdf)<br>
[2] [Registered HDF5 Filter Plugins](https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins)<br>
[3] [User Contributed Filters](https://support.hdfgroup.org/services/contributions.html#filters)<br>
[4] [NumCodecs](https://numcodecs.readthedocs.io/en/stable/)<br>

# Appendix A. Zstandard Draft Example

````
{
"id": {"zstd": 32015, "preferred": "zstandard"},
"parameters": [{"level": "integer"}],
"encode": ["name": "zstd_compress",
           "code": # The signature is standardized
`size_t zstd_compress(size_t srclen, void* srcbuf, size_t* dstlenp, void** dstbufp, size_t cd_nelmts, const unsigned int* cd_values)
{
    int ret = NC_NOERR;
    size_t dstlen;
    void* dstbuf;
    dstlen = (size_t)ZSTD_compressBound(srclen);
    if(ZSTD_isError(dstlen)) {ret = NC_EFILTER; goto cleanup;}
    /* Prepare the destination buffer. */
    if((dstbuf = malloc(dstlen))==NULL) {ret = NC_ENOMEM; goto cleanup;}
    dstlen = ZSTD_compress(dstbuf, dstlen, srcbuf, srclen, /*level*/cd_values[0]);
    if(ZSTD_isError(dstlen)) {free(dstbuf); ret = NC_EFILTER; goto cleanup;}
    if(dstlenp) *dstlenp = dstlen;
    if(dstbufp) *dstbufp = dstbuf;
cleanup:
    return ret;
}`],
"decode": ["name": "zstd_decompress",
           "code": # The signature is standardized
`...`],
"prefix": `...`,
"suffix": `...`
}
````
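The draft above uses both lexical extensions from the overview: '#' comments and backtick-delimited strings. The following is a minimal sketch of a preprocessing pass that rewrites those two extensions into standard JSON syntax, so that the code text can be handed to an ordinary JSON tokenizer. The function name `spec_to_json` is hypothetical. The sketch assumes that backtick strings are taken literally (no escape sequences inside them) and that a '#' outside any string always starts a comment. It does not address the draft's other stylizations, such as key/value pairs inside square brackets.

```c
#include <stdlib.h>
#include <string.h>

/* Rewrite '#' comments and backtick strings into standard JSON syntax.
 * Returns a malloc'd string that the caller must free, or NULL on
 * allocation failure or an unterminated string. */
char* spec_to_json(const char* src)
{
    size_t len = strlen(src);
    /* Each input character expands to at most two output characters. */
    char* out = malloc(2 * len + 1);
    char* q = out;
    const char* p = src;

    if (out == NULL) return NULL;
    while (*p != '\0') {
        if (*p == '#') {
            /* Comment: skip to end of line; the newline itself is kept. */
            while (*p != '\0' && *p != '\n') p++;
        } else if (*p == '"') {
            /* Ordinary JSON string: copy verbatim, honoring escapes. */
            *q++ = *p++;
            while (*p != '\0' && *p != '"') {
                if (*p == '\\' && p[1] != '\0') *q++ = *p++;
                *q++ = *p++;
            }
            if (*p == '\0') { free(out); return NULL; }
            *q++ = *p++; /* closing quote */
        } else if (*p == '`') {
            /* Backtick string: re-quote and escape for JSON.
             * Other control characters are not handled in this sketch. */
            p++;
            *q++ = '"';
            while (*p != '\0' && *p != '`') {
                switch (*p) {
                case '"':  *q++ = '\\'; *q++ = '"';  break;
                case '\\': *q++ = '\\'; *q++ = '\\'; break;
                case '\n': *q++ = '\\'; *q++ = 'n';  break;
                case '\r': *q++ = '\\'; *q++ = 'r';  break;
                case '\t': *q++ = '\\'; *q++ = 't';  break;
                default:   *q++ = *p;                break;
                }
                p++;
            }
            if (*p == '\0') { free(out); return NULL; }
            p++; /* closing backtick */
            *q++ = '"';
        } else {
            *q++ = *p++;
        }
    }
    *q = '\0';
    return out;
}
```

Because double quotes and backslashes are escaped only inside backtick strings, C code such as `#include "zstd.h"` or `printf("%d\n", x)` can be written in a "prefix" or "code" value without manual escaping, which appears to be the motivation for the alternate delimiter.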