Hi Charlie:

> Hi John and NcML people,
>
> I hope you are well.
>
> We are preparing NCO 4.3.9 with the bold claim that its "Output
> validates without errors against NcML 2.2 schema."

The NcML schema lives here:

http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2.xsd

Technically, "validate" means XML schema validation, which most XML parsers will check for you.

> While I think this
> is true in letter, I'm unsure this is true in spirit, mainly because
> the toolsui interface (which I use to check NCO output for schema
> compliance) generates NcML with "_Netcdf4Dimid" elements for which
> documentation is scarce and may be outdated. Those elements appear to
> be either incomplete or unnecessary---it's clear that they are
> dimension IDs, yet only one is recorded even for multidimensional
> variables. Still, the toolsui schema does not complain when
> _Netcdf4Dimid elements are omitted, as NCO currently does.

I think "in spirit" probably means that the netcdf-java / CDM library does the right thing. Schema validation is only the first layer of that; CF compliance is the next layer, and it has nothing to do with the XML schema.

_Netcdf4Dimid is a "real" attribute in netcdf-4 files, apparently meaning:

  // on dimension scales, holds a scalar H5T_NATIVE_INT which is the (zero-based) dimension ID for this dimension.
  // used to maintain creation order

It's a kludge for using HDF5; I'm leaving those attributes in, in case the user cares about creation order. netcdf C probably removes them, since it puts the dimensions in creation order.

> Should I add _Netcdf4Dimid elements to NCO NcML output?
> If so, is there a rule for which dimension to add that element for
> in the case of multi-dimensional variables?

No, you should not. I assume the problem is that you are comparing the output of Java and C? I have code that ignores them when comparing the Java and C libraries.
There are a few other things to ignore also:

  // added by cdm
  if (name.equals(CDM.CHUNK_SIZE)) return false;
  if (name.equals(CDM.FILL_VALUE)) return false;
  if (name.equals("_lastModified")) return false;

  // hidden by nc4
  if (name.equals(Nc4.NETCDF4_DIMID)) return false; // preserve the order of the dimensions
  if (name.equals(Nc4.NETCDF4_COORDINATES)) return false; // ??
  if (name.equals(Nc4.NETCDF4_STRICT)) return false;

where:

  // special attribute names used by netcdf4 library
  static public final String NETCDF4_COORDINATES = "_Netcdf4Coordinates"; // only on the multi-dimensional coordinate variables of the netCDF model (2D chars)
  // appears to hold the dimension ids of the 2 dimensions

  static public final String NETCDF4_DIMID = "_Netcdf4Dimid"; // on dimension scales, holds a scalar H5T_NATIVE_INT which is the (zero-based) dimension ID for this dimension.
  // used to maintain creation order

  static public final String NETCDF4_STRICT = "_nc3_strict"; // global - when using classic model

  public static final String CHUNK_SIZE = "_ChunkSize";
  public static final String FILL_VALUE = "_FillValue";

> In any case, I attach a sample input file and its NcML output
> generated by ncks in case you have the time and inclination to check
> whether the NcML is truly standards-compliant in a way that only a
> human can. Also wondering whether NcML really wants shape="" elements
> for scalar variables, which would seem redundant, yet I will go by
> your recommendation.

shape is not technically required, but I think the code needs it. One could say that if it is not specified, assume scalar. For now, it is safer to leave it in.

> Also, I rather randomly picked a separator = "*|*" for strings, in
> order to avoid generating NcML with ambiguous whitespace separators
> for arrays of strings. If there is a preferred string separator,
> please let me know.

I use "," for readability, but it needs to be something that is not already in one of the strings. To be sure, you should scan the strings first.
Otherwise, "*|*" is as good as anything.

BTW, in your example, reading in_grp.ncml is barfing because g11/string_var is a scalar in the original file, but because there are embedded blanks, and blank is the default separator, it sees 33 values. So you need the separator.

Thanks for your test file; I'm checking to see what issues it comes up with (just trying to open the NcML in ToolsUI/viewer). For example, CDM doesn't actually support unsigned longs; we just pretend they are signed. I'll think about a workaround for NcML reading.

I'll let you know if I see anything else.

Regards,
John

>
> Thanks!
> c
>
> p.s. output generated by current ncks snapshot with
> ncks --xml in_grp.nc > in_grp.ncml
> --
> Charlie Zender, Earth System Sci. & Computer Sci.
> University of California, Irvine 949-891-2429 )'(
>

Ticket Details
===================
Ticket ID: VKJ-807633
Department: Support netCDF Java
Priority: Normal
Status: Open

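The advice above to scan the strings before settling on a separator could be sketched roughly as follows. This is only an illustration, not code from NCO or netcdf-java: the class name SeparatorChooser, the method chooseSeparator, and the candidate list are all hypothetical, and a real writer would also need a fallback (such as escaping) when every candidate collides.

```java
import java.util.List;

public class SeparatorChooser {
    // Hypothetical candidate list, tried in order: "," for readability, then "*|*".
    static final List<String> CANDIDATES = List.of(",", "*|*", ";", "|");

    /**
     * Returns the first candidate separator that does not occur inside any
     * of the given strings, or null if every candidate collides.
     */
    static String chooseSeparator(List<String> values) {
        for (String candidate : CANDIDATES) {
            boolean collides = false;
            for (String value : values) {
                if (value.contains(candidate)) {
                    collides = true;
                    break;
                }
            }
            if (!collides) {
                return candidate;
            }
        }
        return null; // caller must handle the "no safe separator" case
    }

    public static void main(String[] args) {
        // The first value contains a comma, so "," is rejected for this array.
        List<String> values = List.of("one string, with a comma", "another string");
        String sep = chooseSeparator(values);
        System.out.println("separator: " + sep);
        System.out.println("joined:    " + String.join(sep, values));
    }
}
```

The chosen separator would then be written alongside the joined values so that a reader splits on it rather than on whitespace.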
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.