netcdf-hdf mailing list is no longer active. The list archives are made available for historical reasons.
Hi,

For your information, I'm forwarding this reply I just got from Jeff Long of Lawrence Livermore Laboratories describing the SILO extensions to netCDF. Jeff's description makes it clear that there is more to his extensions than can be easily represented with attribute and variable-name conventions, which is how I had earlier characterized them.

I haven't responded to this note yet, but Jeff's extensions appear tastefully done. They make some things a bit more complicated (e.g. an open netCDF file now has a "current working directory") and I'm not convinced that adding two primitives (directories and objects) was necessary rather than one, but I think this deserves more study.

If you have comments, I can try to incorporate them into a reply.

--Russ

Russ,

I have defined a database API, called SILO, which is based on the netCDF interface. SILO is intended to be fully compatible with existing netCDF applications. What distinguishes SILO are two new 'primitives' it adds to the netCDF model: directories and objects. Both of these extensions were added in an unobtrusive way; that is, no changes were made to existing netCDF functions. Our current implementation of SILO rests on a local database library, but we are very interested in using the netCDF/HDF merge being done by NCSA.

The directory primitive allows the user to organize a database file into a hierarchical structure analogous to the Unix file system. Each directory created in a SILO file can be thought of conceptually as a virtual netCDF file: it has its own dimensions, variables, attributes and so on. An inquiry function will only return the contents of the current directory. In keeping with the netCDF model, however, there is just one set of global attributes, and just one unlimited dimension ID.

One difficult decision I had to make regarding directories dealt with identifiers for variables and dimensions.
As you know, with netCDF you can determine how many variables there are in a file, and automatically know that their identifiers range from 0 to nvars-1 (for C). When a file contains multiple directories, however, an extra level of complexity is added. What I finally decided was to treat each directory like netCDF treats the file -- within a directory, variable identifiers range from 0 to nvars-1, where nvars is the number of variables IN THAT DIRECTORY. I refer to this scheme as "relative" identifiers. Therefore, to uniquely identify any entity in the file, one needs three items: the parent directory ID, the entity type (variable, dimension) and the entity ID (a variable ID if the entity is a variable, a dimension ID if the entity is a dimension).

My original design called for "absolute" identifiers, which in effect meant that any entity (variable, dimension, etc.) in a file could be uniquely specified with a single identifier. This simplified the interface for the object functions, but required changes to the inquiry functions so that a list of identifiers was returned in addition to the number of variables, dimensions, etc. This was such a big departure from the "natural" netCDF way of doing things that I switched back to relative identifiers.

The programming interface for directories is described below:

  1. Define new directory (mkdir)
       ncdirdef (int sid, char *name);

  2. Get current directory (pwd)
       ncdirget (int sid);

  3. Get dir ID from name
       ncdirid (int sid, char *name);

  4. Inquire about a directory
       ncdirinq (int sid, int dirid, char *name, int *parent, int *nchild);

  5. List dirs beneath current dir (lsd)
       ncdirlist (int sid, int dirid, int *ndirs, int dirids);

  6. Set the current directory (cd)
       ncdirset (int sid, int dirid);

The function ncdirlist() is necessary because, unlike variables and dimensions, directory identifiers are absolute. It is essential that a single identifier can point to any directory within the entire file.

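The difference between relative variable identifiers and absolute directory identifiers can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: every name in it (toy_file, toy_mkdir, toy_chdir, toy_vardef, toy_varid) is hypothetical and is not part of SILO or netCDF, and the fixed-size arrays are chosen purely to keep the example short.

```c
#include <assert.h>
#include <string.h>

#define MAX_DIRS 16
#define MAX_VARS 16
#define NAME_LEN 32

/* Hypothetical toy model; NOT SILO's actual data structures. */
typedef struct {
    char name[NAME_LEN];
    int  parent;                     /* absolute ID of parent dir, -1 for root */
    int  nvars;                      /* variables in THIS directory only */
    char vars[MAX_VARS][NAME_LEN];   /* index into this array = relative var ID */
} toy_dir;

typedef struct {
    toy_dir dirs[MAX_DIRS];          /* index into this array = absolute dir ID */
    int     ndirs;
    int     cwd;                     /* current working directory */
} toy_file;

static void copy_name(char *dst, const char *src) {
    strncpy(dst, src, NAME_LEN - 1);
    dst[NAME_LEN - 1] = '\0';
}

/* Start with a single root directory, which gets absolute ID 0. */
void toy_init(toy_file *f) {
    assert(f != NULL);
    memset(f, 0, sizeof *f);
    copy_name(f->dirs[0].name, "/");
    f->dirs[0].parent = -1;
    f->ndirs = 1;
    f->cwd = 0;
}

/* Create a directory beneath the current one; returns its absolute ID or -1. */
int toy_mkdir(toy_file *f, const char *name) {
    if (f->ndirs >= MAX_DIRS) return -1;
    int id = f->ndirs++;
    copy_name(f->dirs[id].name, name);
    f->dirs[id].parent = f->cwd;
    f->dirs[id].nvars = 0;
    return id;
}

/* Set the current directory by absolute ID; returns 0 on success, -1 on error. */
int toy_chdir(toy_file *f, int dirid) {
    if (dirid < 0 || dirid >= f->ndirs) return -1;
    f->cwd = dirid;
    return 0;
}

/* Define a variable in the current directory; returns its relative ID or -1. */
int toy_vardef(toy_file *f, const char *name) {
    toy_dir *d = &f->dirs[f->cwd];
    if (d->nvars >= MAX_VARS) return -1;
    copy_name(d->vars[d->nvars], name);
    return d->nvars++;
}

/* Look a variable up by name in the current directory only; -1 if absent. */
int toy_varid(const toy_file *f, const char *name) {
    const toy_dir *d = &f->dirs[f->cwd];
    for (int i = 0; i < d->nvars; i++)
        if (strcmp(d->vars[i], name) == 0) return i;
    return -1;
}
```

In this model, defining "x" after changing into two different directories yields relative ID 0 both times, so a variable ID is meaningful only together with the ID of the directory that holds it. Directory IDs, by contrast, are handed out from a single file-wide counter and never repeat.
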
The second extension to the netCDF model provided by SILO is the concept of objects. Objects are simply a mechanism for grouping related information. The components of an object can be variables, dimensions, directories, and even other objects. Components can be in any directory within the file.

The programming interface for objects is described below:

  1. Define an object.
       ncobjdef (int sid, char *name, int type, int ncomps);

  2. Get object ID from name.
       ncobjid (int sid, char *name);

  3. Inquire about object.
       ncobjinq (int sid, int objid, char *name, int *type, int *ncomps);

  4. Write an object.
       ncobjput (int sid, int objid, char *cnames, int cids, int ctypes, int cparents);

  5. Read an object.
       ncobjget (int sid, int objid, char *cnames, int cids, int ctypes, int cparents);

An object is composed of a name, a type, and four parallel lists describing the components of the object. The lists contain the component names, identifiers, types, and parent IDs. Component names are arbitrary, and do not necessarily match the actual names of the variables or dimensions whose IDs are provided. Note that if absolute IDs were used, the types and parents lists could be eliminated.

SILO itself does not impose meaning on the objects within a file. I have a higher level interface which reads and writes certain types of SILO objects.

Having used SILO for about a year now, we have found directories and objects to be very useful additions to the netCDF interface. Objects are essential for dealing with compound data such as physics meshes and their related information. Directories have been heavily used by applications which use multi-block meshes; in the past they had to use a flat file structure and employ a naming scheme to differentiate variables -- they called their variables "x_block1", "x_block2", etc. Now they can create a directory called "block1" and in it write the variable with its natural name, "x".
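
The "four parallel lists" layout of an object can be sketched as a plain C structure. This is an illustration only, not SILO's implementation: the names toy_object, toy_objdef, toy_objadd and toy_objfind, the TOY_* type tags, and the fixed component limit are all hypothetical, since the message does not give SILO's actual type codes or storage layout.

```c
#include <assert.h>
#include <string.h>

#define MAX_COMPS 8

/* Hypothetical entity-type tags; SILO's real values are not given in the message. */
enum toy_type { TOY_VAR, TOY_DIM, TOY_DIR, TOY_OBJ };

/* An object: a name, a type, and four parallel lists describing its components. */
typedef struct {
    const char *name;
    int         type;                 /* meaning is left to the application */
    int         ncomps;
    const char *cnames[MAX_COMPS];    /* arbitrary component names */
    int         cids[MAX_COMPS];      /* relative ID of each component */
    int         ctypes[MAX_COMPS];    /* entity type of each component */
    int         cparents[MAX_COMPS];  /* directory ID holding each component */
} toy_object;

/* Initialise an empty object with the given name and type. */
void toy_objdef(toy_object *o, const char *name, int type) {
    assert(o != NULL);
    o->name = name;
    o->type = type;
    o->ncomps = 0;
}

/* Append one component to all four lists; returns its index or -1 if full. */
int toy_objadd(toy_object *o, const char *cname, int cid, int ctype, int cparent) {
    if (o->ncomps >= MAX_COMPS) return -1;
    int i = o->ncomps++;
    o->cnames[i]   = cname;
    o->cids[i]     = cid;
    o->ctypes[i]   = ctype;
    o->cparents[i] = cparent;
    return i;
}

/* Find a component by its component name; returns its index or -1. */
int toy_objfind(const toy_object *o, const char *cname) {
    for (int i = 0; i < o->ncomps; i++)
        if (strcmp(o->cnames[i], cname) == 0) return i;
    return -1;
}
```

Because the IDs stored in cids are relative, a variable and a dimension in the same directory can both have ID 0, so a component is resolved only by the full triple (cparents[i], ctypes[i], cids[i]). With absolute IDs the cids list alone would suffice, which is the trade-off the message describes.
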
Without these extensions, our applications and databases would be much more difficult to maintain.

Because the NCSA people have shown an interest in the SILO extensions, I am very eager to get feedback from real netCDF users such as yourself. I am completely willing to make modifications to the programming interface or the underlying model if a more reasonable approach is found. In particular, I am interested in your responses to the following questions:

  1. Should the concept of relative IDs be kept, even though it makes it more difficult to specify object components? Should absolute IDs be introduced, even though this departs from the netCDF model? It could be possible to have both schemes simultaneously, and provide a mechanism for mapping between absolute and relative.

  2. Is there a better name for objects than 'objects'? Perhaps 'groups'?

If you have any comments, ideas, or questions, I can be reached via email at "long6@xxxxxxxx", or my phone number is (510)423-6421. I have a detailed document in paper form which I can send to you if you are interested.

Thanks for your help.

Jeff Long
LLNL