[jwlong@xxxxxxxxxxxxxxxxxxxx: ]

I'm forwarding the following reply from Jeff Long about SILO, for your
information ...


Due to problems on my end, the following email message from two weeks
ago got bounced back, and I didn't notice it until today. Sorry about
the delay.

Russ,                                                       May 7, 1992

I appreciate your comments and questions. I am mailing you the SILO
document, just so you have a complete reference. I am happy to answer any
questions via email, though, since that is easier.

Since you do not yet have the document, let me fill you in on some of the
background behind SILO. SILO is currently implemented on top of the
Portable Database Library (PDBLib) written by Stewart Brown of LLNL. We did
this as a stop-gap measure; our long-term goal is to drop the SILO library
altogether, and use the HDF/netCDF merge (assuming that the capabilities
provided by the SILO extensions are available.)

My immediate goal is to have the HDF/netCDF community agree on a directory
and object capability (and programming interface). When that is done, I
will modify the SILO interface to match the agreed-upon interface. We will
then use the SILO/PDBLib combination until such time that the HDF/netCDF
project has a library for us to use. We have bought into the idea of a
standard INTERFACE, and are willing to switch underlying databases.

I agree with your assessment that directories are more fundamental. I will
try to answer the questions regarding them first.  In SILO, every file is
born with a root directory. At present, this directory is called "RootDir",
and its ID is 0. Since SILO is trying to provide a Unix-like hierarchy,
though, I'm considering changing the root directory name to "/".  So, to
answer your question, there IS a default directory even if the user has not
explicitly called ncdirdef().

I have not yet defined a Fortran interface for the directory functions. It
appears that netCDF is using at most six characters for Fortran subroutine
names, which means that the various primitives are limited to one distinct
character ('d' for dimension, 'v' for variable, etc.) Directories would
therefore require a notation other than 'd'; perhaps 'r':

        ncrdef, ncrget, ncrid,  ncrinq, ncrlst, ncrset

The term 'object' is indeed an overused term. I propose that this
capability be referred to as 'groups' from here on. My previous message was
probably unclear regarding some aspects of groups. Let me try to clarify.

Groups within SILO are treated like dimensions, variables, and attributes
in that they are scoped to directories.  That is, a SILO file can contain
multiple objects with the same name, provided they appear in different
directories. (In SILO, only directory ID's are global.) Groups are unique
in that they can be composed of components which reside in other
directories. Group components can be shared between multiple groups (this
is very useful, and frequently used.)

To answer your questions regarding groups directly:
        o The 'type' of an object is like a tag in the HDF world. It 
          indicates what the group contains; current group types are:
          quad-mesh, unstructured-mesh, and so on. SILO itself does not
          distinguish between the different types of groups; we have a
          higher level of functions (called SLIDE) which does that.
        o Group ID's are unique only within directories.
        o We have not yet encountered a need to edit (add to, delete from)
          a group.
        o The same component can be in multiple groups.

Here is a diagram of what a group might look like:

   Group Name   = "Sample"
   Group Type   = SLIDE_QUADMESH
   # Components = 5

   Component   Component   Component   Component
     Name        Type         ID        Parent
   "X Coords"    var          15          1     {2D var, in dir 1}
   "Y Coords"    var          16          2     {2D var, in dir 2}
   "Num Dims"    dim           8          0     {dim with size = 2}
   "Dims"        var           2          0     {array of dim values}
   "Coord Sys"   var           6          0     {used like attribute}

<< Note that the component types are actually defined constants, and will
   be of type integer.>>
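
To make the table above concrete, here is a minimal C sketch of how such a
group could be laid out in memory. The struct names, the enum values, and
group_find() are made up for illustration; they are not part of SILO or
SLIDE, and the real component-type constants may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical component type codes (the real defined constants differ). */
enum comp_type { COMP_DIM = 1, COMP_VAR = 2, COMP_GROUP = 3 };

/* Hypothetical stand-in for the SLIDE_QUADMESH group type. */
enum { SKETCH_QUADMESH = 100 };

/* One row of the diagram: name, type, ID, and parent directory ID. */
struct group_component {
    const char *name;
    int type;
    int id;
    int parent;
};

struct group {
    const char *name;
    int type;
    size_t ncomps;
    const struct group_component *comps;
};

/* The "Sample" group from the diagram. */
static const struct group_component sample_comps[] = {
    { "X Coords",  COMP_VAR, 15, 1 },
    { "Y Coords",  COMP_VAR, 16, 2 },
    { "Num Dims",  COMP_DIM,  8, 0 },
    { "Dims",      COMP_VAR,  2, 0 },
    { "Coord Sys", COMP_VAR,  6, 0 },
};

static const struct group sample_group = {
    "Sample",
    SKETCH_QUADMESH,
    sizeof sample_comps / sizeof sample_comps[0],
    sample_comps
};

/* Look a component up by name; returns NULL if the group has no such name. */
static const struct group_component *
group_find(const struct group *g, const char *name)
{
    size_t i;

    assert(g != NULL && name != NULL);
    for (i = 0; i < g->ncomps; i++) {
        if (strcmp(g->comps[i].name, name) == 0)
            return &g->comps[i];
    }
    return NULL;
}
```

Because an ID is only meaningful together with its parent directory, the
sketch keeps both in every component record.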

I am intrigued by your suggestion of using attributes to define groups,
rather than extending the interface. Because there can be multiple objects
with the same name within a file, I don't believe we could use global
attributes to describe a group. Maybe I'm missing something. Anyway, I have
come up with a couple of variations on the attribute scheme which I'd like
to throw out for discussion. I will use the object described above for
illustrating each.

1.  Adopt a convention such that variables whose name begins with "GROUP_"
are in fact group variables. The value of the variable would be some kind
of string representation which describes the dimensions, variables, and
groups which comprise the group. An attribute could be used for defining
the type. For example:

        GROUP_Sample = "X Coords,var,15,1;Y Coords,var,16,2;..."
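
A reader for this string form might look roughly like the C sketch below.
It assumes the record layout "name,type,id,parent" separated by ';' as in
the example; parse_group_string() and struct comp_record are hypothetical
names, not part of any existing library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME 64
#define MAX_TYPE 8

/* One decoded "name,type,id,parent" record. */
struct comp_record {
    char name[MAX_NAME];
    char type[MAX_TYPE];
    int id;
    int parent;
};

/*
 * Decode up to max records from s into out.
 * Returns the number of records decoded, or -1 if the string is
 * malformed or holds more than max records.
 */
static int
parse_group_string(const char *s, struct comp_record *out, int max)
{
    int n = 0;

    assert(s != NULL && out != NULL);
    while (*s != '\0') {
        int used = 0;

        if (n == max)
            return -1;
        /* %[^,] keeps embedded blanks; %n reports how much was consumed. */
        if (sscanf(s, "%63[^,],%7[^,],%d,%d%n",
                   out[n].name, out[n].type,
                   &out[n].id, &out[n].parent, &used) != 4)
            return -1;
        s += used;
        n++;
        if (*s == ';')
            s++;                /* step over the record separator */
        else if (*s != '\0')
            return -1;          /* junk after the parent field */
    }
    return n;
}
```

One consequence of this encoding is that component names cannot themselves
contain ',' or ';' unless some escaping rule is added.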

2. Like 1 above, but the value of the variable would be a scalar containing 
the type of group (e.g., SLIDE_QUADMESH). Attributes of this variable
whose names begin with "GROUP_" would define the group components. This
could be done in one of several ways, including:

        Variable "GROUP_Sample" = SLIDE_QUADMESH

        Attribute Name           Attribute Value
        --------------          ---------------
  a.    "GROUP_X Coords_parent"        1
        "GROUP_X Coords_id"            15
        "GROUP_X Coords_type"          var
              . . .

  b.    "GROUP_X Coords"           {1,15,var}
        . . .

  c.    "GROUP_X Coords_var_dir1"       15

3. Rather than using attributes, use a pair of variables to define a group.
One variable would define the component names, the other variable would
define the remaining component data. For example:

        "GROUPNAMES_Sample"  ";X Coords;Y Coords;Num Dims;Dims;Coord Sys;"
        "GROUPDATA_Sample"   {var,15,1,var,16,2,dim,8,0,var,2,0,var,6,0}

The names are packed into a single character array, where the first
character of the array is to be used as the field delimiter (';' in the
example.) The other data is stored in either an n x 3 array or a vector of
length n*3.
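
Unpacking this pair of variables could be sketched in C as follows. The
function names and the field order (type, id, parent) follow the example
above but are otherwise assumptions; nothing here is an existing SILO or
netCDF call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Position of each value within one component's triple. */
enum { FIELD_TYPE = 0, FIELD_ID = 1, FIELD_PARENT = 2, FIELDS_PER_COMP = 3 };

/*
 * Copy the index-th name (0-based) out of a packed name list whose first
 * character is the field delimiter.  Returns the name length, or -1 if
 * the index is out of range or buf is too small.
 */
static int
packed_name(const char *packed, int index, char *buf, size_t buflen)
{
    char delim;
    const char *p;

    assert(packed != NULL && buf != NULL && buflen > 0);
    delim = packed[0];
    if (delim == '\0' || index < 0)
        return -1;

    p = packed + 1;
    while (*p != '\0') {
        const char *end = strchr(p, delim);
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (index == 0) {
            if (len >= buflen)
                return -1;
            memcpy(buf, p, len);
            buf[len] = '\0';
            return (int)len;
        }
        index--;
        if (end == NULL)
            break;
        p = end + 1;
    }
    return -1;
}

/* Fetch one value of component i from the flat vector of length n*3. */
static int
comp_field(const int *data, int i, int field)
{
    assert(data != NULL && i >= 0);
    assert(field >= 0 && field < FIELDS_PER_COMP);
    return data[i * FIELDS_PER_COMP + field];
}
```

Storing the delimiter as the first character lets each file pick a
character that does not occur in any of its component names.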

Regardless of how objects are represented, there is still the issue of
whether an interface is provided which builds these group variables, or if
the user does it explicitly (perhaps via a higher-level interface).

Any comments?


Regarding unlimited dimensions: we do not currently use that feature, so I
basically punted when I said that there is only one per file. Whatever
seems reasonable to you and other netCDF users is okay with me. Likewise, I
punted when it came to define vs. data mode. As PDBLib does not really
adhere to that model, I have glossed over the mode issue in the current
version of SILO. I want SILO to do things the netCDF way, however, so I
expect that will change.

As you found, I am not yet on the netcdfgroup mailing list. The netCDF
document I was working from was marked Version 1.06, which apparently is
quite out of date. I will attempt to get the latest document, as well as
join the mailing list. I expect SILO to be changed to correspond to the
latest version.

Jeff Long                           jwlong@xxxxxxxx
PO Box 808, L-35
Livermore, CA  94551

Message-ID: <wrxd62xcb1b.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 15 Jul 2004 13:42:24 -0600
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: questions about compression...
Organization: UCAR/Unidata

Howdy HDF5 People!

I am looking at your docs to try and learn more about compression and
what it means.

I see the following:

herr_t H5Pset_deflate (hid_t plist_id, int level)
    These functions set or query the deflate level of dataset creation
    property list plist_id. The H5Pset_deflate sets the compression
    method to H5Z_DEFLATE and sets the compression level to some
    integer between one and nine (inclusive). One results in the
    fastest compression while nine results in the best compression
    ratio. The default value is six if H5Pset_deflate isn't called. 

Does this mean that to compress some chunked dataset, I set its
compression method to H5Z_DEFLATE and that turns on compression?

I can't seem to find much info about compression. Am I missing some?