Re: Strings (was: Re: HDF5 bitfields...)

Ed,

> Russ and John, are we going to say that not all types are convertible
> in netCDF? At this moment, any type can be converted into any other,
> but are we going to try that with string? Doesn't seem to make much
> sense...

No, in netCDF there is no conversion supported between text and
numeric types.  As it says in section 3.3 of the Users Guide:

  If the netCDF external type for a variable is char, only character
  data representing text strings can be written to or read from the
  variable. No automatic conversion of text data to a different
  representation is supported.

The only conversions supported are among numeric types.
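For example, the distinction looks like this in the netCDF-3 C API (a sketch, assuming a file that already contains an int variable and a char variable; the variable names and ids here are made up for illustration):

```c
#include <netcdf.h>

/* Sketch: numeric-to-numeric conversion is automatic on read/write,
 * but text data is never converted to a numeric representation. */
void read_example(int ncid, int t_varid, int name_varid)
{
    float fval[10];
    char  text[10];

    /* OK: external int data is converted to float on read. */
    nc_get_var_float(ncid, t_varid, fval);

    /* OK: char data read as text. */
    nc_get_var_text(ncid, name_varid, text);

    /* Not supported: reading a char variable into a numeric type
     * returns an error (NC_ECHAR) rather than converting. */
    /* nc_get_var_float(ncid, name_varid, fval); */
}
```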

--Russ
>From owner-netcdf-hdf@xxxxxxxxxxxxxxxx Fri, 16 Jul 2004 10:09:36 -0600
Message-ID: <wrxu0w8q6gv.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 16 Jul 2004 10:09:36 -0600
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
In-Reply-To: <200407160427.i6G4RoTQ005204@xxxxxxxxxxxxxxxxxxxxxx>
To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: Re: parallel I/O and netCDF-4
Received: (from majordo@localhost)
        by unidata.ucar.edu (UCAR/Unidata) id i6GG9bt9007626
        for netcdf-hdf-out; Fri, 16 Jul 2004 10:09:37 -0600 (MDT)
Received: from rodney.unidata.ucar.edu (rodney.unidata.ucar.edu 
[128.117.140.88])
        by unidata.ucar.edu (UCAR/Unidata) with ESMTP id i6GG9aaW007622
        for <netcdf-hdf@xxxxxxxxxxxxxxxx>; Fri, 16 Jul 2004 10:09:36 -0600 (MDT)
Organization: UCAR/Unidata
Keywords: 200407161609.i6GG9aaW007622
References: <200407160427.i6G4RoTQ005204@xxxxxxxxxxxxxxxxxxxxxx>
Lines: 153
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-netcdf-hdf@xxxxxxxxxxxxxxxx
Precedence: bulk
Reply-To: netcdf-hdf@xxxxxxxxxxxxxxxx

Quincey Koziol <koziol@xxxxxxxxxxxxx> writes:

>     From HDF5's perspective, you have to use H5Pset_fapl_<foo>(params) to
> choose to use a particular file driver to access a file.  Probably something
> like this should be exported/translated out to the netCDF4 layer for users to
> choose which driver to access the file with.
>     Here's the URL for the parallel HDF5 info currently:
>         http://hdf.ncsa.uiuc.edu/HDF5/PHDF5/

I'm seeing three steps to parallel HDF5:

1 - Initialize MPI
2 - When opening/creating the file, set a property in file access
properties. 
3 - Every time you read or write the file, pass a correctly set
transfer property.
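If I have that right, the minimal pattern in C would be something like this (a sketch only; dataset creation and the actual H5Dwrite call are elided):

```c
#include <mpi.h>
#include <hdf5.h>

int main(int argc, char **argv)
{
    /* Step 1: initialize MPI before any parallel HDF5 calls. */
    MPI_Init(&argc, &argv);

    /* Step 2: select the MPI-IO driver in the file access
     * property list when creating (or opening) the file. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* Step 3: pass a transfer property list on every read or write,
     * choosing collective or independent I/O. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    /* ... create the dataset, then
     * H5Dwrite(dset, memtype, mspace, fspace, dxpl, buf); ... */

    H5Pclose(dxpl);
    H5Fclose(file);
    H5Pclose(fapl);
    MPI_Finalize();
    return 0;
}
```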

Does that seem to sum it up?

But I see below that you are also noting that "these properties must
be set to the same values when they are used in a parallel program."
What do you mean by that?

In parallel I/O, do multiple processes try to create the file? Or does
one create it, and the rest just open it? Sorry if that seems like a
dumb question!

> 
> > For reading, what does this mean to the API, if anything? 
>     Well, I've appended a list of HDF5 API functions that are required to be
> performed collectively to the bottom of this document (I can't find the link
> on our web-pages).
> 
> > Everyone gets to open the file read-only, and read from it to their
> > heart's content, confident that they are getting the most recent data
> > at that moment. That requires no API changes.
> > 
> > Is that it for readers? Or do they get some special additional
> > features, like notification of data arrival, etc?
>     User's would also need the option to choose to use collective or
> independent I/O when reading or writing data to the file.  That reminds me -
> are y'all planning on adding any wrappers to the H5P* routines in HDF5 which
> set/get various properties for objects?

This is truly an important question that I will treat in its own
email thread...


> 
>     Quincey
> 
> ==============================================================
> 
> Collective functions:
>     H5Aclose (2)
>     H5Acreate
>     H5Adelete
>     H5Aiterate
>     H5Aopen_idx
>     H5Aopen_name
>     H5Aread (6)
>     H5Arename (A)
>     H5Awrite (3)
> 
>     H5Dclose (2)
>     H5Dcreate
>     H5Dfill (6) (A)
>     H5Dopen
>     H5Dextend (5)
>     H5Dset_extent (5) (A)
> 
>     H5Fclose (1)
>     H5Fcreate
>     H5Fflush
>     H5Fmount
>     H5Fopen
>     H5Funmount
> 
>     H5Gclose (2)
>     H5Gcreate
>     H5Giterate
>     H5Glink
>     H5Glink2 (A)
>     H5Gmove
>     H5Gmove2 (A)
>     H5Gopen
>     H5Gset_comment
>     H5Gunlink
> 
>     H5Idec_ref (7) (A)
>     H5Iget_file_id (B)
>     H5Iinc_ref (7) (A)
> 
>     H5Pget_fill_value (6)
> 
>     H5Rcreate
>     H5Rdereference
> 
>     H5Tclose (4)
>     H5Tcommit
>     H5Topen
> 
>     Additionally, these properties must be set to the same values when they
> are used in a parallel program:
>         File Creation Properties:
>             H5Pset_userblock
>             H5Pset_sizes
>             H5Pset_sym_k
>             H5Pset_istore_k
> 
>         File Access Properties:
>             H5Pset_fapl_mpio
>             H5Pset_meta_block_size
>             H5Pset_small_data_block_size
>             H5Pset_alignment
>             H5Pset_cache
>             H5Pset_gc_references
> 
>         Dataset Creation Properties:
>             H5Pset_layout
>             H5Pset_chunk
>             H5Pset_fill_value
>             H5Pset_deflate
>             H5Pset_shuffle
> 
>         Dataset Access Properties:
>             H5Pset_buffer
>             H5Pset_preserve
>             H5Pset_hyper_cache
>             H5Pset_btree_ratios
>             H5Pset_dxpl_mpio
> 
>     Notes:
>         (1) - All the processes must participate only if this is the last
>             reference to the file ID.
>         (2) - All the processes must participate only if all the file IDs for
>             a file have been closed and this is the last outstanding object 
> ID.
>         (3) - Because the raw data for an attribute is cached locally, all
>             processes must participate in order to guarantee that future
>             H5Aread calls return the correct results on all processes.
>         (4) - All processes must participate only if the datatype is for a
>             committed datatype, all the file IDs for the file have been closed
>             and this is the last outstanding object ID.
>         (5) - All processes must participate only if the number of chunks in
>             the dataset actually changes.
>         (6) - All processes must participate only if the datatype of the
>             attribute is a variable-length datatype (sequence or string).
>         (7) - This function may be called independently if the object ID does
>             not refer to an object that was collectively opened.
> 
>         (A) - Available only in v1.6 or later versions of the library.
>         (B) - Available only in v1.7 or later versions of the library.

>From owner-netcdf-hdf@xxxxxxxxxxxxxxxx Fri, 16 Jul 2004 10:16:18 -0600
Message-ID: <wrxpt6wq65p.fsf@xxxxxxxxxxxxxxxxxxxxxxx>
Date: 16 Jul 2004 10:16:18 -0600
From: Ed Hartnett <ed@xxxxxxxxxxxxxxxx>
In-Reply-To: <200407160427.i6G4RoTQ005204@xxxxxxxxxxxxxxxxxxxxxx>
To: netcdf-hdf@xxxxxxxxxxxxxxxx
Subject: wrap HDF5 property lists for netCDF-4?
Received: (from majordo@localhost)
        by unidata.ucar.edu (UCAR/Unidata) id i6GGGJ4r008530
        for netcdf-hdf-out; Fri, 16 Jul 2004 10:16:19 -0600 (MDT)
Received: from rodney.unidata.ucar.edu (rodney.unidata.ucar.edu 
[128.117.140.88])
        by unidata.ucar.edu (UCAR/Unidata) with ESMTP id i6GGGIaW008526
        for <netcdf-hdf@xxxxxxxxxxxxxxxx>; Fri, 16 Jul 2004 10:16:18 -0600 (MDT)
Organization: UCAR/Unidata
Keywords: 200407161616.i6GGGIaW008526
References: <200407160427.i6G4RoTQ005204@xxxxxxxxxxxxxxxxxxxxxx>
Lines: 34
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-netcdf-hdf@xxxxxxxxxxxxxxxx
Precedence: bulk
Reply-To: netcdf-hdf@xxxxxxxxxxxxxxxx


Quincey raises an interesting question: are we going to somehow wrap
HDF5 property lists? Or shall we expose them to the netCDF-4 user?

For example, the HDF5 file creation process involves two property
lists, one for creation properties and one for access properties.

Currently we make various assumptions about what is desired when
handling an nc_create call; that is, we just go with the default
property lists.

But we probably also want to add a function to netCDF which directly
exposes those property lists, allowing the user to construct and
manipulate them with HDF5 functions; netCDF-4 would then hand the
property lists down to the HDF5 file creation function it calls.

This is a bit of a weird mix of HDF5 and netCDF-4, but it will allow
the user to take advantage of new features in HDF5 without updating
any netCDF-4 code. 

Or perhaps this is a bridge too far for Russ and Mike in terms of
mixing the interface.

I note that we face a similar decision about compound types. We could
wrap HDF5 functions in netCDF functions that conform to our current
interface, but why not just use the perfectly good HDF5 interface for
creating compound types, and then pass the hid_t into a netCDF
function?

(Of course then we have to ask: what if they use an HDF5 type not
supported in netCDF, if there are any such types?)

Ed