
[netCDF #EAX-879650]: Re: [ESIP-CF] CF Cluster telecon today



Hi Upendra,

>   Thank you for the mail. The documentation on "Groups" says that it can be
> used for "Organizing a large number of variables".  I have a netCDF-3 file
> which contains several  variables on history of data processing methods
> applied to the data:
> 
>         char hist_identcode(num_hist, string2) ;
> hist_identcode:long_name = "History Identification Code" ;
> char hist_prccode(num_hist, string4) ;
> hist_prccode:long_name = "History Processing Code" ;
> hist_prccode:references = "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html#PRC" ;
> hist_prccode:comment = "Identifies the procedure through which the data
> passed." ;
> char hist_version(num_hist, string4) ;
> hist_version:long_name = "History Processing Version" ;
> hist_version:comment = "Identifies the version of the software through
> which the data passed." ;
> char hist_prcdate(num_hist, string8) ;
> hist_prcdate:long_name = "History Processing Date" ;
> hist_prcdate:comment = "Records the date as YYYYMMDD that this history
> record was created." ;
> char hist_actcode(num_hist, string2) ;
> hist_actcode:long_name = "History Action Code" ;
> hist_actcode:references = "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html#PC_HIST" ;
> hist_actcode:comment = "Identifies the action taken against the data by the
> software." ;
> char hist_actparm(num_hist, string4) ;
> hist_actparm:long_name = "History Action Parm" ;
> hist_actparm:references = "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html#PC_PARM" ;
> hist_actparm:comment = "Identifies the measured variable affected by the
> action." ;
> char hist_auxid(num_hist, string8) ;
> hist_auxid:long_name = "History Auxilary Identification" ;
> hist_auxid:comment = "Normally this is the depth at which the value of a
> variable was acted upon by the software." ;
> char hist_ovalue(num_hist, string10) ;
> hist_ovalue:long_name = "History Original Value" ;
> hist_ovalue:comment = "The original value before being acted upon by
> software." ;
> 
> 
> A question came up on whether netCDF-4 could be used to organize this
> information better, and what would be a good netCDF-4 construct to store
> such information: groups or compound types. I was inclined toward using
> compound types. What would you suggest? Please help.

In netCDF-3 or netCDF-4, an attribute can't have attributes; only
variables can have attributes.  Similarly, in netCDF-4 there is no way
(yet) to directly assign attributes to members of compound types.  But
there is a proposed netCDF-4 convention for how to associate
attributes with members of a compound type:

  
http://www.unidata.ucar.edu/netcdf/papers/nc4_conventions.html#Convention_for_Assigning_Attri

So, if I understand the intent, here's how it might be done with
compound types, if all the members are of type "string":

 types:
   compound hist_t {
     string identcode ;
     string prc_code ;
     string version ;
     string prcdate ;
     string actcode ;
     string actparm ;
     string auxid ;
     string ovalue ;
   };
 dimensions:
   num_hist = unlimited ;  // or however many histories you have
 variables:
   hist_t history(num_hist) ;
        hist_t history:long_name = 
          {"History Identification Code", 
           "History Processing Code",
           "History Processing Version", 
           "History Processing Date",
           "History Action Code", 
           "History Action Parm", 
           "History Auxiliary Identification", 
           "History Original Value"
          } ; 
        hist_t history:references =
          {"",
           "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html#PRC",
           "",
           "",
           "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html#PC_HIST",
           "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html#PC_PARM",
           "",
           ""
          } ;
        hist_t history:comment =
          {"",
           "Identifies the procedure through which the data passed.",
           "Identifies the version of the software through which the data passed.",
           "Records the date as YYYYMMDD that this history record was created.",
           "Identifies the action taken against the data by the software.",
           "Identifies the measured variable affected by the action.",
           "Normally this is the depth at which the value of a variable was acted upon by the software.",
           "The original value before being acted upon by software."
          } ;
 data:
   history = {
        {"IDENTCODE","PRCCODE","VERSION","PRCDAT","ACTCODE", "ACTPARM",
         "AUXID", "OVALUE"},
        {"IDENTCODE","PRCCODE","VERSION","PRCDAT","ACTCODE", "ACTPARM",
         "AUXID", "OVALUE"},
          ...
       }

The capitalized tokens above, such as IDENTCODE, should be replaced by
the actual corresponding data for the history variable members.
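
To illustrate how generic software might use the positional
association in the CDL above, here is a minimal Python sketch.  It
uses plain lists rather than a netCDF library, and the helper name
member_attributes is hypothetical, not part of any netCDF API.  It
simply pairs the nth member name with the nth value of each attribute
and treats an empty string as "no value", which is an assumption about
how the convention would be read:

```python
# Member names of the compound type hist_t, in declaration order.
MEMBERS = ["identcode", "prc_code", "version", "prcdate",
           "actcode", "actparm", "auxid", "ovalue"]

# Attribute values listed in the same order as the members.
LONG_NAME = [
    "History Identification Code",
    "History Processing Code",
    "History Processing Version",
    "History Processing Date",
    "History Action Code",
    "History Action Parm",
    "History Auxiliary Identification",
    "History Original Value",
]
BASE = "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html"
REFERENCES = ["", BASE + "#PRC", "", "",
              BASE + "#PC_HIST", BASE + "#PC_PARM", "", ""]


def member_attributes(members, **attributes):
    """Hypothetical helper: map each member to its attributes by position.

    Empty strings are treated as "attribute not set" (an assumption).
    """
    result = {name: {} for name in members}
    for att_name, values in attributes.items():
        if len(values) != len(members):
            raise ValueError(
                f"{att_name}: expected {len(members)} values, got {len(values)}")
        for name, value in zip(members, values):
            if value != "":
                result[name][att_name] = value
    return result


atts = member_attributes(MEMBERS, long_name=LONG_NAME, references=REFERENCES)
print(atts["prcdate"]["long_name"])    # History Processing Date
print("references" in atts["prcdate"]) # False
```

The length check matters because the association is purely by
counting: an attribute with a missing value would silently shift every
later member otherwise.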

If not all the members of the hist_t compound type are strings
(e.g. maybe the version is float and identcode and auxid are ints),
then the representation gets more complex, requiring two user-defined
types, one for the variable data and one for the string attributes:

 types:
   compound hist_t {
     int identcode ;
     string prc_code ;
     float version ;
     string prcdate ;
     string actcode ;
     string actparm ;
     int auxid ;
     string ovalue ;
   };
   compound hist_att_t {
     string identcode ;
     string prc_code ;
     string version ;
     string prcdate ;
     string actcode ;
     string actparm ;
     string auxid ;
     string ovalue ;
   };

and all the attributes above are now of type "hist_att_t" instead of
hist_t, with values of the variable something like:

 data:
   history = {
        {12345,"PRCCODE",1.1,"PRCDAT","ACTCODE", "ACTPARM", 54321, "OVALUE"},
        {67890,"PRCCODE",2.0,"PRCDAT","ACTCODE", "ACTPARM", 9876, "OVALUE"},
          ...
       }
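
As a rough analogy for the two-type arrangement above, the following
Python sketch models hist_t and hist_att_t as two NamedTuple classes
with the same field names but different field types.  The class names
Hist and HistAtt and the helper describe are hypothetical, chosen only
for this illustration; nothing here calls a netCDF library:

```python
from typing import NamedTuple


class Hist(NamedTuple):
    """Mirrors hist_t: the variable data, with mixed member types."""
    identcode: int
    prc_code: str
    version: float
    prcdate: str
    actcode: str
    actparm: str
    auxid: int
    ovalue: str


class HistAtt(NamedTuple):
    """Mirrors hist_att_t: same members, but every one is a string."""
    identcode: str
    prc_code: str
    version: str
    prcdate: str
    actcode: str
    actparm: str
    auxid: str
    ovalue: str


# The two types must declare the same members in the same order,
# otherwise positional association between them breaks down.
assert Hist._fields == HistAtt._fields

long_name = HistAtt(
    "History Identification Code", "History Processing Code",
    "History Processing Version", "History Processing Date",
    "History Action Code", "History Action Parm",
    "History Auxiliary Identification", "History Original Value",
)
record = Hist(12345, "PRCCODE", 1.1, "PRCDAT",
              "ACTCODE", "ACTPARM", 54321, "OVALUE")


def describe(rec, names):
    """Hypothetical helper: label each value with its member's long_name."""
    return {names[i]: value for i, value in enumerate(rec)}


labelled = describe(record, long_name)
print(labelled["History Processing Version"])  # 1.1
```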

On the other hand, using Groups you could store each history in a
named Group such as "Hist0", "Hist1", etc., using the number to encode
the "num_hist" dimension in your example.  Note that you can't treat
the Groups as an array and access the nth one, as you can with
variables and attributes; you just have to encode the index in the
Group name.
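
Because the index lives only in the Group name, any reader has to
parse it back out.  A minimal Python sketch of that step follows; it
works on a plain list of names rather than an open netCDF file, and
the "Hist<number>" pattern and both function names are assumptions
made for illustration:

```python
import re


def hist_index(group_name):
    """Return the integer encoded in a name like 'Hist12', else None."""
    match = re.fullmatch(r"Hist(\d+)", group_name)
    return int(match.group(1)) if match else None


def ordered_hist_groups(group_names):
    """Return the history group names sorted by their encoded index."""
    indexed = [(hist_index(name), name) for name in group_names]
    return [name for _, name in sorted(p for p in indexed if p[0] is not None)]


names = ["Hist10", "Hist2", "metadata", "Hist0"]
print(ordered_hist_groups(names))  # ['Hist0', 'Hist2', 'Hist10']
```

Sorting on the parsed integer rather than the raw string avoids
placing "Hist10" before "Hist2".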

An organization with Groups, one for each history, might look something like

 group: Hist0 {   // history metadata corresponding to num_hist=0
 variables:
     int identcode ;
        identcode:long_name = "History Identification Code" ;
     string prccode ;
         prccode:long_name = "History Processing Code" ;
         prccode:references = "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html#PRC" ;
         prccode:comment = "Identifies the procedure through which the data passed." ;
     float version ;
         version:long_name = "History Processing Version" ;
         version:comment = "Identifies the version of the software through which the data passed." ;
     string prcdate ;
         // etc. for other attributes ...
     string actcode ;
     string actparm ;
     int auxid ;
     string ovalue ;
 data:
     identcode = 12345;
     prccode = "PRCCODE0";
  ...
 }
 group: Hist1 {   // history metadata corresponding to num_hist=1
 variables:
     int identcode ;
        identcode:long_name = "History Identification Code" ;
     string prccode ;
         prccode:long_name = "History Processing Code" ;
         prccode:references = "http://www.nodc.noaa.gov/GTSPP/document/codetbls/gtsppcode.html#PRC" ;
         prccode:comment = "Identifies the procedure through which the data passed." ;
     float version ;
         version:long_name = "History Processing Version" ;
         version:comment = "Identifies the version of the software through which the data passed." ;
     string prcdate ;
         // etc. for other attributes ...
     string actcode ;
     string actparm ;
     int auxid ;
     string ovalue ;
 data:
     identcode = 67890;
     prccode = "PRCCODE1";
  ...
 }

... etc. for as many history groups as you have.  This is a little
closer to the netCDF-3 use of attributes and may be easier for humans
to follow, but it's arguably harder for generic software to figure out
than the example with compound structures, because it relies on
specific/local Group name conventions.  The compound-type approach
relies on association by counting which value is associated with which
structure member, and computers are good at that.

So I think I agree with you that netCDF-4 compound types might be a
better representation, because the necessary conventions are already
available (though not approved by CF).

--Russ

Russ Rew                                         UCAR Unidata Program
address@hidden                      http://www.unidata.ucar.edu



Ticket Details
===================
Ticket ID: EAX-879650
Department: Support netCDF
Priority: Normal
Status: Closed