Metadata Extraction

Many datasets contain metadata that is useful for data discovery. Where such metadata exists, it can be extracted and fed directly into an enhanced catalog. Automatic metadata extraction generally works only for datasets that follow known conventions or have been mapped into a CDM scientific datatype.

The netCDF-java/CDM library includes code that extracts information from known conventions and datatypes and represents it in a THREDDS catalog. For example, some of the metadata for the NCEP model data on the motherlode server was extracted in this way. Here is some of the variable information from the NCEP GFS Alaska 191km model run:
<variables vocabulary="GRIB-1">
  <variable name="Absolute_vorticity" vocabulary_name="Absolute vorticity" units="1/s" vocabulary_id="1,7,2,41">Absolute vorticity @ isobaric2</variable>
  <variable name="Precipitable_water" vocabulary_name="Precipitable water" units="kg/m^2" vocabulary_id="1,7,2,54">Precipitable water @ entire_atmosphere</variable>
  <variable name="Pressure" vocabulary_name="Pressure" units="Pa" vocabulary_id="1,7,2,1">Pressure @ surface</variable>
  <variable name="Relative_humidity" vocabulary_name="Relative humidity" units="%" vocabulary_id="1,7,2,52">Relative humidity @ isobaric3</variable>
  <variable name="Surface_lifted_index" vocabulary_name="Surface lifted index" units="K" vocabulary_id="1,7,2,131">Surface lifted index @ surface</variable>
  <variable name="Temperature" vocabulary_name="Temperature" units="K" vocabulary_id="1,7,2,11">Temperature @ isobaric1</variable>
  <variable name="Total_precipitation" vocabulary_name="Total precipitation" units="kg/m^2" vocabulary_id="1,7,2,61">Total precipitation @ surface</variable>
  <variable name="u_wind" vocabulary_name="u wind" units="m/s" vocabulary_id="1,7,2,33">u wind @ isobaric</variable>
  <variable name="v_wind" vocabulary_name="v wind" units="m/s" vocabulary_id="1,7,2,34">v wind @ isobaric</variable>
  ...
</variables>
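
To illustrate how a client might consume this kind of extracted metadata, the following is a minimal sketch that reads a variables element with Python's standard xml.etree.ElementTree module. It is not part of the netCDF-java/CDM library; the function name read_variables and the constant CATALOG_FRAGMENT are hypothetical names chosen for this example, and the fragment holds only three of the entries listed above. The sketch also assumes the elements are not namespace-qualified, as in the listing; in a complete catalog document the element lookups would likely need to account for the catalog's XML namespace.

```python
import xml.etree.ElementTree as ET

# Three entries copied from the listing above (assumption: no XML namespace).
CATALOG_FRAGMENT = """
<variables vocabulary="GRIB-1">
  <variable name="Pressure" vocabulary_name="Pressure" units="Pa" vocabulary_id="1,7,2,1">Pressure @ surface</variable>
  <variable name="Temperature" vocabulary_name="Temperature" units="K" vocabulary_id="1,7,2,11">Temperature @ isobaric1</variable>
  <variable name="u_wind" vocabulary_name="u wind" units="m/s" vocabulary_id="1,7,2,33">u wind @ isobaric</variable>
</variables>
"""


def read_variables(xml_text):
    """Return one dict per <variable> child of a <variables> element."""
    root = ET.fromstring(xml_text)
    # The vocabulary is declared once on the enclosing element.
    vocabulary = root.get("vocabulary")
    records = []
    for var in root.findall("variable"):
        records.append({
            "name": var.get("name"),
            "units": var.get("units"),
            "vocabulary": vocabulary,
            "vocabulary_id": var.get("vocabulary_id"),
            # The element text holds the human-readable description.
            "description": (var.text or "").strip(),
        })
    return records


if __name__ == "__main__":
    for rec in read_variables(CATALOG_FRAGMENT):
        print(f'{rec["name"]} [{rec["units"]}]: {rec["description"]}')
```

Each record keeps the variable name, its units, and the vocabulary identifier together, which is the information a search or discovery tool would typically index.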

NetCDF Attribute Convention for Data Discovery

Besides the known data conventions, a draft convention specifically for data discovery is also under development.