Re: NSDL Metadata for THREDDS dataset

  • Subject: Re: NSDL Metadata for THREDDS dataset
  • From: Ted Habermann <Ted.Habermann@xxxxxxxx>
  • Date: Wed, 26 Dec 2001 15:09:07 -0700
Hello all,

As a data provider I must admit that I am somewhat alarmed by the potential
for having to provide multiple metadata representations for thousands
(millions) of datasets. I noted in John Weatherley's seminar on OAI metadata
harvesting that the original source materials were DCXML files (see
http://dublincore.org/documents/2000/07/14/dcmes-xml/ for a discussion and
DTD). I could easily imagine a situation where I had to create and maintain
these files and a parallel set for FGDC representations. This is, of course,
relatively straightforward in a world of static metadata. DC seems much more
static than FGDC, so maybe this is not a huge problem. In a dynamic metadata
situation, where data providers, data managers, or data processing systems
are interacting with the metadata on essentially random time schedules, this
seems like it could turn into a massive file management headache. BTW, John
Caron's seminar suggested that I was going to need a bunch of other XML files
hanging around to define collections. This only adds to the problem.

My approach to avoiding this problem is to try to produce multiple metadata
representations from a single source (in my case a relational database). The
content of that database is essentially FGDC, although I expect that it will
soon migrate to ISO 19115. What's important about this is that it is a
"fatter" standard (it has more stuff). The desire to have more stuff is what
led me to agree with Jeff's earlier e-mail suggesting that it might be
difficult to recover from starting small. In that case, the problem Jeff and
Stefano have discussed becomes one of revealing different subsets of
information from the database in response to different requests.
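
A minimal sketch of the "different subsets for different requests" idea,
assuming a single record keyed by FGDC element number. The record values, the
view names ("brief", "full"), and the reveal() helper are all hypothetical,
chosen only for illustration; a real system would read the record from the
database.

```python
# Hypothetical single-source record, keyed by FGDC element number.
# The values are placeholders, not real metadata.
RECORD = {
    "1.1.8.4": "Example data set title",
    "1.1.8.1": "Example originator",
    "1.2.1": "Example abstract",
    "1.7": "None",
    "1.8": "None",
}

# Hypothetical named views: each kind of request sees a different subset.
VIEWS = {
    "brief": ["1.1.8.4", "1.1.8.1"],
    "full": list(RECORD),
}


def reveal(record, view):
    """Return only the elements that the named view exposes."""
    return {key: record[key] for key in VIEWS[view] if key in record}
```

The point of the design is that the views are cheap to add or change, while
the record itself is maintained in exactly one place.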

In any case, I was driven to explore the DC-FGDC crosswalk in the hope that I
could easily create DC from FGDC (what the heck, it's the day after Christmas
and I'm at work!). I was interested to see that this crosswalk was not
referenced in the big list of crosswalks
(http://www.ukoln.ac.uk/metadata/interoperability/). Is there an obvious
reason for that? My initial efforts are in the attached file. It looks to me
like this crosswalk is rather straightforward. The most serious omission is
the identifier field. As far as I know, FGDC does not include this concept,
unfortunately. It could be added as an extension. I also think OGC is working
on an interesting approach to unique identifiers.
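
As a rough illustration of creating DC from FGDC, the attached crosswalk
table can be written down as a lookup. This is only a sketch: the element
numbers are transcribed from the table, but the flat dictionary layout for an
FGDC record and the fgdc_to_dc() helper are assumptions made for the example.

```python
# DC element -> FGDC element numbers, transcribed from the attached table.
DC_FROM_FGDC = {
    "title": ["1.1.8.4"],
    "creator": ["1.1.8.1"],
    "subject": ["1.6.1.2", "1.6.3.2"],
    "description": ["1.2.1", "1.2.2", "1.2.3"],
    "contributor": ["2.5.1"],
    "publisher": ["1.1.8.8.2", "6.1"],
    "date": ["1.1.8.2"],
    "type": ["8.6"],
    "format": ["6.4.2.1.1"],
    "identifier": [],  # no FGDC counterpart in the draft crosswalk
    "source": ["1.1.8.11"],
    "language": ["1.2.3"],
    # Loose mapping: the table places some relations in lineage (2.5)
    # and others in the Larger Work Citation (1.1.8.11).
    "relation": ["2.5", "1.1.8.11"],
    "coverage": ["1.6.2.2", "1.6.4.2"],
    "rights": ["1.7", "1.8"],
}


def fgdc_to_dc(fgdc):
    """Build a DC record from a flat {element number: text} FGDC record.

    DC elements with no available FGDC value are left out entirely.
    """
    dc = {}
    for element, sources in DC_FROM_FGDC.items():
        values = [fgdc[number] for number in sources if number in fgdc]
        if values:
            dc[element] = values
    return dc
```

With this layout the identifier element can never be produced, which is the
omission noted above.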

This crosswalk may raise some interesting questions about the list of
metadata elements Ben presented
(http://www.smete.org/nsdl/workgroups/standards/current_element_set.html;
BTW, the definition of the identifier element is broken in that list). When
one follows the crosswalk to FGDC land, one often lands in the middle of a
section that has a bunch of required elements that are not included in DC.
This, of course, makes going from DC to FGDC impossible, but it raises the
question of whether NSDL might want to beef up this list. What good are
keywords from a controlled vocabulary if you don't know what controlled
vocabulary it is? Or identifiers from a specific context if you don't know
what context it is?
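
A small sketch of the keyword problem, assuming two hypothetical vocabularies
("Vocabulary A" and "Vocabulary B") that both contain the term "temperature".
The Keyword class and both helper functions are illustrative names, not part
of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Keyword:
    term: str
    vocabulary: str  # name of the controlled vocabulary the term comes from


def to_plain_subjects(keywords):
    """Flatten to bare terms, dropping the vocabulary names."""
    return [k.term for k in keywords]


def to_qualified_subjects(keywords):
    """Keep each term paired with the vocabulary it came from."""
    return [(k.vocabulary, k.term) for k in keywords]


keywords = [
    Keyword("temperature", "Vocabulary A"),
    Keyword("temperature", "Vocabulary B"),
]
```

Once flattened, the two keywords are indistinguishable, and nothing in the
plain list says which vocabulary's definition of the term applies.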

I am a real neophyte in this business, so I could be making some simple
errors. In any case, it is also a rough draft!

Happy New Year to all!
Ted Habermann

<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Excel.Sheet>
<meta name=Generator content="Microsoft Excel 10">
<link rel=File-List href="Crosswalks_files/filelist.xml">
<style id="Crosswalks_6945_Styles">
<!--table
        {mso-displayed-decimal-separator:"\.";
        mso-displayed-thousand-separator:"\,";}
.xl156945
        {padding-top:1px;
        padding-right:1px;
        padding-left:1px;
        mso-ignore:padding;
        color:windowtext;
        font-size:10.0pt;
        font-weight:400;
        font-style:normal;
        text-decoration:none;
        font-family:Arial;
        mso-generic-font-family:auto;
        mso-font-charset:0;
        mso-number-format:General;
        text-align:general;
        vertical-align:bottom;
        mso-background-source:auto;
        mso-pattern:auto;
        white-space:nowrap;}
.xl226945
        {padding-top:1px;
        padding-right:1px;
        padding-left:1px;
        mso-ignore:padding;
        color:windowtext;
        font-size:12.0pt;
        font-weight:700;
        font-style:normal;
        text-decoration:none;
        font-family:"Times New Roman", serif;
        mso-font-charset:0;
        mso-number-format:General;
        text-align:center;
        vertical-align:bottom;
        border:.5pt solid black;
        background:silver;
        mso-pattern:auto none;
        white-space:normal;}
.xl236945
        {padding-top:1px;
        padding-right:1px;
        padding-left:1px;
        mso-ignore:padding;
        color:windowtext;
        font-size:10.0pt;
        font-weight:400;
        font-style:normal;
        text-decoration:none;
        font-family:Arial;
        mso-generic-font-family:auto;
        mso-font-charset:0;
        mso-number-format:General;
        text-align:general;
        vertical-align:top;
        mso-background-source:auto;
        mso-pattern:auto;
        white-space:normal;}
.xl246945
        {padding-top:1px;
        padding-right:1px;
        padding-left:1px;
        mso-ignore:padding;
        color:windowtext;
        font-size:12.0pt;
        font-weight:700;
        font-style:normal;
        text-decoration:none;
        font-family:"Times New Roman", serif;
        mso-font-charset:0;
        mso-number-format:General;
        text-align:center;
        vertical-align:top;
        border:.5pt solid black;
        background:silver;
        mso-pattern:auto none;
        white-space:normal;}
.xl256945
        {padding-top:1px;
        padding-right:1px;
        padding-left:1px;
        mso-ignore:padding;
        color:windowtext;
        font-size:10.0pt;
        font-weight:400;
        font-style:normal;
        text-decoration:none;
        font-family:Arial;
        mso-generic-font-family:auto;
        mso-font-charset:0;
        mso-number-format:General;
        text-align:general;
        vertical-align:top;
        mso-background-source:auto;
        mso-pattern:auto;
        white-space:nowrap;}
-->
</style>
<title>DC- FGDC Crosswalk</title>
</head>

<body>
<!--[if !excel]>&nbsp;&nbsp;<![endif]-->
<!--The following information was generated by Microsoft Excel's Publish as Web
Page wizard.-->
<!--If the same item is republished from Excel, all information between the DIV
tags will be replaced.-->
<!----------------------------->
<!--START OF OUTPUT FROM EXCEL PUBLISH AS WEB PAGE WIZARD -->
<!----------------------------->

<div id="Crosswalks_6945" align=center x:publishsource="Excel">

<h1 style='color:black;font-family:Arial;font-size:14.0pt;font-weight:800;
font-style:normal'>DC- FGDC Crosswalk</h1>

<table x:str border cellpadding=0 cellspacing=0 width=958 
style='border-collapse:
 collapse;table-layout:fixed;width:719pt'>
 <col width=425 
style='mso-width-source:userset;mso-width-alt:15542;width:319pt'>
 <col class=xl256945 width=533 style='mso-width-source:userset;mso-width-alt:
 19492;width:400pt'>
 <tr height=21 style='height:15.75pt'>
  <td height=21 class=xl226945 width=425 style='height:15.75pt;width:319pt'>DC
  Element</td>
  <td class=xl246945 width=533 style='border-left:none;width:400pt'>FGDC
  Element</td>
 </tr>
 <tr height=17 style='height:12.75pt'>
  <td height=17 class=xl236945 width=425 
style='height:12.75pt;width:319pt'>Title:
  A name given to the resource.</td>
  <td class=xl256945>1.1.8.4 Title -- the name by which the data set is 
known.</td>
 </tr>
 <tr height=51 style='height:38.25pt'>
  <td height=51 class=xl236945 width=425 
style='height:38.25pt;width:319pt'>Creator:
  An entity primarily responsible for making the content of the resource.</td>
  <td class=xl236945 width=533 style='width:400pt'>1.1.8.1 Originator -- the
  name of an organization or individual that developed the data set. If the
  name of editors or compilers are provided, the name must be followed by
  &quot;(ed.)&quot; or &quot;(comp.)&quot; respectively.</td>
 </tr>
 <tr height=34 style='height:25.5pt'>
  <td height=34 class=xl236945 width=425 
style='height:25.5pt;width:319pt'>Subject:
  Typically, a Subject will be expressed as keywords, key phrases or
  classification codes that describe a topic of the resource.</td>
  <td class=xl236945 width=533 style='width:400pt'>1.6.1.2 Theme Keywords<br>
    1.6.3.2 Stratum Keywords</td>
 </tr>
 <tr height=51 style='height:38.25pt'>
  <td height=51 class=xl236945 width=425 
style='height:38.25pt;width:319pt'>Description:
  An account of the content of the resource.</td>
  <td class=xl236945 width=533 style='width:400pt'>1.2.1 Abstract -- a brief
  narrative summary of the data set.<br>
    1.2.2 Purpose -- a summary of the intentions with which the data set was
  developed.<br>
    1.2.3 Supplemental Information -- other descriptive information about the
  data set.</td>
 </tr>
 <tr height=34 style='height:25.5pt'>
  <td height=34 class=xl236945 width=425 
style='height:25.5pt;width:319pt'>Contributor:
  An entity responsible for making contributions to the content of the
  resource.</td>
  <td class=xl236945 width=533 style='width:400pt'>2.5.1 Source Information --
  list of sources and a short discussion of the information contributed by
  each.</td>
 </tr>
 <tr height=51 style='height:38.25pt'>
  <td height=51 class=xl236945 width=425 
style='height:38.25pt;width:319pt'>Publisher:
  An entity responsible for making the resource available</td>
  <td class=xl236945 width=533 style='width:400pt'>1.1.8.8.2 Publisher -- the
  name of the individual or organization that published the data set.<br>
    6.1 Distributor -- the party from whom the data set may be obtained.</td>
 </tr>
 <tr height=34 style='height:25.5pt'>
  <td height=34 class=xl236945 width=425 style='height:25.5pt;width:319pt'>Date:
  A date associated with an event in the life cycle of the resource.</td>
  <td class=xl236945 width=533 style='width:400pt'>1.1.8.2 Publication Date --
  the date when the data set is published or otherwise made available for
  release.</td>
 </tr>
 <tr height=34 style='height:25.5pt'>
  <td height=34 class=xl236945 width=425 style='height:25.5pt;width:319pt'>Type:
  The nature or genre of the content of the resource.</td>
  <td class=xl236945 width=533 style='width:400pt'>8.6 Geospatial Data
  Presentation Form -- the mode in which the geospatial data are represented.
  Potential Controlled Vocabulary Problems.</td>
 </tr>
 <tr height=17 style='height:12.75pt'>
  <td height=17 class=xl236945 width=425 
style='height:12.75pt;width:319pt'>Format:
  The physical or digital manifestation of the resource.</td>
  <td class=xl236945 width=533 style='width:400pt'>6.4.2.1.1 Format Name -- the
  name of the data transfer format.</td>
 </tr>
 <tr height=34 style='height:25.5pt'>
  <td height=34 class=xl236945 width=425 
style='height:25.5pt;width:319pt'>Identifier:
  An unambiguous reference to the resource within a given context.</td>
  <td class=xl236945 width=533 style='width:400pt'></td>
 </tr>
 <tr height=34 style='height:25.5pt'>
  <td height=34 class=xl236945 width=425 
style='height:25.5pt;width:319pt'>Source:
  A Reference to a resource from which the present resource is derived.</td>
  <td class=xl236945 width=533 style='width:400pt'>1.1.8.11 Larger Work
  Citation -- the information identifying a larger work in which the data set
  is included.</td>
 </tr>
 <tr height=17 style='height:12.75pt'>
  <td height=17 class=xl236945 width=425 
style='height:12.75pt;width:319pt'>Language:
  A language of the intellectual content of the resource.</td>
  <td class=xl236945 width=533 style='width:400pt'>1.2.3 Supplemental
  Information -- other descriptive information about the data set.</td>
 </tr>
 <tr height=68 style='height:51.0pt'>
  <td height=68 class=xl236945 width=425 
style='height:51.0pt;width:319pt'>Relation:
  A reference to a related resource. These relations are expressed by
  qualifiers: IsVersionOf, HasVersion, IsReplacedBy, Replaces, Requires,
  IsPartOf, HasPart, IsReferencedBy, IsFormatOf, HasFormat</td>
  <td class=xl236945 width=533 style='width:400pt'>Many of these relations
  could end up being expressed as parts of a lineage chain in section 2.5.
  Others would be expressed as part of the Larger Work Citation (1.1.8.11).</td>
 </tr>
 <tr height=68 style='height:51.0pt'>
  <td height=68 class=xl236945 width=425 
style='height:51.0pt;width:319pt'>Coverage:
  The extent or scope of the content of the resource. Coverage will typically
  include spatial location (a place name or geographic coordinates), temporal
  period (a period label, date, or date range) or jurisdiction (such as a named
  administrative entity).</td>
  <td class=xl236945 width=533 style='width:400pt'>1.6.2.2 Place Keywords<br>
    1.6.4.2 Temporal Keywords</td>
 </tr>
 <tr height=119 style='height:89.25pt'>
  <td height=119 class=xl236945 width=425 style='height:89.25pt;width:319pt'
  x:str="Rights: ">Rights:<span style='mso-spacerun:yes'> </span></td>
  <td class=xl236945 width=533 style='width:400pt'>1.7 Access Constraints --
  restrictions and legal prerequisites for accessing the data set. These
  include any access constraints applied to assure the protection of privacy or
  intellectual property, and any special restrictions or limitations on
  obtaining the data set.<br>
    1.8 Use Constraints -- restrictions and legal prerequisites for using the
  data set after access is granted. These include any use constraints applied
  to assure the protection of privacy or intellectual property, and any special
  restrictions or limitations on using the data set.</td>
 </tr>
  <![if supportMisalignedColumns]>
 <tr height=0 style='display:none'>
  <td width=425 style='width:319pt'></td>
  <td width=533 style='width:400pt'></td>
 </tr>
 <![endif]>
</table>

</div>


<!----------------------------->
<!--END OF OUTPUT FROM EXCEL PUBLISH AS WEB PAGE WIZARD-->
<!----------------------------->
</body>

</html>