[thredds] TDS, CDM, Points and Profiles and large datasets

Hi John:

Since the UAF meeting in Seattle I have been giving some thought to how to 
serve some large, important datasets, such as the raw ICOADS observations or 
the WODB observations. The PointObservation Conventions proposal on the CF 
site makes it clear how I might put the data into a netcdf file, but it does 
not make clear how such a file would interact with a service in TDS, or how 
such a service might be affected by a very large dataset without further 
structure.

So it seems pretty clear that the ICOADS would be points.  From the example:

dimensions:
  obs = 1234 ;

variables:
  double time(obs) ;
    time:long_name = "time of measurement" ;
    time:units = "days since 1970-01-01 00:00:00" ;
  float lon(obs) ;
    lon:long_name = "longitude of the observation";
    lon:units = "degrees_east";
  float lat(obs) ;
    lat:long_name = "latitude of the observation" ;
    lat:units = "degrees_north" ;
  float alt(obs) ;
    alt:long_name = "vertical distance above the surface" ;
    alt:standard_name = "height" ;
    alt:units = "m";
    alt:positive = "up";
    alt:axis = "Z";

  float humidity(obs) ;
    humidity:long_name = "specific humidity" ;
    humidity:coordinates = "time lat lon alt" ;
  float temp(obs) ;
    temp:long_name = "temperature" ;
    temp:units = "Celsius" ;
    temp:coordinates = "time lat lon alt" ;

attributes:
  :CF\:featureType = "point";

Now I am assuming that in a TDS implementation of a service, I will be able to 
select on the coordinate variables, is that correct?  Even so, for something 
like ICOADS, obs is quite large, and that extract could be quite slow unless 
either there is additional structure or the TDS pre-fetches the coordinate 
variables, much as the present Dapper server does.
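
To make the cost concern concrete, here is a minimal sketch of a selection on the coordinate variables of a flat point file like the example above. It is not how TDS or Dapper implement anything; the function name select_points and the toy values are hypothetical, and plain Python lists stand in for the time, lat and lon variables. The point is only that, without further structure, every one of the obs records has to be examined.

```python
def select_points(time, lat, lon, t_range, lat_range, lon_range):
    """Return indices of observations inside the inclusive time/lat/lon bounds.

    Hypothetical helper: a linear scan over the whole obs dimension,
    which is the cost being worried about for a dataset the size of ICOADS.
    """
    hits = []
    for i in range(len(time)):
        if (t_range[0] <= time[i] <= t_range[1]
                and lat_range[0] <= lat[i] <= lat_range[1]
                and lon_range[0] <= lon[i] <= lon_range[1]):
            hits.append(i)
    return hits


# Toy stand-ins for the coordinate variables (values are made up).
time = [0.0, 1.0, 2.0, 3.0]          # days since 1970-01-01
lat = [10.0, 35.5, 36.0, -5.0]       # degrees_north
lon = [-120.0, -122.0, -121.5, 150.0]  # degrees_east

idx = select_points(time, lat, lon,
                    t_range=(0.5, 3.0),
                    lat_range=(30.0, 40.0),
                    lon_range=(-125.0, -120.0))
print(idx)
```

Pre-fetching the coordinate variables into memory makes each comparison cheap, but the number of comparisons still grows with obs.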

Another option would be to have, say, a file for each 10-degree block, and then 
have TDS aggregate over the files.  Would this be possible?  The search would 
then be a lot faster when people want time series in a region, as opposed to 
more synoptic extractions.  Would the TDS service support such an option?  Or, 
as netcdf-4 supports groups, I could have 10-degree groups with 2-degree 
subgroups.  That would work as far as netcdf-4 is concerned, but that is not 
the same as TDS knowing what to do with the hierarchy or taking advantage of 
the structure.
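
The 10-degree block idea can be sketched as a simple spatial index. This is only an illustration under stated assumptions: block_key, build_index, blocks_for_box and query_box are hypothetical names, an in-memory dictionary stands in for "one file (or group) per block", and longitude wrap-around at the dateline is not handled. A regional request then only opens the blocks that overlap the bounding box and applies the exact test to the observations inside them.

```python
import math
from collections import defaultdict

BLOCK = 10  # block size in degrees, as in the 10-degree blocks discussed above


def block_key(lat, lon):
    """Lower-left corner (lat, lon) of the BLOCK-degree box containing the point."""
    return (math.floor(lat / BLOCK) * BLOCK, math.floor(lon / BLOCK) * BLOCK)


def build_index(lats, lons):
    """Map each block key to the list of observation indices that fall in it."""
    index = defaultdict(list)
    for i, (la, lo) in enumerate(zip(lats, lons)):
        index[block_key(la, lo)].append(i)
    return index


def blocks_for_box(lat_min, lat_max, lon_min, lon_max):
    """Keys of all blocks overlapping the box (no dateline wrap-around)."""
    lat0, lon0 = block_key(lat_min, lon_min)
    lat1, lon1 = block_key(lat_max, lon_max)
    return [(la, lo)
            for la in range(lat0, lat1 + BLOCK, BLOCK)
            for lo in range(lon0, lon1 + BLOCK, BLOCK)]


def query_box(index, lats, lons, lat_min, lat_max, lon_min, lon_max):
    """Exact bounding-box test, restricted to candidates from overlapping blocks."""
    hits = []
    for key in blocks_for_box(lat_min, lat_max, lon_min, lon_max):
        for i in index.get(key, []):
            if lat_min <= lats[i] <= lat_max and lon_min <= lons[i] <= lon_max:
                hits.append(i)
    return sorted(hits)


# Toy coordinates (made up); each block would be one file or group in practice.
lats = [10.0, 35.5, 36.0, -5.0, 41.0]
lons = [-120.0, -122.0, -121.5, 150.0, -124.0]

index = build_index(lats, lons)
blocks = blocks_for_box(30.0, 40.0, -125.0, -120.0)
result = query_box(index, lats, lons, 30.0, 40.0, -125.0, -120.0)
print(len(blocks), result)
```

The trade-off is the one noted above: a regional time series touches only a few blocks, while a synoptic (global, single-time) extraction has to visit all of them.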

My questions for Profiles (that is, for the WODB) are pretty much the same.  I 
assume that the TDS service will be able to search on the coordinate variables, 
is that correct?  And I have the same issue: the profile dimension variable 
will get quite large, and without further structure the search could be very 
slow.  Adding the same types of structures mentioned above would provide 
possible solutions, but only if TDS, as opposed to netcdf-4, supported them.

As you may have guessed, these are not theoretical questions - I would really 
like to see ICOADS and WODB served as part of the year 2 UAF effort.  So now is 
a good time to start thinking about how to do it correctly and what the service 
will be able to do.

Thoughts?

Thanks,

-Roy

**********************
"The contents of this message do not reflect any position of the U.S. 
Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: Roy.Mendelssohn@xxxxxxxx (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected" 
