With regard to WODB, users want to be able to search not just on
physical coordinates, such as a lat-lon bounding box, but also on other
attributes such as institution, project, and platform, which are usually
not represented as coordinate variables. Grouping the data by physical
coordinates might therefore cause performance problems when searching on
other criteria. Would it be possible to have some additional structures,
something like a database index, but without a full-fledged DBMS? Does
TDS allow such structures?
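
To make the "database index without a DBMS" idea concrete, here is a
minimal sketch of an in-memory inverted index over non-coordinate
attributes. It is only an illustration: the function names, the record
layout, and the attribute values are all made up, and nothing here is an
existing TDS feature.

```python
from collections import defaultdict


def build_index(records, fields):
    """Map each (field, value) pair to the set of record positions holding it."""
    index = defaultdict(set)
    for pos, rec in enumerate(records):
        for field in fields:
            if field in rec:
                index[(field, rec[field])].add(pos)
    return index


def lookup(index, **criteria):
    """Return sorted positions of records that match every criterion."""
    sets = [index.get((field, value), set()) for field, value in criteria.items()]
    if not sets:
        return []
    return sorted(set.intersection(*sets))


# Hypothetical per-profile metadata; the values are placeholders.
profiles = [
    {"institution": "inst_a", "platform": "ctd", "project": "proj_x"},
    {"institution": "inst_a", "platform": "xbt", "project": "proj_y"},
    {"institution": "inst_b", "platform": "ctd", "project": "proj_x"},
    {"institution": "inst_a", "platform": "ctd", "project": "proj_y"},
]

index = build_index(profiles, ["institution", "platform", "project"])
hits = lookup(index, institution="inst_a", platform="ctd")
```

The positions in `hits` could then be used to read only the matching
profiles, however the data happen to be grouped on disk. A real index
would also need to be persisted alongside the files and kept in sync
with them.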
Roy Mendelssohn wrote:
Since the UAF meeting in Seattle I have been giving some thought to
how to serve some large, important datasets, such as the raw ICOADS
observations or the WODB observations. I have been reading over the
PointObservation Conventions proposal on the CF site. While the
proposal makes it clear how I might put data into a netcdf file, it
doesn't make clear what the interplay might be with a service in TDS,
or how a possible service might be affected by a very large dataset
without further structure.
So it seems pretty clear that the ICOADS data would be points. From the
proposal:

  dimensions:
    obs = 1234 ;

  variables:
    double time(obs) ;
      time:long_name = "time of measurement" ;
      time:units = "days since 1970-01-01 00:00:00" ;
    float lon(obs) ;
      lon:long_name = "longitude of the observation" ;
      lon:units = "degrees_east" ;
    float lat(obs) ;
      lat:long_name = "latitude of the observation" ;
      lat:units = "degrees_north" ;
    float alt(obs) ;
      alt:long_name = "vertical distance above the surface" ;
      alt:standard_name = "height" ;
      alt:units = "m" ;
      alt:positive = "up" ;
      alt:axis = "Z" ;

    float humidity(obs) ;
      humidity:long_name = "specific humidity" ;
      humidity:coordinates = "time lat lon alt" ;
    float temp(obs) ;
      temp:long_name = "temperature" ;
      temp:units = "Celsius" ;
      temp:coordinates = "time lat lon alt" ;

    :CF\:featureType = "point" ;

Now I am assuming that in a TDS implementation of a service, I will be
able to select on the coordinate variables, is that correct? Even so,
for something like ICOADS, obs is quite large, and that extract could
be quite slow unless either there is additional structure or the TDS
pre-fetches the coordinate variables, much as the present Dapper server
does.
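
As a rough illustration of selecting on pre-fetched coordinate
variables, the sketch below scans in-memory copies of lat, lon, and time
and returns the matching positions along obs. The function name and the
sample values are made up, and this is not how TDS or Dapper actually
implement it.

```python
def select_bbox(lat, lon, time, south, north, west, east, t0, t1):
    """Return indices along obs that fall inside the box and the time range."""
    return [
        i
        for i, (la, lo, t) in enumerate(zip(lat, lon, time))
        if south <= la <= north and west <= lo <= east and t0 <= t <= t1
    ]


# Made-up values standing in for pre-fetched coordinate variables.
lat = [10.5, 35.2, 36.0, -12.0]
lon = [-150.0, -122.1, -121.9, 45.0]
time = [100.0, 200.0, 300.0, 400.0]  # days since 1970-01-01

idx = select_bbox(lat, lon, time, 30.0, 40.0, -125.0, -120.0, 0.0, 250.0)
```

Every request still touches every element of obs, so the cost grows
linearly with the number of observations; that is the reason to ask
about additional structure. The sketch also ignores bounding boxes that
cross the dateline.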
Another option would be to have, say, a file for each 10-degree block,
and then have TDS aggregate over the files. Would this be possible?
The search would then be a lot faster when people want time series in a
region, as opposed to more synoptic extractions. Would the TDS service
support such an option? Or, as netcdf-4 supports groups, one could
have 10-degree groups with 2-degree subgroups. That would work as far
as netcdf-4 is concerned, but that is not the same as TDS knowing what
to do with the hierarchy or how to take advantage of the structure.
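
The 10-degree block idea can be sketched as follows: map each point to
the block that contains it, and map a requested bounding box to the list
of blocks (and hence files) that need to be opened. The function names
and the file naming scheme are hypothetical, and the sketch does not
handle boxes that cross the dateline.

```python
import math

BLOCK = 10  # block size in degrees


def block_key(lat, lon, size=BLOCK):
    """Return the south-west corner (lat, lon) of the block containing a point."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


def blocks_for_bbox(south, north, west, east, size=BLOCK):
    """List the south-west corners of every block the bounding box touches."""
    lat0, lon0 = block_key(south, west, size)
    lat1, lon1 = block_key(north, east, size)
    return [
        (la, lo)
        for la in range(lat0, lat1 + size, size)
        for lo in range(lon0, lon1 + size, size)
    ]


def block_filename(key):
    """Hypothetical naming scheme for the per-block files."""
    return "block_lat%+03d_lon%+04d.nc" % key


# A request around Monterey Bay touches only a handful of blocks.
needed = [block_filename(k) for k in blocks_for_bbox(28, 41, -125, -118)]
```

A regional time-series request then opens only the files in `needed`,
while a global synoptic request still has to open every block. The same
key function could name 10-degree groups and, with `size=2`, their
2-degree subgroups.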
My questions for Profiles (that is, for the WODB) are pretty much the
same. I assume that the TDS service will be able to search on the
coordinate variables, is that correct? And I have the same issue: the
profile dimension will get quite large, and without further structure
the search could be very slow. Adding the same types of structures
mentioned above would provide possible solutions, but only if TDS, as
opposed to netcdf-4, supported them.
As you may have guessed, these are not theoretical questions - I would
really like to see ICOADS and WODB served as part of the year 2 UAF
effort. So now is a good time to start thinking about how to do it
correctly and what the service will be able to do.
"The contents of this message do not reflect any position of the U.S.
Government or NOAA."
Supervisory Operations Research Analyst
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097
e-mail: Roy.Mendelssohn@xxxxxxxx
(Note new e-mail address)
"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"