Re: [thredds] TDS initialization


On 01/04/2011 10:39 AM, Roy Mendelssohn wrote:
Hi Roland:

Before I respond, can you give a rough idea of what your metric for "big" is (or 
how many dataset elements and catalogRefs)?

I did some counts. As best I can tell from the various logs, it takes TDS about 2 seconds to get up and running with the GEOIDE catalog. The counter, which recursively walks the THREDDS datasets and counts those that can be accessed, says this:

    4087 total data sets found.
    2197 have access.

There is some hierarchy in the base catalog, but all of the data sets are in catalogs which are included in the main catalog via about a dozen catalogRefs. None of the catalogRefs refer to remote catalogs over http.
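
For readers who want comparable numbers for their own catalogs, a counter of this kind can be sketched roughly as follows. This is not the code used for the counts above. It assumes that a dataset "has access" when it carries a urlPath attribute or a direct access child element, and load_ref is a hypothetical callback that returns the XML text of a referenced catalog.

```python
import xml.etree.ElementTree as ET

# Namespaces used by THREDDS InvCatalog 1.0 documents.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

DATASET_TAG = f"{{{THREDDS_NS}}}dataset"
ACCESS_TAG = f"{{{THREDDS_NS}}}access"
CATALOG_REF_TAG = f"{{{THREDDS_NS}}}catalogRef"
HREF_ATTR = f"{{{XLINK_NS}}}href"


def count_datasets(catalog_xml, load_ref=None):
    """Return (total, accessible) dataset counts for one catalog document.

    load_ref is a hypothetical callback mapping a catalogRef href to the XML
    text of the referenced catalog. When it is None, catalogRefs are skipped.
    """
    root = ET.fromstring(catalog_xml)
    total = accessible = 0
    for elem in root.iter():
        if elem.tag == DATASET_TAG:
            total += 1
            # Assumption: urlPath or a direct <access> child means "has access".
            if elem.get("urlPath") is not None or elem.find(ACCESS_TAG) is not None:
                accessible += 1
        elif elem.tag == CATALOG_REF_TAG and load_ref is not None:
            # Recurse into the referenced catalog and add its counts.
            sub_total, sub_accessible = count_datasets(
                load_ref(elem.get(HREF_ATTR)), load_ref
            )
            total += sub_total
            accessible += sub_accessible
    return total, accessible


# Made-up sample catalogs, for illustration only.
SAMPLE = f"""<catalog xmlns="{THREDDS_NS}" xmlns:xlink="{XLINK_NS}">
  <dataset name="top">
    <dataset name="a" urlPath="data/a.nc"/>
    <dataset name="b"/>
    <catalogRef xlink:title="more" xlink:href="sub.xml"/>
  </dataset>
</catalog>"""

SUB = f"""<catalog xmlns="{THREDDS_NS}">
  <dataset name="c" urlPath="data/c.nc"/>
</catalog>"""
```

Calling count_datasets(SAMPLE) counts only the top-level document, while count_datasets(SAMPLE, load_ref=lambda href: SUB) also follows the catalogRef. A real load_ref would resolve the href relative to the parent catalog and read it from disk.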

Another example:

    4931 total data sets found.
    3882 have access.

There are no catalogRefs in this catalog. I don't know how long it takes the server to start, but I can find out.

For comparison, I ran the same counter code on this catalog.

    669 total data sets found.
    443 have access.



On Jan 4, 2011, at 8:35 AM, Roland Schweitzer wrote:

Thanks John.  Among the groups we collaborate with there are some folks that 
are quite concerned about the scaling issue.  Personally, my direct experience 
at this point indicates that the performance is just fine (at least so 
far) even with our largest catalogs.

What's the experience of the list?  Are folks seeing unacceptable TDS 
initialization times because of time spent reading catalogs (the thread from John 
Maurer about aggregation access issues notwithstanding)?


On 01/03/2011 07:34 PM, John Caron wrote:
On 1/3/2011 10:53 AM, Roland Schweitzer wrote:

We're starting to put together some "big" server-side configuration catalogs (both with 
"lots" of dataset elements and "lots" of catalogRef elements).  We are wondering about 
the process TDS goes through to read the catalog when it starts.  What gets cached?  Does it have a way to 
know a referenced catalog is unchanged?  When do referenced catalogs get scanned?  And so on.

Is there some documentation or a flow chart on how TDS initializes itself?


thredds mailing list
For list information or to unsubscribe,  visit:
Hi Roland:

The sad answer is there's not much documentation. We've been on the verge of 
redoing the initialization sequence for a few years now, so we've been waiting 
so we can document the clean, cool refactor instead of the crufty, lame current 

Anyway, the TDS reads in all the config catalogs at startup. It caches all of them, and 
uses the "expires" attribute on the catalog to decide if/when it needs to 
reread a catalog.  It needs to read all catalogs, including catalogRef, because it has to 
know what the possible dataset URLs are, and there is no contract that a client has to 
read a catalog first.
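
As an illustration of the expiry idea described above, here is a minimal sketch of a cache that rereads a catalog only once its "expires" time has passed. It is not the TDS implementation. CatalogCache and read_catalog are hypothetical names, and treating a missing "expires" value as "never reread" is an assumption made for the example.

```python
from datetime import datetime, timezone


class CatalogCache:
    """Hypothetical expiry-based cache keyed by catalog path."""

    def __init__(self, read_catalog):
        # read_catalog(path) -> (catalog, expires), where expires is a
        # timezone-aware datetime or None. The caller supplies this function.
        self._read = read_catalog
        self._entries = {}

    def get(self, path, now=None):
        """Return the cached catalog, rereading it if it has expired."""
        if now is None:
            now = datetime.now(timezone.utc)
        entry = self._entries.get(path)
        if entry is not None:
            catalog, expires = entry
            # Assumption: no expires value means the cached copy stays valid.
            if expires is None or now < expires:
                return catalog
        # First request, or the cached copy has expired: read it again.
        catalog, expires = self._read(path)
        self._entries[path] = (catalog, expires)
        return catalog
```

A server following the startup behavior described above would call get() for every config catalog at initialization, including those reached through catalogRef elements, so that all dataset URLs are known before any client request arrives.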

Obviously this doesn't scale forever. Ethan can probably fill in some details.



"The contents of this message do not reflect any position of the U.S. Government or 
Roy Mendelssohn
Supervisory Operations Research Analyst
Environmental Research Division
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: Roy.Mendelssohn@xxxxxxxx (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
