Re: [thredds] TDS initialization

Hi John, Roland, Ethan,

I'm bumping this thread back up. Apologies for leaving it dangling for some time.

I believe we may see a handful of TDS services of this scale (> 2M files) in support of the ESG IPCC/CMIP5 activity within the next 12 months or so.

I appreciate your assistance with addressing our large scale TDS needs.

Regards,

-Eric


John Caron wrote:
Hi Eric, Nathan:

On 1/4/2011 1:29 PM, Eric Nienhouse wrote:
Hi Roland, John, Ethan,

I'm sorry for not posting this to the thredds list, which I am happy to do. However, I thought I would raise this to you first as it relates to the TDS and performance in our environment here in CISL/VETS at NCAR.

I like to post to that group so others can follow along and see if it also applies to them, so I'm cc'ing there.

Thank you, John. I'm curious about other experiences with large scale TDS installations as well.


We're close to the 2 million file mark in one of our production ESG TDS servers (which supports www.earthsystemgrid.org). I can get you the specs on the machine running this service (a ~2 year old multi-core AMD box running CentOS). In our experience, it takes about 5 minutes to initialize the TDS from the underlying thredds catalogs. There are many catalog refs, all for local catalog files, which represent about 3200 datasets over ~2800 catalog files. (I can provide more detail if you would like. The service is at: tds.ucar.edu/thredds)

Could you send me a typical config catalog, so I get a sense of what you are doing?

An example of a base-level catalog can be found at the NCAR ESG data node:

http://tds.ucar.edu/thredds/esgcet/catalog.xml

I'll send you a typical config catalog for one of the catalogs referenced above by direct email.


This service requires ~30 GB of JVM memory to successfully initialize, which is a scalability concern for us.

yes indeed


We re-init the TDS often during a new data publication process. We find after some number of re-inits (likely 50 - 200) the TDS will re-initialize *very slowly*, often taking hours to re-init. I speculate this is due to memory resources and perhaps "perm gen" space with the tomcat / JVM process and/or GC thrashing.

yes, you need to restart Tomcat when/before that happens. Apparently, Tomcat 7 may be better, but we haven't tested yet.

This is my understanding as well. We're still running Tomcat 6, and will likely continue to do so for the foreseeable future. As soon as we have some experience w/ 7 we'll pass it along.

BTW, in the latest TDS 4.2 reinit is a little flaky, though I expect it will work for your case. Let me know if you see problems (besides the permgen problem).

How often do you reinit?

We reinit ~10 times a week.


We're anticipating at least double the number of files will be served at NCAR due to CMIP5 modeling efforts over the next 18 months.

We've considered some possible solutions to the eventual slow load, such as:

1) Restarting the TDS routinely.
2) "Partitioning" TDS instances, and thereby the files, over multiple processes or hosts.

We're curious, too, if there may be some tuning we could do w.r.t. the TDS that may help the situation (so far we've only increased JVM heap memory.) Do you have any initial recommendations?

At the moment we don't have any tuning for this, but I think a quick fix is to add the ability to not cache the catalogs, but read them each time, maybe by setting the "expires" attribute or adding a "cache" attribute. Better would be to use an LRU cache like ehcache, but that will take longer to implement.

Thanks for these insights. I'm interested in pursuing these - please let me know what we can do to help.

This won't help the startup time that much (it will help some), mostly the memory use.
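
For readers following along, the LRU idea mentioned above can be sketched roughly as follows. This is only an illustration in Python, not TDS or ehcache code: the class name `LRUCatalogCache`, the `capacity` parameter, and the `loader` callable are all hypothetical. It keeps a bounded number of parsed catalogs in memory and re-reads the rest on demand, which is why it would mainly bound memory use rather than shorten startup.

```python
from collections import OrderedDict


class LRUCatalogCache:
    """Hypothetical bounded cache: holds at most `capacity` parsed catalogs."""

    def __init__(self, capacity, loader):
        self.capacity = capacity
        self.loader = loader            # assumed callable: path -> parsed catalog
        self._entries = OrderedDict()   # ordered oldest-used to newest-used

    def get(self, path):
        if path in self._entries:
            # Cache hit: mark this catalog as the most recently used.
            self._entries.move_to_end(path)
            return self._entries[path]
        # Cache miss: read the catalog again instead of holding it forever.
        catalog = self.loader(path)
        self._entries[path] = catalog
        if len(self._entries) > self.capacity:
            # Evict the least recently used catalog to cap memory use.
            self._entries.popitem(last=False)
        return catalog
```

With a capacity well below the ~2800 catalog files, rarely requested catalogs would be re-read on demand while frequently requested ones stay in memory.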

To improve startup time we need caching of the info in catalogs that don't change. Do all your catalogs get rewritten, or only the ones that change (i.e., can we use lastModified on the OS File to detect changes)?

I believe only the changed catalogs are rewritten to the file system. The "root" catalog is rewritten as well (even if the catalog ref list content is unchanged.)
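
The lastModified check discussed above could look roughly like the sketch below. It is a Python illustration under assumptions, not how the TDS works: the function name `changed_catalogs` and the `last_seen` dictionary are hypothetical, and the modification time is read with `os.stat`.

```python
import os


def changed_catalogs(paths, last_seen):
    """Return the paths whose modification time differs from `last_seen`.

    `last_seen` is a hypothetical dict mapping path -> mtime in nanoseconds
    from the previous scan; it is updated in place for the next scan.
    """
    changed = []
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        if last_seen.get(path) != mtime:
            changed.append(path)
            last_seen[path] = mtime
    return changed
```

Only the catalogs returned here would need to be re-read on a reinit. Since the root catalog is rewritten on every publication, it would always show up as changed under this check, even when its content is identical.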

John


