[netcdf-java] Reading very large THREDDS catalogs...

To: netcdf-java@xxxxxxxxxxxxxxxx
Subject: [netcdf-java] Reading very large THREDDS catalogs...
From: Roland Schweitzer <Roland.Schweitzer@xxxxxxxx>
Date: Thu, 22 Sep 2011 09:13:18 -0500

Hi,

Some folks at NCAR have put together a THREDDS catalog (http://tds.prototype.ucar.edu/thredds/esgcet/catalog.xml) which I would like to read to prepare configuration information for LAS. The catalog consists of 3000+ catalogRef elements that point to other local catalogs. When running through this catalog doing the obvious thing:


            // Top-level datasets; each one here is a catalogRef.
            List<InvDataset> datasets = catalog.getDatasets();
            for (InvDataset invDataset : datasets) {
                System.out.println("\t" + invDataset.getName());
            }

the JVM heap grows as each successive dataset (catalogRef) is read, as observed by setting the JVM options that log garbage collection. This makes sense in that the catalogRef gets read and the information gets kept in memory. The problem is that eventually you will run out of heap. When you run out depends on how much memory you give the JVM.

If folks are going to be publishing catalogs this large, we need some way to read them in a memory efficient way. I know that once I reach the bottom of the loop I'm finished with that dataset and it would be ok with me to boot it out of memory, but I haven't figured out a clever way to do that.
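
To make the "handle one entry, then let it go" idea concrete, here is a minimal sketch that does not use the netCDF-Java catalog classes at all. It swaps in the JDK's StAX streaming parser (javax.xml.stream) and walks the top-level catalog XML, handing each catalogRef's xlink:title and xlink:href to a callback without building a dataset tree. The class name CatalogRefStreamer, the method forEachCatalogRef and the sample XML are made up for illustration, and it assumes the catalogRef elements carry the usual xlink:title and xlink:href attributes.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.BiConsumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of netCDF-Java: streams over a catalog
// document and reports each catalogRef without retaining it.
class CatalogRefStreamer {
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    // Calls handler(title, href) once per catalogRef element and returns
    // the number of catalogRef elements seen.
    static int forEachCatalogRef(Reader xml, BiConsumer<String, String> handler)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.TRUE);
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "catalogRef".equals(reader.getLocalName())) {
                    // Only the two attribute strings leave the parser;
                    // nothing else about the element is kept.
                    handler.accept(reader.getAttributeValue(XLINK_NS, "title"),
                                   reader.getAttributeValue(XLINK_NS, "href"));
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample standing in for the real top-level catalog.
        String sample =
            "<catalog xmlns=\"http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0\""
            + " xmlns:xlink=\"http://www.w3.org/1999/xlink\">"
            + "<catalogRef name=\"\" xlink:title=\"example.a\" xlink:href=\"1/example.a.xml\"/>"
            + "<catalogRef name=\"\" xlink:title=\"example.b\" xlink:href=\"1/example.b.xml\"/>"
            + "</catalog>";
        int n = forEachCatalogRef(new StringReader(sample),
            (title, href) -> System.out.println("\t" + title + " -> " + href));
        System.out.println(n + " catalogRef elements");
    }
}
```

With this shape, memory use stays roughly flat in the number of catalogRef elements, since each referenced catalog could be opened, processed and dropped inside the callback. The cost is that none of the InvDataset conveniences (inherited metadata, access URLs and so on) are available, so it only illustrates the pattern and is not a replacement for the library.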

What are the options for reading such a large catalog using the Java-netCDF tools?


Roland
