Re: Thredds out of memory

To: Tennessee Leeuwenburg <t.leeuwenburg@xxxxxxxxxx>
Subject: Re: Thredds out of memory
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Wed, 30 Mar 2005 10:17:53 -0700

Tennessee Leeuwenburg wrote:

I turned the java heap size up to 1024m, and it was able to handle my 580Mb file.
Here's another question about the internals :
As some of you know, I have written a servlet which serves up NetCDF files via HTTP, retrieved and converted from a database. This works great for small data sets (<60Mb) but is behaving strangely for the large (580Mb) dataset.
Serving the file through apache, it takes, oh, a few minutes to get the DODS file onto my hard disk. Say 10 minutes, and that would be plenty.

so if i understand, you have a client that requests the file via HTTP, then just copies it to a file on disk?

The servlet seems to take much longer. In terms of raw throughput when downloading from HTTP via Firefox, I get about 1.8Mb/s from apache, vs about 1.5 from my servlet. That's not a *huge* difference, and it's probably related to window size or something.
When I connect THREDDS to apache, there is a latency while the file is downloaded from apache, followed by throughput of about 1Mb/s and a slight reduction in file size.

When I connect THREDDS to my servlet, the initial latency is at least 10 minutes (he says waiting for the download to start). I found this a little weird, so I included some debugging in my servlet so I could watch the contents of each packet. I'm serving the data in 8192-byte chunks, possibly not the quickest way to go about it. What I see is a generally increasing byte range being served, but occasionally, bytes from earlier in the file are served. This seems a bit weird to me. I guess thredds is "going back" and looking things up in order to re-factor the data structure, but I want to make sure this is expected behaviour and that nothing nuts is going on.

what do you mean "connect THREDDS to apache" or "my servlet"? The THREDDS data viewer?

generally a netcdf client like the thredds data viewer will treat the file as random access, and so may skip around in the file. if all you do is read the file sequentially, HTTP is ok. but for random access it can be really slow. Opendap is much better in this case.
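
To make the "skipping around" concrete: a minimal Java sketch, assuming the client asks for pieces of the file with standard HTTP `Range` headers (for example `bytes=8192-16383`). The class `ByteRange` and its `parse` method are hypothetical names for illustration, not part of THREDDS, netcdf-java, or the servlet API. It only handles a single range and returns null for anything else.

```java
/**
 * Hypothetical helper: turns one HTTP Range header value into an
 * inclusive [start, end] byte span within a file of known length.
 */
final class ByteRange {
    final long start; // first byte, inclusive
    final long end;   // last byte, inclusive

    private ByteRange(long start, long end) {
        this.start = start;
        this.end = end;
    }

    long length() {
        return end - start + 1;
    }

    /**
     * Accepts "bytes=a-b", "bytes=a-" and "bytes=-n".
     * Returns null for malformed, multi-range, or unsatisfiable headers.
     */
    static ByteRange parse(String header, long fileLength) {
        if (header == null || !header.startsWith("bytes=") || header.contains(",")) {
            return null;
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0) {
            return null;
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        try {
            if (first.isEmpty()) {
                // Suffix form: the final n bytes of the file.
                if (last.isEmpty()) {
                    return null;
                }
                long n = Long.parseLong(last);
                if (n <= 0 || fileLength <= 0) {
                    return null;
                }
                return new ByteRange(Math.max(0, fileLength - n), fileLength - 1);
            }
            long start = Long.parseLong(first);
            // An open-ended or oversized end is clamped to the last byte.
            long end = last.isEmpty()
                    ? fileLength - 1
                    : Math.min(Long.parseLong(last), fileLength - 1);
            if (start < 0 || start >= fileLength || end < start) {
                return null;
            }
            return new ByteRange(start, end);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Logging `start` and `end` for each request would show whether the client is reading forward or jumping back to earlier offsets. A request for an earlier offset is just another range, so seeing one now and then is what random access looks like from the server side.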

I am trying to work out how to redress the situation. One easy thing to test is to vary the window size to a much larger number, say 500Kb or even megabytes. I could possibly alter this on the basis of the file size, or try to come up with some dynamic regime for altering the window size.


depends on your data access pattern.
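
A rough way to see why the access pattern matters: the Java sketch below counts how many fixed-size blocks a set of reads touches. It assumes that "window size" means the size of the block fetched per request and that each block is fetched once and then kept, which may not match what your servlet or the client actually does. `BlockFetchEstimate` and its methods are made-up names.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical estimator: cost of serving reads in fixed-size blocks. */
final class BlockFetchEstimate {

    /**
     * Counts the distinct blocks touched by the given reads.
     * Each read is a two-element array: {offset, length}.
     * Assumes every block is fetched at most once (ideal caching).
     */
    static long blocksFetched(long[][] reads, long blockSize) {
        Set<Long> blocks = new HashSet<Long>();
        for (long[] read : reads) {
            long offset = read[0];
            long length = read[1];
            if (length <= 0) {
                continue; // an empty read touches nothing
            }
            long firstBlock = offset / blockSize;
            long lastBlock = (offset + length - 1) / blockSize;
            for (long b = firstBlock; b <= lastBlock; b++) {
                blocks.add(b);
            }
        }
        return blocks.size();
    }

    /** Upper bound on bytes moved: the final block of a file may be shorter. */
    static long bytesTransferred(long[][] reads, long blockSize) {
        return blocksFetched(reads, blockSize) * blockSize;
    }
}
```

Under these assumptions, one sequential read of 1048576 bytes takes 128 requests at 8192 bytes per block but only 2 at 524288. Four scattered 100-byte reads take 4 requests either way, but move 32768 bytes with the small block and 2097152 with the large one. So a bigger block helps sequential reads and can hurt scattered ones.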

Is there a "magic number" in thredds which is a best window size to use? Would it "prefer" to get its data in any particular way? Thredds is basically the only client for this servlet, so I will just tune it for best performance.


what do you mean by "window size" ?

Or maybe it's just some inefficiency in java's random-access - if it's a separate request every time, maybe there's even a new instance handling each request and I'm getting bogged down in object creation. Now there's a thought! If that's the case, I'll have to implement some kind of static object containing the currently open files to avoid re-opening them...
Feedback welcome. Sorry to abuse the list for hare-brained developer questions. Maybe one day I'll be able to do something useful for you...
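
One possible shape for that "static object containing the currently open files", as a minimal Java sketch. It assumes the data sits in ordinary local files, which may not hold if it is converted from the database on the fly. `OpenFileCache` is a hypothetical class, not something from THREDDS or the servlet API. It keeps at most `maxOpen` files open and closes the least recently used one when a new file pushes it over the limit.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical cache of open files, evicting the least recently used. */
final class OpenFileCache {
    private final LinkedHashMap<String, RandomAccessFile> files;

    OpenFileCache(final int maxOpen) {
        // accessOrder=true makes iteration order least-recently-used first.
        this.files = new LinkedHashMap<String, RandomAccessFile>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, RandomAccessFile> eldest) {
                if (size() > maxOpen) {
                    try {
                        eldest.getValue().close();
                    } catch (IOException ignored) {
                        // Nothing useful to do if closing an evicted file fails.
                    }
                    return true;
                }
                return false;
            }
        };
    }

    /**
     * Reads up to buf.length bytes starting at offset, reusing an open
     * handle when there is one. May return fewer bytes than requested,
     * or -1 at end of file. Synchronized because seek + read must not
     * interleave between request threads.
     */
    synchronized int read(String path, long offset, byte[] buf) throws IOException {
        RandomAccessFile file = files.get(path);
        if (file == null) {
            file = new RandomAccessFile(path, "r");
            files.put(path, file);
        }
        file.seek(offset);
        return file.read(buf, 0, buf.length);
    }

    synchronized int openCount() {
        return files.size();
    }

    synchronized void closeAll() throws IOException {
        for (RandomAccessFile file : files.values()) {
            file.close();
        }
        files.clear();
    }
}
```

A servlet could hold one such cache in a static field and call `read(path, start, buffer)` for each incoming range. A single lock around every read is the simplest safe choice here; it would serialize concurrent requests, so it is worth measuring before relying on it.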
Download still waiting...

Cheers,
-Tennessee
