Re: [thredds] THREDDS Garbage Collection Tuning

Hi Tom:

If you are examining the heap with MAT, I'd be interested in knowing what you see as the dominators. If it's the catalogs, I'd be interested in comparing that to a new version that doesn't cache the catalogs.

John

On 6/1/2011 11:39 AM, Tom Kunicki wrote:

For GC profiling I would also recommend VisualVM along with the VisualGC view (might be an optional plugin). We are running TDS with the following settings:

"-server -XX:MaxPermSize=512m -XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=$CATALINA_HOME/heapdumps/default -Xmx4096m"


We do have heap dumps on OOME enabled so that we can debug OOMEs if they do occur (they haven't yet with our TB-sized aggregations).
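
To confirm that settings like the ones above actually reached the JVM, something along these lines can help. It is only a sketch using the standard java.lang.management API; the class name JvmSettingsCheck and its method names are made up for illustration and are not part of TDS or Tomcat. It reports on the JVM it runs in, so for a real check it would have to run inside the Tomcat JVM (for example from a small servlet or JSP).

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper, not part of TDS: reports how the current JVM was started.
final class JvmSettingsCheck {

    /** Flags the JVM was started with, e.g. -Xmx4096m. */
    static List<String> startupFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    /** True if the heap-dump-on-OOME flag appears in the given startup flags. */
    static boolean heapDumpOnOomEnabled(List<String> flags) {
        return flags.contains("-XX:+HeapDumpOnOutOfMemoryError");
    }

    public static void main(String[] args) {
        List<String> flags = startupFlags();
        System.out.println("Startup flags:     " + flags);
        System.out.println("Max heap (MB):     " + maxHeapMb());
        System.out.println("Heap dump on OOME: " + heapDumpOnOomEnabled(flags));
    }
}
```

The reported max heap is often somewhat lower than the -Xmx value, so treat it as a sanity check rather than an exact match.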

GC tuning takes an investment and there isn't a single set of optimal parameters. Even with a single application (like TDS) it really depends on the usage patterns of the particular installation.

Some quick notes:

1. The -XX:MaxPermSize setting is important with TDS. The default is 80m; you want something higher.
2. Don't be overzealous with -Xmx.
2.1 If you are running a 32-bit JRE you will be limited to ~1.5G (maybe higher with an OS other than Windows). The JRE won't start if the value is too high.
2.2 There's probably a limit with the 64-bit JRE but I haven't found it yet. Be warned that large heaps can lead to long pauses, as OldGen GCs can be deferred until the heap is exhausted.
2.3 Make sure the max heap doesn't exceed the amount of physical memory, leaving some room for the OS and other processes. You don't want to start swapping pages to disk.
3. If you are willing to watch the GC with VisualVM or Eclipse MAT, watch the use of NewGen/Eden space. Examine the per-request footprint and try to keep per-request memory allocations from being pushed into OldGen. NewGen GC is cheap (but gets more expensive as it grows); OldGen GC is expensive and is what leads to the observable pauses (this is related to note 2.2). I don't know if TDS will benefit from this. If it would, you would want to profile per-request allocations after all the caches have loaded. You'll observe a sawtooth pattern; you want to make sure the top of the sawtooth isn't clipped and pushed into OldGen.
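
If you would rather log the numbers than watch them in VisualVM, the standard java.lang.management beans expose roughly the same data. The following is a minimal sketch, not anything TDS ships: the class name GcSnapshot is hypothetical, and the pool and collector names it prints (for example "PS Eden Space" or "PS Old Gen") depend on which collector the JVM is using. Sampling it periodically inside the server JVM should show the Eden sawtooth and how often the OldGen collector runs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper: one-shot snapshot of heap pools and GC activity.
final class GcSnapshot {

    /** One line per heap memory pool (Eden, Survivor, OldGen; names vary by collector). */
    static String heapPools() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as PermGen or the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // pool is no longer valid
            }
            sb.append(String.format("%-28s used=%d KB committed=%d KB%n",
                    pool.getName(), usage.getUsed() / 1024, usage.getCommitted() / 1024));
        }
        return sb.toString();
    }

    /** One line per collector with cumulative collection count and time. */
    static String collectors() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("%-28s collections=%d time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(heapPools());
        System.out.print(collectors());
    }
}
```

A fast-growing collection count or time on the old-generation collector is the sign that per-request allocations are being promoted out of NewGen.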

Many more details here: http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html

Tom


On Jun 1, 2011, at 11:53 AM, john caron wrote:

On 6/1/2011 9:57 AM, Rich Signell wrote:
Ethan,

According to this article:
http://www.petefreitag.com/item/746.cfm
it says

the "OutOfMemoryError - GC overhead limit exceeded" exception is
thrown by the garbage collector when it is spending way too much time
collecting garbage. This error essentially means that you need to add
more memory, or reconfigure your garbage collection arguments.

So it seems like the same issue. Our settings are:
export JAVA_OPTS="-XX:MaxPermSize=256M -Xmx2048m -Xms512m -server
-Djava.awt.headless=true"

-Rich

"OutOfMemoryError - GC overhead limit exceeded" means you don't have enough heap (-Xmx2048m). The MaxPermSize is different and won't directly affect this problem.

If this is a site with many catalogs, I have an alpha version of TDS that is ready for testing that does not cache catalogs, and will probably fix your problem. If you want to try this out, contact me off list.

Otherwise, get a memory heap dump and put it on an ftp or web server (they tend to be very large) and send me the URL, so we can see why you are running out of memory. Google "how to get a Java heap dump" to see how. If you want to sink your teeth into this problem, the Eclipse MAT tool is the one to use.
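
One way to get a dump, sketched here under some assumptions: on a HotSpot JVM, the com.sun.management.HotSpotDiagnosticMXBean can write an .hprof file that Eclipse MAT can open. The class name HeapDumper and the temp-directory location below are purely illustrative. The code dumps the heap of the JVM it runs in, so it would need to be invoked inside the Tomcat process; for an already running server the JDK's jmap tool is the more usual route.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; assumes a HotSpot-based JVM (the bean is HotSpot-specific).
final class HeapDumper {

    /** Writes a heap dump of the current JVM to the given file, which must not exist yet. */
    static Path dump(Path target, boolean liveObjectsOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toString(), liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    /** Convenience wrapper: dumps into a fresh temporary directory. */
    static Path dumpToTempDir(boolean liveObjectsOnly) {
        try {
            Path dir = Files.createTempDirectory("heapdump");
            return dump(dir.resolve("heap.hprof"), liveObjectsOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = dumpToTempDir(true); // true = only objects that are still reachable
        System.out.println("Heap dump written to " + file);
    }
}
```

Expect the file to be roughly as large as the used heap, which is why putting it on a server and sending a URL is more practical than mailing it.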


_______________________________________________
thredds mailing list
thredds@xxxxxxxxxxxxxxxx <mailto:thredds@xxxxxxxxxxxxxxxx>
For list information or to unsubscribe, visit: http://www.unidata.ucar.edu/mailing_lists/

