Re: Thredds out of memory

To: Tennessee Leeuwenburg <t.leeuwenburg@xxxxxxxxxx>
Subject: Re: Thredds out of memory
From: John Caron <caron@xxxxxxxxxxxxxxxx>
Date: Thu, 31 Mar 2005 08:24:39 -0700

Tennessee Leeuwenburg wrote:

Not quite. I have something called a MARS database. This is an object-oriented database, whose only access is via a compiled C program, and which does not support any kind of network request.
I wrote a Java servlet which parses the URL, extracts query information, runs the query against the database, converts the resulting GRIB file to NetCDF, and serves that back to the user. In this case, Thredds.

This, to the client, is (should be) invisible compared with requesting a NetCDF file via HTTP from any source, such as a web server like apache.
Thredds is capable of sourcing its data from HTTP sources as opposed to files on the local disk. My configuration file looks a little something like this:
<!DOCTYPE catalog SYSTEM "http://www.unidata.ucar.edu/projects/THREDDS/xml/AggServerCatalog.dtd">
<catalog name="THREDDS - DODS Aggregation Server Catalog" version="0.6"
         xmlns="http://www.unidata.ucar.edu/thredds"
         xmlns:xlink="http://www.w3.org/1999/xlink">
    <dataset name="Top-Level Dataset" dataType="Grid" serviceName="this">
        <service name="this" serviceType="DODS" base=""/>
        <service name="apache" serviceType="NetCDF" base="http://kahless.ho.bom.gov.au/"/>
        <service name="marslet" serviceType="NetCDF" base="http://kahless.ho.bom.gov.au:8080/marslet/"/>

        <dataset name="Large Internal Marslet" serviceName="this">
            <property name="internalService" value="marslet"/>
            <dataset name="Surface Data" urlPath="verylarge.nc"/>
        </dataset>

        <dataset name="Large Internal Apache" serviceName="this">
            <property name="internalService" value="apache"/>
            <dataset name="Surface Data" urlPath="laps-levels-large.nc"/>
        </dataset>
    </dataset>
</catalog>
For small files, this actually works. For larger files, the interaction with the servlet breaks somehow, however the file sourced from apache works okay.
I don't understand why this is the case. Software such as wget, firefox etc. is happily able to download the file, resume partial downloads etc. Thredds is happily able to get smaller files. I fail to see why file size is affecting the system so badly.

generally a netcdf client like the thredds data viewer will treat the file as random access, and so may skip around in the file. if all you do is read the file sequentially, HTTP is ok. but for random access it can be really slow. Opendap is much better in this case.

That's exactly the goal - we want to use Opendap to give data to the various software clients that will use the data, in order to gain the many advantages offered. However, we have to get the files IN to thredds somehow.

yes, but the thredds AS server is a client of your HTTP server. When the AS server gets a request, it skips around the HTTP file to read it. So it depends what request the AS server gets, as to what its access pattern is.

try giving it very simple requests that are contiguous in the file and of known, reasonable size. those should work ok. then increase the size/complexity of your request and see where it degrades.
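
A contiguous request of known size reaches the HTTP server as a single byte range. As a rough illustration of what the server side then has to work out, here is a minimal sketch that resolves one Range header into inclusive first/last offsets. The class and method names (ByteRangeSketch, resolve) are hypothetical and belong to neither THREDDS nor the servlet API. It assumes the HTTP/1.1 reading of the three forms ("first-last", "first-", and "-N" meaning the last N bytes), rejects multi-range headers, and uses long arithmetic so that files larger than 2 GB do not overflow.

```java
/** Hypothetical helper for illustration; not part of THREDDS or the servlet API. */
final class ByteRangeSketch {

    private ByteRangeSketch() {}

    /**
     * Resolves a single-range header such as "bytes=0-499", "bytes=500-" or
     * "bytes=-200" against a file of the given size.
     *
     * @return {first, last} as inclusive offsets, or null if the header is
     *         missing, malformed, multi-range, or cannot be satisfied
     */
    static long[] resolve(String rangeHeader, long fileSize) {
        if (rangeHeader == null || fileSize <= 0) return null;
        String header = rangeHeader.trim();
        if (!header.startsWith("bytes=")) return null;

        String spec = header.substring("bytes=".length()).trim();
        if (spec.indexOf(',') >= 0) return null;       // multi-range: not handled in this sketch
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.length() < 2) return null; // must contain '-' and at least one number

        String left = spec.substring(0, dash).trim();
        String right = spec.substring(dash + 1).trim();
        try {
            long first;
            long last;
            if (left.isEmpty()) {
                // "-N": the last N bytes of the file
                long suffix = Long.parseLong(right);
                if (suffix <= 0) return null;
                first = Math.max(0L, fileSize - suffix);
                last = fileSize - 1;
            } else {
                first = Long.parseLong(left);
                // "first-": open-ended, runs to EOF; otherwise clamp last to EOF
                last = right.isEmpty()
                        ? fileSize - 1
                        : Math.min(Long.parseLong(right), fileSize - 1);
            }
            if (first < 0 || first > last) return null;
            return new long[] { first, last };
        } catch (NumberFormatException e) {
            return null;                                // non-numeric offsets
        }
    }
}
```

A caller would treat a null result as "send the whole file" (or reject the request), and otherwise serve last - first + 1 bytes starting at offset first.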

Is there any way you can give it access to the file directly, like through an NFS mount?

Is there a "magic number" in thredds which is a best window size to use? Would it "prefer" to get its data in any particular way? Thredds is basically the only client for this servlet, so I will just tune it for best performance.

what do you mean by "window size" ?

When I'm serving data from my servlet, I create an 8k buffer which reads data from disk, then is flushed to the output stream.
               // Send in 8K chunks
               byte[] dataBuf = new byte[8192]; // we'll read 8K chunks
               in.seek(0L);
               in.skipBytes(firstByte);
               int length = 0;
               long bytecount = 0;
               while(in != null && (length = in.read(dataBuf,0,dataBuf.length)) != -1) {
                   if(debug) servletContext.log("doGet() valid: serving bytes " + bytecount + " to " + length);
                   bytecount = bytecount + length;
                   out.write(dataBuf, 0, length);
               }

I have attached the full code for your interest.

you will see better performance as you increase this buffer size, at the cost of needing more heap space.

you should also tune the buffer size in HTTPRandomAccessFile, probably matching the sizes would be best.
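
To make the buffer-size trade-off concrete, here is a minimal sketch of a copy loop whose buffer size is a parameter and which stops once the requested number of bytes has been written, instead of running on to end of file. The class and method names (RangeCopySketch, copy) are hypothetical, and the sketch assumes the input stream is already positioned at the first requested byte. Each call allocates one buffer of bufferSize bytes, so a larger buffer means fewer read/write calls but more heap per concurrent request.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper for illustration; not part of the attached servlet. */
final class RangeCopySketch {

    private RangeCopySketch() {}

    /**
     * Copies at most nBytes from in to out using a buffer of bufferSize bytes.
     *
     * @return the number of bytes actually copied (less than nBytes only if
     *         the input ended early)
     */
    static long copy(InputStream in, OutputStream out, long nBytes, int bufferSize)
            throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buf = new byte[bufferSize];
        long remaining = nBytes;
        while (remaining > 0) {
            // Never ask for more than is left in the requested range
            int want = (int) Math.min(buf.length, remaining);
            int got = in.read(buf, 0, want);
            if (got == -1) break;          // input ended before the range did
            out.write(buf, 0, got);
            remaining -= got;
        }
        return nBytes - remaining;
    }
}
```

In a servlet, bufferSize could be read from an init parameter so that it can be matched to the client's buffer size without recompiling.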


Cheers,
-Tennessee

------------------------------------------------------------------------

import javax.servlet.*;
import javax.servlet.http.*;
import java.io.*;
import java.util.*;

public class Marslet extends HttpServlet
{

    boolean debug = true;

    /**
     * This returns the header only for the equivalent GET request. It may be
     * used by some clients to establish file-sizes or otherwise make use
     * of summary information before performing a GET request. The HEAD
     * response may not include a message-body.
     */
    protected void doHead(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException
    {
        HttpSession session = request.getSession();
        ServletContext servletContext = getServletContext();
        ServletOutputStream out = response.getOutputStream();
        File ncFile = doMarsQuery(request);

        // Abort if file cannot be found
        if(ncFile == null) {
            servletContext.log("Null netCDF file - cannot continue");
            out.println("Error in processing - database request failed. Please contact the administrator tjl@xxxxxxxxxx");
            return;
        }

        String filename = ncFile.getPath();
        RandomAccessFile in = null;
        String contentType = "application/x-netcdf";

        servletContext.log("Found file, processing HEAD request");

        try {
            long filesize = ncFile.length();
            if(debug) servletContext.log("doHead(): filesize is "+ filesize);

            String rangeHeader = request.getHeader("range");

            // Behave differently if request is bad for some reason
            if(!isRangeHeaderValid(rangeHeader, filesize)) {

                // Bad numbers in range, log error
                if(rangeHeader != null && !rangeHeader.equals("")) {
                    servletContext.log("*** Invalid byte range header, sending entire file: " + rangeHeader);
                }
                servletContext.log("Sending entire file");
                //headEntireFile(response, out, ncFile, contentType, servletContext);

                int length = (int)ncFile.length();

                response.setStatus(HttpServletResponse.SC_PARTIAL_CONTENT);
                response.setHeader("Accept-Ranges", "bytes");
                response.setContentLength(length);
                out.println("Content-Length: " + length);
                response.setContentType(contentType);
                out.flush();

                if(debug) servletContext.log("doHead() invalid: " + HttpServletResponse.SC_PARTIAL_CONTENT +
                    "Accept-Ranges: bytes" + "Content-Length: " + length + "Content-Type" + contentType);
            }

            // If a valid byte range has been requested
            else {
                in = new RandomAccessFile(ncFile, "r");

                // byte range variables
                int firstByte = 0;
                int lastByte = 0;
                int nBytes = 0;

                StringTokenizer st = getRangeAsTokens(rangeHeader);
                String range = "";
                range = st.nextToken();
                range = range.trim();

                // Determine firstByte
                if(range.indexOf('-') == 0) { firstByte = -1; }
                else { firstByte = Integer.parseInt(range.substring(0, range.indexOf('-'))); }

                // Determine lastByte
                if(range.indexOf('-') == range.length() - 1) { lastByte = -1; }
                else { lastByte = Integer.parseInt(range.substring(range.indexOf('-') + 1)); }

                // If firstByte < 0, then client wants from there to EOF
                if(firstByte < 0) {
                    firstByte = (int) filesize - lastByte;
                    lastByte = (int) filesize - 1;
                }

                // If last byte < 0, set to EOF
                if(lastByte < 0) { lastByte = (int) filesize - 1; }
                nBytes = (lastByte - firstByte) + 1;

                ////////////////////////////////////////
                // Send the headers, do not write data
                ////////////////////////////////////////
                String contentRange = "bytes " + firstByte + "-" + lastByte + "/" + filesize;
                response.setStatus(HttpServletResponse.SC_PARTIAL_CONTENT);
                response.setHeader("Accept-Ranges", "bytes");
                //response.setContentLength(nBytes);
                out.println("Content-Length: " + nBytes);
                response.setContentType(contentType);
                response.setHeader("Content-Range", ""+contentRange);

                if(debug) servletContext.log("doHead() valid: " + HttpServletResponse.SC_PARTIAL_CONTENT +
                    "Accept-Ranges: bytes" + "Content-Length" + nBytes + "Content-Type" + contentType +
                    "Content-Range: " + contentRange);
            }
            out.flush();
            out.close();
        }
        catch(Exception e) {
            servletContext.log("Catching: " + e.toString(), e);
            e.printStackTrace();
        }
        finally {
            try { if(in != null) { in.close(); } }
            catch(Exception e2) { servletContext.log("Finally: " + e2.toString(), e2); }
        }

    }

    protected void doGet(HttpServletRequest request, HttpServletResponse response)
    throws ServletException, IOException
    {
        HttpSession session = request.getSession();
        ServletContext servletContext = getServletContext(); // For logging
        ServletOutputStream out = response.getOutputStream();
        File ncFile = doMarsQuery(request);

        if(ncFile == null) {
            servletContext.log("NULL netCDF file - cannot continue");
            out.println("Error in processing - database request failed. Please contact the administrator tjl@xxxxxxxxxx");
            return;
        }

        String filename = ncFile.getPath();
        RandomAccessFile in = null;
        String contentType = "application/x-netcdf";

        servletContext.log("doGet(): Found file, processing GET request");

        try {
            // Log every request header
            Enumeration e = request.getHeaderNames();
            while(e.hasMoreElements()) {
                String headerName = (String) e.nextElement();
                Enumeration e2 = request.getHeaders(headerName);
                String requestStr = "";
                while(e2.hasMoreElements()) {
                    String headerValue = (String) e2.nextElement();
                    requestStr = requestStr + "Request> " + headerName + ": " + headerValue + "\n";
                }
                servletContext.log(requestStr);
            }

            long filesize = ncFile.length();
            String rangeHeader = request.getHeader("range");

            // If a bad byte-range has been requested
            if(!isRangeHeaderValid(rangeHeader, filesize)) {

                // Bad numbers in range, log error
                if(rangeHeader != null && !rangeHeader.equals("")) {
                    servletContext.log("*** Invalid byte range header, sending entire file: " + rangeHeader);
                }

                if(debug) servletContext.log("doGet() invalid: Sending entire file");
                sendEntireFile(response, out, ncFile, contentType, servletContext);
            }

            // If a valid byte range has been requested
            else {
                in = new RandomAccessFile(ncFile, "r");

                // byte range variables
                int firstByte = 0;
                int lastByte = 0;
                int nBytes = 0;

                StringTokenizer st = getRangeAsTokens(rangeHeader);
                String range = "";

                ///////////////////////////////////////////
                // Note - the original code snippet I found handled multiple
                // byte-range requests, but this isn't actually correct.
                // When handling byte-range requests, you must not have multipart
                // responses, but simply serve the data requested. (See HTTP 1.1 Specification)
                // Presumably this means you can only have one byte-range request in the
                // request header, thus this code retrieves only the first token for
                // byte-range
                ///////////////////////////////////////////
                range = st.nextToken();
                range = range.trim();

                // Determine firstByte
                if(range.indexOf('-') == 0) { firstByte = -1; } // range format is "-lastbyte"
                else { firstByte = Integer.parseInt(range.substring(0, range.indexOf('-'))); }

                // Determine lastByte
                if(range.indexOf('-') == range.length() - 1) { lastByte = -1; } // range format is "firstbyte-"
                else { lastByte = Integer.parseInt(range.substring(range.indexOf('-') + 1)); }

                // If first or last byte < 0, then client wants from there to EOF
                if(firstByte < 0) {
                    firstByte = (int) filesize - lastByte;
                    lastByte = (int) filesize - 1;
                }

                // If last byte is < 0, set to EOF
                if(lastByte < 0) { lastByte = (int) filesize - 1; }
                nBytes = (lastByte - firstByte) + 1;

                ////////////////////////////////////////////
                // Send the headers and start writing data
                ////////////////////////////////////////////
                String contentRange = "bytes "+ firstByte + "-" + lastByte + "/" + filesize;
                response.setStatus(HttpServletResponse.SC_PARTIAL_CONTENT);
                response.setHeader("Accept-ranges", "bytes");
                response.setContentType(contentType);
                response.setHeader("Content-Range", ""+contentRange);
                response.setContentLength(nBytes);

                String responseStr = "";
                responseStr = responseStr + "Response> Status: " + HttpServletResponse.SC_PARTIAL_CONTENT + "\n";
                responseStr = responseStr + "Response> Accept-ranges: bytes\n";
                responseStr = responseStr + "Response> Content-Type: " + contentType + "\n";
                responseStr = responseStr + "Response> Content-Range: " + contentRange + "\n";
                responseStr = responseStr + "Response> Content-length: " + nBytes + "\n";

                if(debug) servletContext.log("doGet(): valid \n" + responseStr);
                //out.println("Content-Length: " + nBytes);

                // How it used to be - suspect of causing a buffer overflow
                //byte[] dataBuf = new byte[8192]; //we'll read 8K chunks
                //byte[] dataBuf = new byte[nBytes + 1];
                //in.seek(0L);
                //in.skipBytes(firstByte);
                //int length = in.read(dataBuf, 0, nBytes);
                //out.write(dataBuf, 0, length);

                // Send in 8K chunks
                byte[] dataBuf = new byte[8192]; //we'll read 8K chunks
                in.seek(0L);
                in.skipBytes(firstByte);
                int length = 0;
                long bytecount = 0;

                while(in != null && (length = in.read(dataBuf,0,dataBuf.length)) != -1) {
                    if(debug) servletContext.log("doGet() valid: serving bytes " + bytecount + " to " + length);
                    bytecount = bytecount + length;
                    out.write(dataBuf, 0, length);
                }
                out.flush();
            }
            out.close();
        }

        // Ignore some client-caused exceptions
        catch(java.io.IOException ioe) {

            // Ignore "Connection reset by peer" exceptions which can be caused by
            // a number of reasons attributable to the client. They are generally
            // harmless and out of our control. Log others.

            //if(ioe.toString().compareToIgnoreCase("java.io.IOException: Connection reset by peer") != 0) {
                servletContext.log(ioe.toString(), ioe);
            //}
        }

        // Log generic exceptions
        catch(Exception e) {
            servletContext.log(e.toString(), e);
        }

        // Try to close the file if it's still open
        finally {
            try {
                if(in != null) { in.close(); }
            }
            catch(Exception e2) {
                servletContext.log(e2.toString());
            }
        }
    }

    /**

    * Interpret the GET variables into a mars request and execute
    */

    private File doMarsQuery(HttpServletRequest request) {
        try {
            return new File("/data/laps-levels-large.nc");
        }
        catch(Exception e) {
            getServletContext().log(e.getMessage());
            e.printStackTrace();
        }
        return null;
    }

    private File getFromCache(String requestString) {
        String tmpDirName = "/nm/scratch/marslet/";
        File tmpDir = new File(tmpDirName);
        int hash = requestString.hashCode();
        File ncFile = new File(tmpDir, "marslet" + hash + ".nc");
        if(ncFile.exists()) { return ncFile; } else { return null; }
    }

    private void addToCache(File ncFile) {
        // Do nothing - no accounting just yet
    }

    /**
     * Send the entire file to the client
     * <p>
     * @param response the response object
     * @param out the servlet output stream to write to
     * @param file the file we're sending
     * @param contentType Content-type: header value
     * @param servletContext the servlet context, for logging
     */
    private void sendEntireFile(HttpServletResponse response,
                                ServletOutputStream out,
                                File file,
                                String contentType,
                                ServletContext servletContext
                                )
    throws IOException, Exception
    {

        FileInputStream inStream = null;

        try {
            inStream = new FileInputStream(file);
            int length = 0;
            response.setHeader("Accept-Ranges", "bytes");
            response.setContentType(contentType);
            response.setContentLength((int)file.length());
            //out.println("Content-Length: " + (int)file.length());

            byte[] buf = new byte[8192]; //we'll read 8K chunks
            while(inStream != null && (length = inStream.read(buf,0,buf.length)) != -1) {
                out.write(buf, 0, length);
            }
        }
        catch(IOException ioe) {
            throw ioe;
        }
        catch(Exception e) {
            throw e;
        }
        finally {
            if(inStream != null) {
                inStream.close();
            }
        }
    }

    /**

    * Validate the byte range request header
    * The following byte range request header formats are supported:
    * <ul>
    *   <li> firstbyte-lastbyte (request for explicit range)
    *   <li> firstbyte- (request for 'firstbyte' byte to EOF)
    *   <li> -lastbyte (request for 'lastbyte' byte to EOF)
    * </ul>
    * @param String the byte range header
    * @return boolean true=valid header, false=invalid header
    */

    private boolean isRangeHeaderValid(String rangeHeader, long filesize)
    {
        if(rangeHeader == null || rangeHeader.equals("")) { return false; }

        String range = "";
        int firstbyte = 0;
        int lastbyte = 0;

        StringTokenizer st = getRangeAsTokens(rangeHeader);

        while(st.hasMoreTokens()) {
            range = st.nextToken();
            range = range.trim();

            int index = range.indexOf('-');
            if(index == -1) { return false; } //Illegal: must contain a '-'
            if(range.length() <= 1) { return false; } //Illegal: must have more than '-'

            //Case -lastbyte
            if(index == 0) {
                lastbyte = Integer.parseInt(range.substring(range.indexOf('-') + 1));
                if(lastbyte > filesize) { return false; }
                else { continue; }
            }

            //Case firstbyte-
            if(index == range.length() - 1) {
                firstbyte = Integer.parseInt(range.substring(0, range.indexOf('-')));
                if(firstbyte > filesize) { return false; }
                else { continue; }
            }

            //Case firstbyte-lastbyte
            if(index != 0 && index != range.length() - 1) {
                firstbyte = Integer.parseInt(range.substring(0, range.indexOf('-')));
                lastbyte = Integer.parseInt(range.substring(range.indexOf('-') + 1));
                if(firstbyte > lastbyte) { return false; }
                else { continue; }
            }
        }

        return true;
    }

    /**
     * Break the range header into tokens. Each token represents a single requested range.
     *
     * The most common range header format is ...
     *
     * range = firstbyte-lastbyte,firstbyte-lastbyte
     *
     * ... with one or more firstbyte-lastbyte values, all comma separated
     *
     * @param rangeHeader the byte range header
     * @return StringTokenizer the tokenized string
     */
    private StringTokenizer getRangeAsTokens(String rangeHeader)
    {
        String ranges = rangeHeader.substring(rangeHeader.indexOf('=') + 1, rangeHeader.length());
        return new StringTokenizer(ranges, ",");
    }

/**

    * Handle HTTP post requests
    */

public void doPost(HttpServletRequest request, HttpServletResponse response)

    throws ServletException, IOException
    {
        return;
    }
}
