TDS Tutorial: TDS Monitoring and Debugging

Tomcat Access Logs

The Tomcat access log

Log!
  • The access log records all requests processed by the server.
  • Not enabled in Tomcat by default - must be enabled in $CATALINA_HOME/conf/server.xml.
  • Information it contains is different from other logs in $CATALINA_HOME/logs.
  • Used for monitoring who is using your server and as a way of obtaining "feedback" about the activity and performance of the server.

Configuring Tomcat to enable access logging

  1. Modify server.xml to enable the AccessLogValve.
  2. Using your favorite editor open $CATALINA_HOME/conf/server.xml:

    $ vi server.xml
    

    Locate the AccessLogValve contained in the Host element (should be near the bottom of the file):

    <!-- Define the default virtual host
            Note: XML Schema validation will not work with Xerces 2.2.
    -->
    <Host name="localhost"  appBase="webapps"
          unpackWARs="true" autoDeploy="true"
          xmlValidation="false" xmlNamespaceAware="false">
    
      . . .
      
      <!-- Access log processes all example.
              Documentation at: /docs/config/valve.html -->
      <!--
      <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs"  
                prefix="localhost_access_log." suffix=".txt" pattern="common" resolveHosts="false"/>
      -->
    
    </Host>
    

    Change the prefix and suffix attributes to customize the access log name. (We are customizing the name of the access log for the sake of a future exercise that requires the log files to begin with access. In reality, you can modify these attributes to your own chosen values or leave defaults when you configure Tomcat at your home institution.)

    prefix="access." 
    suffix=".log"
    

    To provide more useful information about who is accessing the TDS, change the pattern element to customize the format of each log entry:

    pattern="%h %l %u %t &quot;%r&quot; %s %b &quot;%{Referer}i&quot; &quot;%{User-Agent}i&quot; %D"
    

    When you are finished with your edits, the AccessLogValve should look something like the following:

    <Valve className="org.apache.catalina.valves.AccessLogValve"
       directory="logs"  
       prefix="access." 
       suffix=".log"
       pattern="%h %l %u %t &quot;%r&quot; %s %b &quot;%{Referer}i&quot; &quot;%{User-Agent}i&quot; %D"
       resolveHosts="false" />
    
  3. Verify the access log has been enabled.
  4. Restart Tomcat and verify an access log has been generated in the $CATALINA_HOME/logs/ directory:

    $ ls -l /home/tds/tds/apache-tomcat-6.0.32/logs
    total 52
    -rw-r--r-- 1 tds   Unidata  20034 Jul 19 16:38 catalina.2011-07-19.log
    -rw-r--r-- 1 tds   Unidata  20551 Jul 19 16:38 catalina.out
    -rw-r--r-- 1 tds   Unidata      0 Jul 19 16:31 host-manager.2011-07-19.log
    -rw-r--r-- 1 tds   Unidata   4537 Jul 19 16:38 localhost.2011-07-19.log
    -rw-r--r-- 1 tds   Unidata      0 Jul 19 16:38 access.2011-07-19.log
    -rw-r--r-- 1 tds   Unidata      0 Jul 19 16:31 manager.2011-07-19.log
    

Exercise One: Enable the access logging in Tomcat

  1. Using the steps outlined in the above section, enable access logging in Tomcat via the AccessLogValve.

Access log format

The access log entry format we are using is almost identical to the standard combined logging format with an addition: the %D which is used for documenting the Time taken to process the request, in millis will appear at the end of each log entry:

pattern="%h %l %u %t &quot;%r&quot; %s %b &quot;%{Referer}i&quot; &quot;%{User-Agent}i&quot; %D"

Access log format

For more information on access log format configuration, see the Tomcat Valve Component Documentation.

The above pattern makes use of the following codes:

  • %h - Remote host name (or IP address if resolveHosts is false)
  • %l - Remote logical username from identd (always returns '-')
  • %u - Remote user that was authenticated (if any), else '-'
  • %t - Date and time, in Common Log Format
  • %r - First line of the request (method and request URI)
  • %s - HTTP status code of the response
  • %b - Bytes sent, excluding HTTP headers, or '-' if zero
  • %D - Time taken to process the request, in millis

The above pattern translates into:

127.0.0.1 - admin [19/Jul/2011:16:44:31 -0700] "GET /manager/html/ HTTP/1.1" 200 13740 "http://localhost:8080/" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.7) Gecko/20100723 Fedora/3.6.7-1.fc13 Firefox/3.6.7" 25

Another way of looking at it:

Remote host 127.0.0.1
Remote logical username from identd -
Authenticated user admin
Time and date of request [19/Jul/2011:16:44:31 -0700]
HTTP request method GET
Request URI /manager/html
Protocol used HTTP/1.1
HTTP server response 200
Bytes transferred 13740
Referer http://localhost:8080/
User Agent Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.7) Gecko/20100723 Fedora/3.6.7-1.fc13 Firefox/3.6.7
Response time (in millis) 25

Looking at access log data

Storing the information in the access log is only the start of log management. The next step is to analyze this information to produce useful statistics.

  1. Looking at raw access log data.
  2. Locate the access.2011-07-17.log file in your home directory which contains an access log file taken from one of our production servers.

    $ cd
    $ ls -l access* 
    -rw-r--r-- 1 tds   Unidata 344507 Jul 19 15:25 access.2011-07-17.log
    

    Have a look at the log file and try to examine the data stored there. (Kind of hard to sort through it all, isn't it)?

    $ less access.2011-07-17.log
    
  3. Analyzing data subsets with awk and dig.
  4. Sometimes we don't need to see the "whole picture" and just need some quick statistics. E.g., we need to find the host that has made the most requests to the server.

    Run the following command:

    $ cat access.2011-07-17.log | awk '{print $1}' | sort | uniq -c | sort -n
    

    The output of the access log file is piped into awk, which prints the first "field variable" $1. The result is sorted, identical lines are counted, and finally sorted again in descending order.

    192.12.209.243 is the host that has made the most requests to the server during this particular access log's period.

    But who is 192.12.209.243? (IPs are not too meaningful by themselves.)

    Run dig to find out the host information for 192.12.209.243:

    $  dig +short -x 192.12.209.243
    harper1.cise-nsf.gov.
    

    harper1.cise-nsf.gov is the host name for 192.12.209.243 (a bit more meaningful).

    Looking at the other IP addresses from the awk output. Without running dig, what do you think the domain of 192.12.209.63 is going to be?

    See if your assumption is correct:

    $  dig +short -x 192.12.209.63
    cise63.cise-nsf.gov.
    

    You can use the awk command sequence to find out if any authenticated users were responsible for making these requests.

    $ cat access.2011-07-17.log | awk '{print $3}' | sort | uniq -c | sort -n
    

Exercise Two: Using parsed access log data.

  1. Parse log data.
  2. Run parse_logs.pl file in the thredds home directory to parse the access.2011-07-17.log file:

    $ cd
    $ ./parse_logs.pl
    

    The resulting file parsed_access_log.csv will contain the log data in tab-delimited format:

    $ ls -l *.csv 
    -rw-r--r-- 1 tds   Unidata 332843 Jul 19 16:35 parsed_access_log.csv
    
  3. Examine the log data.
  4. Open the parsed_access_log.csv file in the OpenOffice spreadsheet utility.

    What Requested URL took the most time to process?

    Response Codes

    See HTTP Status Code Definitions for a reference to server response codes.

    What types of Request Methods are recorded? What type of server Response/Status Codes did these Request Methods return?

    Are there any entries that might be of interest from a security point of view? Is there a common data element among these entries? What do you think caused these entries?

Log Files Generated by the TDS

Introducing the TDS logging directory: $CATALINA_HOME/content/thredds/logs

Keep in Mind

There are two locations of log files you will reference:

  1. $CATALINA_HOME/logs/ is where you go to look at Tomcat-generated log files; and
  2. the TDS logging directory which is where you go to look at TDS-generated log files.

The log files generated by the TDS are located in the $CATALINA_HOME/content/thredds/logs directory:

Log file Data the file contains
threddsServlet.log contains log messages used for debugging
serverStartup.log contains log messages from the initialization of the TDS
models.log contains log messages specific to gridded model data, especially GRIB data
cache.log contains log messages related to TDS object and file caching

About the threddsServlet.log

  • Enabled by default.
  • Log messages for all requests handled by the TDS.
  • By default is configured to rotate every hour. Old log files are re-named with the threddsServlet.log.<yyyy-mm-dd-HH> pattern.
  • Used for debugging and troubleshooting issues with the TDS.
  • May be asked to send information from threddsServlet.log to the TDS development team when seeking support.

threddsServlet.log log format

  1. Look at the logging configurations for threddsServlet.log.
  2. Navigate to the unpacked thredds directory in $CATALINA_HOME/webapps, and view the WEB-INF/log4j.xml:

    $ cd /home/tds/tds/apache-tomcat-6.0.32/webapps/thredds
    $ less WEB-INF/log4j.xml
    

    The TDS uses the OpenSource Apache Log4J library (hence, the name of the file: WEB-INF/log4j.xml).

    Find the first uncommented Appender element with a name of threddsServlet and a class of org.apache.log4j.DailyRollingFileAppender, and note the ConversionPattern:

      <appender name="threddsServlet" class="org.apache.log4j.DailyRollingFileAppender">
        <param name="File" value="${tds.log.dir}/threddsServlet.log"/>
        <param name="DatePattern" value=".yyyy-MM-dd-HH"/>
        <layout class="org.apache.log4j.PatternLayout">
          <param name="ConversionPattern" value="%d{yyyy-MM-dd'T'HH:mm:ss.SSS Z} [%10r][%8X{ID}] %-5p - %c - %m%n"/>
          <!--param name="ConversionPattern" value="%d{ISO8601} [%10r - %10X{ID}] %-5p - %c - %m%n"/-->
        </layout>
      </appender>     
    
  3. Examine the log format being applied to the threddsServlet.log file.
  4. The ConversionPattern parameter specifies the Log4J pattern used to format the entries of the threddsServlet.log file:

    value="%d{yyyy-MM-dd'T'HH:mm:ss.SSS Z} [%10r][%8X{ID}] %-5p - %c - %m%n" 
    

    Log4J library

    For more information on the Log4J library, consult the online documentation.

    The above pattern makes use of the following Log4J codes:

    • %d{yyyy-MM-dd'T'HH:mm:ss.SSS Z} - Date and time
    • %10r - Time, in millis, since Tomcat startup
    • %8X{ID} - Transaction ID
    • %-5p - Logging level (options are DEBUG, INFO, WARN, ERROR, and FATAL)
    • %c - Location in the TDS code where the message was generated
    • %m - Content of the log message
    • %n - Newline character

    The above pattern translates into:

    2011-07-19T16:03:29.055 -0700 [     30146][      13] INFO  - thredds.server.catalogservice.LocalCatalogServiceController - handleRequestInternal(): Remote host: 127.0.0.1 - Request: "GET /thredds/catalog/fmrc/NCEP/GFS/Puerto_Rico_191km/catalog.html HTTP/1.1"
    

    Another way of looking at it:

    Time and date 2011-07-19T16:03:29.055 -0700
    The number of milliseconds since the server was started [ 30146]
    Request ID [ 13]
    Logging level INFO
    Location in the TDS code where the message was generated thredds.server.catalogservice.LocalCatalogServiceController - handleRequestInternal()
    Content of the log message Remote host: 127.0.0.1 - Request: "GET /thredds/catalog/fmrc/NCEP/GFS/Puerto_Rico_191km/catalog.html HTTP/1.1"

Transaction bracketing in threddsServlet.log

  • Each HTTP request (or transaction) handled by the TDS results in at least two log messages:
    • a message indicating the start of the transaction; and
    • a message indicating the end of the transaction.
  • Between the start and end messages, there may be other entries for this same request (e.g., DEBUG, WARN, ERROR, etc.).
  • All log messages for the same request will have the same request ID.
  • Log messages from multiple requests can be interspersed in the log file.
  • When tracking down an error, it is useful to find a specific example of a request that causes that error and extract all the log messages for that request.

Start of transaction

The main message in a start of transaction log message includes:

  • IP address of the requester; and
  • request being made.

For example:

... Remote host: 128.117.140.75 - Request: "GET /thredds/catalog.html HTTP/1.1"

End of transaction

The main message in a end of transaction log message includes:

  • server response code for the response;
  • size of the response body in bytes; and
  • time (in milliseconds) to handle the request.

For example:

... Request Completed - 200 - 3403 - 1

Most end of transaction messages include a status code that matches a standard HTTP status code. However, when the client breaks the connection or a request is forwarded to another TDS internal service, one of three TDS specific status codes are used:

  • "1000 (Client Abort)" - the connection to the client was broken;
  • "1001 (Forwarded)" - the request was forwarded to another TDS internal service; or
  • "1002 (Forward Failed)" - an attempt was made but failed to forward the request to another TDS internal service.

If the request is forwarded successfully, another set of transaction messages should be initiated with the ending message containing a standard HTTP status code. If forwarding the request fails, another set of transaction message may or may not be initiated and the entry in the access log will probably either log a 404 (File Not Found) or a 500 (Internal Server Error) HTTP status code.

Looking at threddsServlet.log data

Here is an example of some log messages from a request that resulted in error messages:

2011-07-19T15:05:41.886 -0700 [    131421][      16] INFO  - thredds.server.opendap.OpendapServlet - doGet(): Remote host: 127.
0.0.1 - Request: "GET /thredds/dodsC/fsl/.scour*.html HTTP/1.1"
2011-07-19T15:05:41.981 -0700 [    131516][      16] ERROR - thredds.server.opendap.OpendapServlet - path= /fsl/.scour*.html
java.io.IOException: Cant read /data/ldm/fsl/.scour*: not a valid CDM file.
        at ucar.nc2.NetcdfFile.open(NetcdfFile.java:743)
        at ucar.nc2.NetcdfFile.open(NetcdfFile.java:387)
 ...
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)
2011-07-19T15:05:41.982 -0700 [    131517][      16] INFO  - thredds.server.opendap.OpendapServlet - Request Completed - 500 - -1 - 96

Keep in Mind

Log messages from multiple requests can be interspersed in the log file.

In the above example, note:

  • the request ID (16) matches in all messages;
  • the ERROR message contains a stack trace; and
  • the status code in the end of transaction message is 500 (Internal Server Error) .

Exercise Three: Find an example of error transition bracketing in your own threddsServlet.log.

  • Yesterday afternoon you performed an exercise specify which datasets are to be included in the generated catalogs (via the filter element in the datasetscan element). During the exercise, errors were generated and logged to the threddsServlet.log.
  1. Use the grep command to find out which threddsServlet.log file contains the error message:
  2. $ cd /home/tds/tds/apache-tomcat-6.0.32/content/thredds/logs
    $ grep ERROR threddsServlet.log*
    
  3. Open the threddsServlet.log file which contains the error message and find all the message transaction entries that correspond to that error.

Monitoring and Debugging Tools

Log analyzers

There are a variety of OpenSource and commercial log analysis packages you may wish to employ. Below are some resources to help locate a package:

Firefox Add-on: Live HTTP Headers

  • "Free" Firefox add-on
  • Live HTTP Headers is accessed through the Firefox Tools menu.
  • Host, server, and cookie information is displayed in the first tab.
  • Once it's running, it automatically updates if you move to a new page.
  • The Generator tool shows redirects, images, and CSS on a page.
  • It's easy to check and uncheck options to view.
  • Information is easily saved in text files.

Live HTTP Headers

Exercise Four: Use Live HTTP Headers.

  1. Clear any authenticated browser sessions you may have running.
  2. Open Live HTTP Headers
  3. Go to the Tomcat manager application in your browser: http://localhost:8080/manager/html/ and login.
  4. Look at the resulting output in the Live HTTP Headers window.

Tools that come with the JDK: JConsole and JVisualVM

JVM monitoring tools

  • JVisualVM - Java Virtual Machine Monitoring, Troubleshooting and Profiling Tool.
  • JConsole - Java Monitoring and Management Console.
  • Both come packaged with the JDK (found in $JAVA_HOME/bin).
  • Both GUI tools to provide information about the JVM in a visual format.
  • Used for monitoring, troubleshooting, and profiling what is happening in the JVM.
  1. Use JConsole to track the JVM running Tomcat.
  2. Invoke the jconsole program in $JAVA_HOME/bin:

    $ cd $JAVA_HOME/bin
    $ ./jconsole
    

    If you happen to be running more than one JVM on your machine, look to see what Process ID # for the JVM running Tomcat:

    $ ps -ef | grep tomcat
    tds   5168     1  5 14:01 pts/2    00:00:12 /home/tds/tds/jdk1.6.0_22/bin/java -Djava.util.logging.config.file=/home/tds/tds/apache-tomcat-6.0.32/conf/logging.properties -Xmx256m -Xms256m -server -Djava.awt.headless=true -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.endorsed.dirs=/home/tds/tds/apache-tomcat-6.0.32/endorsed -classpath :/home/tds/tds/apache-tomcat-6.0.32/bin/bootstrap.jar -Dcatalina.base=/home/tds/tds/apache-tomcat-6.0.32 -Dcatalina.home=/home/tds/tds/apache-tomcat-6.0.32 -Djava.io.tmpdir=/home/tds/tds/apache-tomcat-6.0.32/temp org.apache.catalina.startup.Bootstrap start
    thredds   5451  4285  0 14:05 pts/2    00:00:00 grep tomcat
    

    Select the local JVM process corresponding to Tomcat and connect:

    Starting JConsole

  3. Try to create a PermGen error via the Tomcat manager application.
  4. In your browser window, continually Start, Stop and reload the TDS and other web applications using the manager application.

    In the meantime, chart the Memory Pool "PS Perm Gen" in the JConsole Memory tab. Also monitor the Heap Memory Usage and Class Loading displays in the Overview tab.

  5. Try to create a PermGen error via the Tomcat manager application.
  6. Identify PermGen as it happens.
  7. The PermGen error manifests itself in the following ways:

    1. The manager application stops responding.
    2. JConsole will loose its connection with the Tomcat JVM:
    3. PermGen Error

    4. The following error is reported in catalina.out:
    5. Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: PermGen space
      
    6. In the two JConsole screen captures below, you can see the memory usage of PermGen increased with every reload of the application until it reaches the point where the JVM crashes with the OutOfMemory error. The default maximum PermGen amount of JVM is 64 MB.

      PermGen Error

      PermGen Error

  8. Please Note: you will most likely have to forcefully kill the running tomcat process, as the $CATALINA_HOME/bin/shutdown.sh will be unresponsive.
    1. Find the Tomcat process running and get the process ID number:
    2. $ ps -ef | grep tomcat
      tds   5168     1  5 14:01 pts/2    00:00:12 /home/tds/tds/jdk1.6.0_22/bin/java -Djava.util.logging.config.file=/home/tds/tds/apache-tomcat-6.0.32/conf/logging.properties -Xmx256m -Xms256m -server -Djava.awt.headless=true -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.endorsed.dirs=/home/tds/tds/apache-tomcat-6.0.32/endorsed -classpath :/home/tds/tds/apache-tomcat-6.0.32/bin/bootstrap.jar -Dcatalina.base=/home/tds/tds/apache-tomcat-6.0.32 -Dcatalina.home=/home/tds/tds/apache-tomcat-6.0.32 -Djava.io.tmpdir=/home/tds/tds/apache-tomcat-6.0.32/temp org.apache.catalina.startup.Bootstrap start
      tds   5451  4285  0 14:05 pts/2    00:00:00 grep tomcat
      
    3. Kill it!
    4. $ kill -9 5168
      $ ps -ef | grep tomcat
      tds   5694  4285  0 14:20 pts/2    00:00:00 grep tomcat
      

Exercise Five: Use JConsole to monitor the JVM during a manager-induced PermGen error.

  1. Using the steps outlined above, try to induce a PermGen error using the manager application and monitor it via JConsole.

THREDDS This document is maintained by Unidata. Send comments to THREDDS support.