HTTP Processing

The Netcdf-Java library uses the Apache HttpClient 3.1 library for almost all HTTP processing (a small number of utilities use the java.net package). HttpClient is used to access OPeNDAP datasets, for WCS, WMS, and CdmRemote access, and to open remote files served over HTTP. The Netcdf-Java library provides ucar.nc2.util.net.HttpClientManager to manage HttpClient settings. Future versions of the Netcdf-Java library will use HttpComponents, commonly called the httpclient-4 library, which is the successor to the HttpClient 3 library. The HttpClientManager API will be kept as backwards compatible as possible.

The default HttpClient settings are sufficient unless you need HTTP authentication to access restricted datasets.

Setting a Proxy Server

To use a proxy server, set the System properties http.proxyHost and http.proxyPort before making any calls to the Netcdf-Java library. One way to do this is to set them on the command line:

java -Dhttp.proxyHost=hostname -Dhttp.proxyPort=80 -classpath ...
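The same properties can also be set from code, provided this happens before the first call into the Netcdf-Java library. The following is a minimal sketch using only the JDK; the class name ProxySetup is hypothetical, and the host name and port are placeholder values.

```java
class ProxySetup {
    /** Sets the standard Java proxy properties; call this before any Netcdf-Java code runs. */
    static void configureProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }

    public static void main(String[] args) {
        configureProxy("proxy.example.org", 8080); // placeholder values
        System.out.println(System.getProperty("http.proxyHost") + ":"
                + System.getProperty("http.proxyPort"));
        // ... open datasets with the Netcdf-Java library after this point
    }
}
```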

HTTP Authentication

Overview

When dataset access must be restricted to authorized users, the server issues an HTTP authentication challenge. We call such datasets restricted datasets. The HttpClient library handles the details of the HTTP protocol, but the application layer is responsible for supplying the credentials that authenticate the user. If you want to access restricted datasets with the Netcdf-Java library, you must plug in a CredentialsProvider object (see below).

Authentication

Authentication means establishing the identity of a user. In most cases, this is done with a user name and password. A stronger way to do this is to use digital signatures with client certificates, which is a fairly complex process. The Netcdf-Java library supports HTTP Basic and Digest Authentication, with or without Secure Socket Layer (SSL) encryption. It also supports client-side keys with SSL encryption. It's up to the server to decide which kind of HTTP authentication is needed.
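To see why Basic authentication is usually combined with SSL, it helps to look at what is actually sent. In the Basic scheme the client sends the user name and password joined by a colon and Base64-encoded, which is an encoding, not encryption. The sketch below builds such a header value by hand using only the JDK. HttpClient does this work internally, so the class BasicHeaderDemo is purely illustrative, and the credentials are the well-known example pair from the HTTP authentication RFC.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class BasicHeaderDemo {
    /** Builds the value of an Authorization header for the HTTP Basic scheme. */
    static String basicHeaderValue(String user, String password) {
        String pair = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }

    public static void main(String[] args) {
        // prints "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=="
        System.out.println(basicHeaderValue("Aladdin", "open sesame"));
    }
}
```

Anyone who can read the request can decode this value, which is why Basic authentication without SSL offers little protection for the password.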

If you are writing an interactive client application, you might prompt the user for the user name and password. A non-interactive application needs to have some kind of a lookup table or database to supply the information.

As of version 4.3, the Netcdf-Java client-side code manages such a database for you. The key idea is that a single, global database of credentials is maintained. The key for the database is the combination of an authorization scheme and a url. This key pair maps to an instance of a CredentialsProvider (typically provided by the user). When an HTTP method (Get, Put, etc.) is executed, the url determines whether authorization applies (if the server requests it), and the scheme indicates the kind of authorization being used, for example HTTP Basic or Digest. The credentials provider is then invoked to compute the set of credentials to be sent to the server.

Currently the following schemes are supported.

See the file Session.html for details.

Authorization

Having established a user's identity, authorization is the process of deciding if that user has the right to access a particular dataset. Most servers, including the THREDDS Data Server (TDS), use role-based authorization. When a user is logged into a particular server, access is granted based on what roles the user has been given by that server. The practical effect of this is that if the user doesn't have access rights to a dataset, they are not prompted to enter a different username/password. They have to log out and log in as a different user.

Sessions

As of 4.3, the notion of a session has changed. A session is encapsulated in an instance of the class HTTPSession. The encapsulation is with respect to a specific url. This means that once a session is specified, it is tied permanently to that url and "compatible" urls.

The session url in effect defines the role of the client. It can specify a host, port, user, and a path. Entries in the client-side authorization database are keyed on the scheme (BASIC, DIGEST, etc) and the url. The url may specify wildcards by omitting various elements of the url.
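The idea of a table keyed on scheme and url, where omitted url elements act as wildcards, can be sketched as follows. This is not the HTTPSession implementation; it is an illustration written against the JDK only. The class AuthTableSketch and its methods are hypothetical, plain strings stand in for the scheme and for the CredentialsProvider, and the user element of the url is left out for brevity.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class AuthTableSketch {
    /** One entry: a scheme name, a url pattern, and a label standing in for a provider. */
    static final class Entry {
        final String scheme;
        final URI pattern;
        final String provider;

        Entry(String scheme, String url, String provider) {
            this.scheme = scheme;
            this.pattern = URI.create(url);
            this.provider = provider;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void put(String scheme, String urlPattern, String provider) {
        entries.add(new Entry(scheme, urlPattern, provider));
    }

    /** An element omitted from the pattern (port or path) matches any value in the url. */
    static boolean matches(URI pattern, URI url) {
        if (pattern.getHost() != null && !pattern.getHost().equalsIgnoreCase(url.getHost())) {
            return false;
        }
        if (pattern.getPort() != -1 && pattern.getPort() != url.getPort()) {
            return false;
        }
        String path = pattern.getPath();
        if (path != null && !path.isEmpty()) {
            String urlPath = url.getPath();
            // Simple prefix test; a real implementation might compare path segments.
            if (urlPath == null || !urlPath.startsWith(path)) {
                return false;
            }
        }
        return true;
    }

    /** Returns the first provider whose scheme and url pattern match, or null if none does. */
    String lookup(String scheme, String url) {
        URI target = URI.create(url);
        for (Entry e : entries) {
            if (e.scheme.equals(scheme) && matches(e.pattern, target)) {
                return e.provider;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        AuthTableSketch table = new AuthTableSketch();
        table.put("BASIC", "http://hostx.org", "providerA");
        table.put("DIGEST", "http://hostx.org:8080/thredds", "providerB");
        System.out.println(table.lookup("BASIC", "http://hostx.org/thredds/dodsC/data.nc"));
        System.out.println(table.lookup("DIGEST", "http://hostx.org:8080/thredds/catalog.xml"));
        System.out.println(table.lookup("DIGEST", "http://hostx.org/other"));
    }
}
```

In this sketch the first entry omits both port and path, so it applies to every url on hostx.org, while the second entry applies only to port 8080 and paths under /thredds.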

Servers that don't use sessions or other methods may require that the username/password be sent with every request. This is handled automatically because HTTPSession properly caches previously constructed credentials.

Plugging in a CredentialsProvider

To access restricted datasets with the Netcdf-Java library, you must plug in a CredentialsProvider that implements the org.apache.commons.httpclient.auth.CredentialsProvider interface, which has one method:

public Credentials getCredentials(AuthScheme scheme,
                           String host,
                           int port,
                           boolean proxy)
                           throws CredentialsNotAvailableException

You can write your own, or, for GUI programs, use the thredds.ui.UrlAuthenticatorDialog class, which pops up a dialog box, similar to how Firefox and other browsers work.
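As an illustration of writing your own provider for a non-interactive application, the following sketch answers challenges from an in-memory lookup table. It assumes the Apache HttpClient 3.1 jar is on the classpath. The class name TableCredentialsProvider, its add method, and the "host:port" key format are hypothetical choices made for this example, not part of either library.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.commons.httpclient.Credentials;
import org.apache.commons.httpclient.UsernamePasswordCredentials;
import org.apache.commons.httpclient.auth.AuthScheme;
import org.apache.commons.httpclient.auth.CredentialsNotAvailableException;
import org.apache.commons.httpclient.auth.CredentialsProvider;

/** Hypothetical non-interactive provider backed by an in-memory lookup table. */
public class TableCredentialsProvider implements CredentialsProvider {
    // Maps "host:port" to the credentials to present to that server.
    private final Map<String, Credentials> table = new HashMap<String, Credentials>();

    /** Registers a user name and password for one server. */
    public void add(String host, int port, String user, String password) {
        table.put(host + ":" + port, new UsernamePasswordCredentials(user, password));
    }

    /** Called by HttpClient when a server (or proxy) issues an authentication challenge. */
    public Credentials getCredentials(AuthScheme scheme, String host, int port, boolean proxy)
            throws CredentialsNotAvailableException {
        Credentials creds = table.get(host + ":" + port);
        if (creds == null) {
            throw new CredentialsNotAvailableException(
                    "No credentials for " + host + ":" + port);
        }
        return creds;
    }
}
```

This sketch ignores the scheme and proxy arguments; a fuller provider might use them to choose between different credentials for the same host.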

You can set the credentials provider globally, or for a specific situation. Globally means that the credentials provider will be used for all instances of HTTPSession. An example of this would be as follows.

CredentialsProvider provider = new thredds.ui.UrlAuthenticatorDialog(frame);
HTTPSession.setGlobalCredentialsProvider(provider);

Alternatively, you can specify a scheme and a url that define when the provider should be used.

HTTPSession.setGlobalCredentialsProvider(HTTPAuthScheme.BASIC,"http://hostx.org",provider);

In this case, the provider is used only when BASIC authorization is being used, and only for servers on the machine "hostx.org".

Initializing

The Netcdf-Java class ucar.nc2.util.net.HttpClientManager is a static class that wraps HTTPSession and provides a number of methods that simplify common operations. You must first initialize it using the static init method.

static public void init(CredentialsProvider provider, String userAgent);

The userAgent should be the name of your application. It is added to the HTTP User-Agent header, which allows servers to track which applications are accessing them.

After initialization, you may invoke any of the HttpClientManager static methods to perform common tasks. Most of these methods have an instance of HTTPSession as an argument, but it is optional. If the value null is provided, the HttpClientManager will create and destroy its own instances.

If you have ideas for additional common operations for the HttpClientManager interface, feel free to send them to Unidata support.


This document is maintained by John Caron and was last updated Oct. 15, 2011