Re: [thredds] syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET

  • To: Nikhil Garg <nikhilgarg.gju@xxxxxxxxx>
  • Subject: Re: [thredds] syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET
  • From: Sean Arms <sarms@xxxxxxxx>
  • Date: Tue, 5 May 2020 07:20:53 -0600
Greetings Nikhil,

In your example output, we see the following message from the server:

HTTP Status 400 – Bad Request: Invalid character found in the request
target. The valid characters are defined in RFC 7230 and RFC 3986

This is generated by Tomcat, most likely because the request
contains unencoded square brackets. Using curl can simplify the
troubleshooting process a bit. For example:

curl -i 'http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc.ascii?time'

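will likely work just fine. Now, if we make a request that slices the
time variable, like so:

curl -g -i 'http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc.ascii?time[0]'

this will fail. (The -g flag and the quotes only stop curl and the
shell from interpreting the brackets themselves, so that the brackets
reach Tomcat unencoded.)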

The issue here is that starting with Tomcat v9.0.8 (and also v8.5.31,
v8.0.52 and v7.0.87), the presence of certain characters in the query
portion of a URI (everything after the question mark) will cause
Tomcat to reject the request outright, and the underlying application
will never see it. Those characters are:

< > [ \ ] ^ ` { | }
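
As a quick client-side check, the Python sketch below reports which of
the characters listed above appear unencoded in the query part of a
URL. The function name and the example host are hypothetical, and the
character set is simply copied from the list above, not read from
Tomcat itself.

```python
# Characters Tomcat rejects in the query string, copied from the list above.
REJECTED_QUERY_CHARS = set('<>[\\]^`{|}')


def rejected_chars(url):
    """Return the sorted rejected characters found after the first '?'."""
    _, _, query = url.partition("?")
    return sorted(set(query) & REJECTED_QUERY_CHARS)


# Hypothetical URLs, used only for illustration.
print(rejected_chars("http://example.invalid/thredds/dodsC/data.nc.ascii?time[0]"))
print(rejected_chars("http://example.invalid/thredds/dodsC/data.nc.ascii?time%5B0%5D"))
```

The first call reports both square brackets; the second reports
nothing, because the brackets are already percent-encoded.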

In this case, it's the square brackets, and any time a client makes a
subset request to OPeNDAP, square brackets are involved at some level.
It turns out that square brackets were never "legal" in the query part
of a URI. Tomcat (and many other servers) historically allowed them to
pass through to the underlying application (in this case, the TDS),
but that's no longer the case. If you want to make a request with
those characters, you have to percent-encode them first: [ becomes
%5B, and ] becomes %5D. If you replace the brackets with their encoded
values and run the curl command again, I'm guessing things will work
as expected:

curl -i 'http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc.ascii?time%5B0%5D'
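
If you are building request URLs in a script, the encoding step can be
done with Python's standard library. This is only a minimal sketch:
the helper name encode_constraint is hypothetical, and the choice to
leave ",", ":", "&" and "=" unencoded is my assumption about what an
OPeNDAP constraint expression needs, not something taken from the TDS
documentation.

```python
from urllib.parse import quote


def encode_constraint(dataset_url, constraint):
    """Append a percent-encoded OPeNDAP constraint expression to a URL.

    Characters listed in `safe` are left as-is (an assumption, see above);
    square brackets are not in that list, so they become %5B and %5D.
    """
    return dataset_url + "?" + quote(constraint, safe=",:&=")


base = ("http://138.194.55.191/thredds/dodsC/worldclim/historical/"
        "2.5m/prec/wc2.1_2.5m_prec.nc.ascii")
print(encode_constraint(base, "time[0]"))
```

The printed URL ends in ?time%5B0%5D, which matches the hand-encoded
request above.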

Given all of that, there are basically two paths forward to get you up
and running.

1. Allow Tomcat to let plain square brackets through.

This is done on the server side, and tells Tomcat that square
brackets are OK. At that point, the TDS will actually see the request
and have a chance to return data. HOWEVER, there are reasons Tomcat
chose to start blocking square brackets by default. Check out the
relaxedQueryChars attribute at
https://tomcat.apache.org/tomcat-9.0-doc/config/http.html. Some people
view this as a security issue, and while I don't fully understand the
details, I come down on the side of caution and would not recommend it
in the general case. If you decide to go this route, research it well
and know the risks; it's all going to be highly dependent on the
environment in which you are running Tomcat (internal trusted clients
only, which sounds like the case here, vs. a server exposed to the
world, for example).

2. Percent-encode the query part of a URL before making a request.

It looks like NCO (or a library it uses) is making unencoded requests.
I'm guessing NCO is relying on the netCDF-C library to make OPeNDAP
requests. I believe a fix was introduced in netCDF-C last summer
(https://github.com/Unidata/netcdf-c/pull/1439), so this may be
something that can be fixed by rebuilding NCO with a more recent
version of netCDF-C. Perhaps this version dependency is why NCO on one
machine can make successful requests, while another cannot?

Cheers,

Sean

Sean Arms, PhD
Software Engineer
UCAR/UCP/Unidata
https://staff.ucar.edu/users/sarms


On Mon, May 4, 2020 at 5:51 PM Nikhil Garg <nikhilgarg.gju@xxxxxxxxx> wrote:
>
> Hi,
>
> I am setting up a thredds data server on a local machine to serve some data 
> for private use within our research group. I have used the docker image 
> unidata/thredds-docker:latest to install and setup the server. I have also 
> managed to create and add a catalog. After setting up the server, I have 
> noticed a peculiar behaviour. If I use tools like nco etc. on the machine 
> where I have installed a thredds server, I am able to access the data. 
> However, when I try to access the data from a different machine, I get a 
> syntax error.
>
> I have tested to see if the file is accessible from the thredds server. 
> Below, you can see the output from ncdump.
>
> ncdump -t -v time http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc
> netcdf wc2.1_2.5m_prec {
> dimensions:
> time = UNLIMITED ; // (12 currently)
> lat = 4320 ;
> lon = 8640 ;
> variables:
> double time(time) ;
> time:standard_name = "time" ;
> time:long_name = "time" ;
> time:units = "months since 1985-01-01" ;
> time:calendar = "standard" ;
> time:axis = "T" ;
> time:_ChunkSizes = 512 ; // "2027-09-01"
> double lon(lon) ;
> lon:standard_name = "longitude" ;
> lon:long_name = "longitude" ;
> lon:units = "degrees_east" ;
> lon:axis = "X" ;
> double lat(lat) ;
> lat:standard_name = "latitude" ;
> lat:long_name = "latitude" ;
> lat:units = "degrees_north" ;
> lat:axis = "Y" ;
> float prec(time, lat, lon) ;
> prec:units = "mm" ;
> prec:_FillValue = -32768.f ;
> prec:missing_value = -32768.f ;
> prec:long_name = "precipitation" ;
> prec:_ChunkSizes = 1, 7, 8640 ;
>
> // global attributes:
> :CDI = "Climate Data Interface version 1.9.3 (http://mpimet.mpg.de/cdi)" ;
> :history = "Thu Apr 30 17:43:57 2020: GDAL CreateCopy( 
> ./wc2.1_2.5m_prec_01.nc, ... )" ;
> :GDAL_AREA_OR_POINT = "Area" ;
> :GDAL = "GDAL 2.1.3, released 2017/20/01" ;
> :CDO = "Climate Data Operators version 1.9.3 (http://mpimet.mpg.de/cdo)" ;
> :Conventions = "CF-1.5" ;
> :DODS_EXTRA.Unlimited_Dimension = "time" ;
> data:
>
>  time = "1985-01-01", "1985-02-01", "1985-03-01", "1985-04-01", "1985-05-01",
>     "1985-06-01", "1985-07-01", "1985-08-01", "1985-09-01", "1985-10-01",
>     "1985-11-01", "1985-12-01" ;
> }
>
> When I try to access the file using ncks e.g.,
>
> ncks -d time,0,1,1 -v prec http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc temp.nc
>
> I get the following error
>
> syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or 
> SCAN_ERROR
> context: <!doctype^ html><html lang="en"><head><title>HTTP Status 400 – Bad 
> Request</title><style type="text/css">body 
> {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b 
> {color:white;background-color:#525D76;} h1 {font-size:22px;} h2 
> {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;} 
> .line 
> {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP
>  Status 400 – Bad Request</h1><hr class="line" /><p><b>Type</b> Exception 
> Report</p><p><b>Message</b> Invalid character found in the request target. 
> The valid characters are defined in RFC 7230 and RFC 
> 3986</p><p><b>Description</b> The server cannot or will not process the 
> request due to something that is perceived to be a client error (e.g., 
> malformed request syntax, invalid request message framing, or deceptive 
> request 
> routing).</p><p><b>Exception</b></p><pre>java.lang.IllegalArgumentException: 
> Invalid character found in the request target. The valid characters are 
> defined in RFC 7230 and RFC 3986 
> org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:502)
>  org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:502) 
> org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
>  
> org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:818)
>  
> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1623)
>  
> org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
>  
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  
> org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
>  java.lang.Thread.run(Thread.java:748)</pre><p><b>Note</b> The full stack 
> trace of the root cause is available in the server logs.</p><hr class="line" 
> /><h3>Apache Tomcat</h3></body></html>
>
> I am not sure what I am missing here. I would appreciate some help in getting 
> this sorted.
>
> --
> Regards
>
> Nikhil

