Re: [thredds] syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET

  • To: Sean Arms <sarms@xxxxxxxx>
  • Subject: Re: [thredds] syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET
  • From: Nikhil Garg <nikhilgarg.gju@xxxxxxxxx>
  • Date: Thu, 7 May 2020 09:40:06 +1000
Hi Sean,

Thanks for the detailed reply. I fixed the issue using the first suggested
solution. The server will be visible on an internal secure network, so I
am not too worried about the security implications.

Cheers,
-nikhil

On Tue, 5 May 2020 at 23:21, Sean Arms <sarms@xxxxxxxx> wrote:

> Greetings Nikhil,
>
> In your example output, we see the following message from the server:
>
> HTTP Status 400 – Bad Request: Invalid character found in the request
> target. The valid characters are defined in RFC 7230 and RFC 3986
>
> This is generated by Tomcat, most likely because the request
> contains unencoded square brackets. Using curl can simplify the
> troubleshooting process a bit. For example:
>
> curl -i "http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc.ascii?time"
>
> will likely work just fine. Now, if we make a request that slices the
> time variable, like so (the -g option stops curl from treating the
> brackets as its own range syntax, so they are sent as-is):
>
> curl -g -i "http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc.ascii?time[0]"
>
> this will fail.
>
> The issue here is that starting with Tomcat v9.0.8 (and also v8.5.31,
> v8.0.52 and v7.0.87), the presence of certain unencoded characters in
> the query portion of a URI (everything after the question mark) will
> cause Tomcat to reject the request outright, and the underlying
> application will never see it. Those characters are:
>
> < > [ \ ] ^ ` { | }
>
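As a rough illustration of the list above, the following Python sketch reports which of those characters appear unencoded in the query part of a URL. The function name and the example.invalid host are made up for this note, and the character set is copied from the list above rather than taken from Tomcat's source.

```python
# Characters the message above lists as rejected when they appear
# unencoded in the query part of a request (copied from that list).
REJECTED = set('<>[\\]^`{|}')


def rejected_chars(url):
    """Return the listed characters found after the first '?',
    in order of first appearance (hypothetical helper)."""
    _, _, query = url.partition("?")
    found = []
    for ch in query:
        if ch in REJECTED and ch not in found:
            found.append(ch)
    return found


if __name__ == "__main__":
    # example.invalid is a placeholder host, not a real server.
    print(rejected_chars("http://example.invalid/thredds/dodsC/data.nc.ascii?time[0]"))
```

A request whose query yields an empty list here would not trip this particular check, although the server may still reject it for other reasons.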
> In this case, it's the square brackets, and any time a client makes a
> subset request to OPeNDAP, square brackets are involved at some level.
> It turns out that square brackets were never considered "legal" for
> use in the query part of a URI. Tomcat (like many others) has
> historically allowed them to pass through to the underlying
> application (in this case, the TDS), but that's no longer the case. If
> you want to make a request with those characters, you have to
> percent-encode them first: [ becomes %5B, and ] becomes %5D. If you
> replace the brackets with their encoded values and run the curl
> command again, I'm guessing things will work as expected:
>
> curl -i "http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc.ascii?time%5B0%5D"
>
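The encoding step described above can be sketched in Python with the standard library's urllib.parse.quote. The helper names are hypothetical, and leaving : , & = unencoded is an assumption made to keep constraint expressions readable, not something the thread prescribes.

```python
from urllib.parse import quote


def encode_constraint(constraint):
    """Percent-encode an OPeNDAP constraint expression.

    Square brackets become %5B and %5D. The characters in `safe`
    are left alone (an assumption of this sketch).
    """
    return quote(constraint, safe=":,&=")


def opendap_url(dataset_url, suffix, constraint):
    """Join dataset URL, response suffix (e.g. '.ascii') and the
    encoded constraint into one request URL."""
    return f"{dataset_url}{suffix}?{encode_constraint(constraint)}"


if __name__ == "__main__":
    base = ("http://138.194.55.191/thredds/dodsC/worldclim/"
            "historical/2.5m/prec/wc2.1_2.5m_prec.nc")
    print(opendap_url(base, ".ascii", "time[0]"))
```

Encoding only the constraint, rather than the whole URL, leaves the scheme, host and path untouched.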
> Given all of that, there are basically two paths forward to get you up
> and running.
>
> 1. Allow Tomcat to let plain square brackets through.
>
> This is done on the server side, and tells Tomcat that square
> brackets are OK. At that point, the TDS will actually see the request
> and have a chance to return data. HOWEVER, there are reasons Tomcat
> chose to start blocking square brackets by default. Check out
> relaxedQueryChars at
> https://tomcat.apache.org/tomcat-9.0-doc/config/http.html. Some people
> view this as a security issue, and while I don't fully understand it,
> I come down on the side of caution and would not recommend it in the
> general case. If you decide to go this route, research it well and
> know the risks. It's all going to be highly dependent on the
> environment in which you are running Tomcat (internal trusted clients
> only, which sounds like the case here, vs. a server exposed to the
> world, for example).
>
> 2. Percent encode the query parts of a URL before making a request.
>
> It looks like NCO (or a library it uses) is making unencoded requests.
> I'm guessing NCO is relying on the netCDF-C library to make OPeNDAP
> requests. I believe a fix was introduced in netCDF-C last summer
> (https://github.com/Unidata/netcdf-c/pull/1439), so this may be
> something that can be fixed by rebuilding NCO with a more recent
> version of netCDF-C. Perhaps this version dependency is why NCO on one
> machine can make successful requests, while another cannot?
>
> Cheers,
>
> Sean
>
> Sean Arms, PhD
> Software Engineer
> UCAR/UCP/Unidata
> https://staff.ucar.edu/users/sarms
>
>
> On Mon, May 4, 2020 at 5:51 PM Nikhil Garg <nikhilgarg.gju@xxxxxxxxx>
> wrote:
> >
> > Hi,
> >
> > I am setting up a THREDDS Data Server on a local machine to serve some
> > data for private use within our research group. I have used the Docker
> > image unidata/thredds-docker:latest to install and set up the server. I
> > have also managed to create and add a catalog. After setting up the
> > server, I have noticed a peculiar behaviour. If I use tools like NCO
> > on the machine where I have installed the THREDDS server, I am able to
> > access the data. However, when I try to access the data from a
> > different machine, I get a syntax error.
> >
> > I have tested to see if the file is accessible from the THREDDS server.
> > Below, you can see the output from ncdump.
> >
> > ncdump -t -v time http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc
> > netcdf wc2.1_2.5m_prec {
> > dimensions:
> >     time = UNLIMITED ; // (12 currently)
> >     lat = 4320 ;
> >     lon = 8640 ;
> > variables:
> >     double time(time) ;
> >         time:standard_name = "time" ;
> >         time:long_name = "time" ;
> >         time:units = "months since 1985-01-01" ;
> >         time:calendar = "standard" ;
> >         time:axis = "T" ;
> >         time:_ChunkSizes = 512 ; // "2027-09-01"
> >     double lon(lon) ;
> >         lon:standard_name = "longitude" ;
> >         lon:long_name = "longitude" ;
> >         lon:units = "degrees_east" ;
> >         lon:axis = "X" ;
> >     double lat(lat) ;
> >         lat:standard_name = "latitude" ;
> >         lat:long_name = "latitude" ;
> >         lat:units = "degrees_north" ;
> >         lat:axis = "Y" ;
> >     float prec(time, lat, lon) ;
> >         prec:units = "mm" ;
> >         prec:_FillValue = -32768.f ;
> >         prec:missing_value = -32768.f ;
> >         prec:long_name = "precipitation" ;
> >         prec:_ChunkSizes = 1, 7, 8640 ;
> >
> > // global attributes:
> >         :CDI = "Climate Data Interface version 1.9.3 (http://mpimet.mpg.de/cdi)" ;
> >         :history = "Thu Apr 30 17:43:57 2020: GDAL CreateCopy( ./wc2.1_2.5m_prec_01.nc, ... )" ;
> >         :GDAL_AREA_OR_POINT = "Area" ;
> >         :GDAL = "GDAL 2.1.3, released 2017/20/01" ;
> >         :CDO = "Climate Data Operators version 1.9.3 (http://mpimet.mpg.de/cdo)" ;
> >         :Conventions = "CF-1.5" ;
> >         :DODS_EXTRA.Unlimited_Dimension = "time" ;
> > data:
> >
> >  time = "1985-01-01", "1985-02-01", "1985-03-01", "1985-04-01", "1985-05-01",
> >     "1985-06-01", "1985-07-01", "1985-08-01", "1985-09-01", "1985-10-01",
> >     "1985-11-01", "1985-12-01" ;
> > }
> >
> > When I try to access the file using ncks, e.g.,
> >
> > ncks -d time,0,1,1 -v prec http://138.194.55.191/thredds/dodsC/worldclim/historical/2.5m/prec/wc2.1_2.5m_prec.nc temp.nc
> >
> > I get the following error:
> >
> > syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
> > context: <!doctype^ html><html lang="en"><head><title>HTTP Status 400 –
> > Bad Request</title><style type="text/css">body
> > {font-family:Tahoma,Arial,sans-serif;} h1, h2, h3, b
> > {color:white;background-color:#525D76;} h1 {font-size:22px;} h2
> > {font-size:16px;} h3 {font-size:14px;} p {font-size:12px;} a {color:black;}
> > .line
> > {height:1px;background-color:#525D76;border:none;}</style></head><body><h1>HTTP
> > Status 400 – Bad Request</h1><hr class="line" /><p><b>Type</b> Exception
> > Report</p><p><b>Message</b> Invalid character found in the request target.
> > The valid characters are defined in RFC 7230 and RFC
> > 3986</p><p><b>Description</b> The server cannot or will not process the
> > request due to something that is perceived to be a client error (e.g.,
> > malformed request syntax, invalid request message framing, or deceptive
> > request
> > routing).</p><p><b>Exception</b></p><pre>java.lang.IllegalArgumentException:
> > Invalid character found in the request target. The valid characters are
> > defined in RFC 7230 and RFC 3986
> > org.apache.coyote.http11.Http11InputBuffer.parseRequestLine(Http11InputBuffer.java:502)
> > org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:502)
> > org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:65)
> > org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:818)
> > org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1623)
> > org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
> > java.lang.Thread.run(Thread.java:748)</pre><p><b>Note</b> The full stack
> > trace of the root cause is available in the server logs.</p><hr
> > class="line" /><h3>Apache Tomcat</h3></body></html>
> >
> > I am not sure what I am missing here. I would appreciate some help in
> > getting this sorted.
> >
> > --
> > Regards
> >
> > Nikhil