Hi Rick,
I'm not much of an expert on this, but I thought I would try to
access the same data set via a couple of different clients: the ODC
and Matlab. I also ran into problems accessing the data set. In Matlab
I can access a contiguous range of the data (lat only here), from a
partial range all the way up to essentially the entire variable:
>> clear all;close all
>> loaddap('http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods?lat[1:30000]')
>> whos
  Name      Size       Bytes  Class
  lat       1x1       480248  struct array
Grand total is 60002 elements using 480248 bytes
>> lat
lat =
lat: [30000x1 double]
ISTA: [30000x1 double]
>> figure(3);plot(lat.lat,lat.ISTA,'.')
>> clear all;close all
>> loaddap('http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods?lat[1:32856]')
>> lat
lat =
lat: [32856x1 double]
ISTA: [32856x1 double]
But when I try to subset with a stride, it fails in Matlab:
>> clear all;close all
>> loaddap('http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods?lat[1:3:32856]')
>> whos
>> loaddap('http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods?lat[1:2:32856]')
>> whos
up to a certain point; with a stride of 10 it works:
>> loaddap('http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods?lat[1:10:32856]')
>> whos
  Name      Size       Bytes  Class
  lat       1x1        52824  struct array
Grand total is 6574 elements using 52824 bytes
>> lat
lat =
lat: [3286x1 double]
ISTA: [3286x1 double]
>>
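As a cross-check on the sizes Matlab reports above, here is a minimal Python sketch of the hyperslab arithmetic, assuming the usual OPeNDAP convention of zero-based indices with an inclusive stop. The function name hyperslab_count is made up for illustration and is not part of loaddap or any OPeNDAP library.

```python
def hyperslab_count(start, stop, stride=1):
    """Number of elements selected by [start:stride:stop] (stop inclusive)."""
    if stride < 1 or stop < start:
        raise ValueError("need stride >= 1 and stop >= start")
    return (stop - start) // stride + 1


# Requests from the Matlab session above
print(hyperslab_count(1, 30000))       # lat[1:30000]     -> 30000
print(hyperslab_count(1, 32856))       # lat[1:32856]     -> 32856
print(hyperslab_count(1, 32856, 10))   # lat[1:10:32856]  -> 3286
```

These match the struct sizes shown above. Under the zero-based assumption, lat[1:32856] skips element 0, so it returns 32856 of the 32857 values rather than the whole variable. By the same arithmetic the failing requests ask for 10952 elements (stride 3) and 16428 elements (stride 2), both fewer than the 32856-element request that succeeds.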
I can access subsets in the ODC as long as the two variables are each
limited to 0:9986:
http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods?ISTA[0:1:9986],lat[0:1:9986]
However
http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods?ISTA[0:1:9987],lat[0:1:9987]
fails with the following error:
Error getting data for
http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods?ISTA[0:1:9987],lat[0:1:9987]
at 120032: error making connection to
http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods.dods?ISTA[0:1:9987],lat[0:1:9987]:
scanning header: premature end of stream in header after 4 bytes read,
header: [100
]
Matlab works with the first request and fails with the second one.
Sooo... it does not seem to be a client problem, since it fails in
three very different clients: two of the three are C++ and one is
Java. It could be in the core classes used by the clients, but that is
doubtful given that we don't seem to see this problem elsewhere. My
guess is that it's in the server. Furthermore, I don't think that
Benno's server uses the core, which would rule out the core as the
cause on the server side. Finally, to the best of my knowledge, there
is no size constraint (as you suggest below) in OPeNDAP, at least not
for data sets of this size.
Another clue that might help track this down is that I can access
other subsets in the ODC, for example elements 9986:10000; i.e., the
problem in the ODC is not tied to element 9987 itself.
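To put numbers on the ODC requests above, here is a small Python sketch that counts how many elements each constraint expression asks for. The regular expression and the helper name elements_requested are illustrative assumptions and handle only the simple name[start:stride:stop] form used in this thread. The last request uses lat as a stand-in, since the variable used for the 9986:10000 subset is not stated.

```python
import re

# One projection clause such as "lat[0:1:9987]" or "lat[9986:10000]"
CLAUSE = re.compile(r"(\w+)\[(\d+):(?:(\d+):)?(\d+)\]")


def elements_requested(constraint):
    """Map each variable in a constraint expression to its element count."""
    counts = {}
    for name, start, stride, stop in CLAUSE.findall(constraint):
        step = int(stride) if stride else 1  # stride defaults to 1 if omitted
        counts[name] = (int(stop) - int(start)) // step + 1
    return counts


for ce in ("ISTA[0:1:9986],lat[0:1:9986]",   # works in the ODC
           "ISTA[0:1:9987],lat[0:1:9987]",   # fails in the ODC
           "lat[9986:10000]"):               # works (variable name assumed)
    print(ce, elements_requested(ce))
```

The working and failing requests differ by a single element per variable (9987 versus 9988), while the 15-element request that spans index 9987 succeeds, which is consistent with the failure depending on the response as a whole rather than on that particular element.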
Peter
On Mar 10, 2006, at 11:30 AM, Rick Grubin wrote:
I have an OPeNDAP-enabled client (NCL http://www.ncl.ucar.edu) that
is trying to read the following file:
http://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods
The client successfully accesses the OPeNDAP server and retrieves the
DDS and DAS information, but fails when retrieving variable data:
ncvarget: Error in getting the data: Unknown error 1001
This particular data file has one dimension of size 32857 (the
number of stations). The two variables contained within the file
(ISTA, lat) are both dimensioned with this value. Interestingly,
dncdump can access/read the file without difficulty.
I'm successful when accessing/reading a subset of the larger file
(sorry for the weird URL):
http://iridl.ldeo.columbia.edu/expert/SOURCES/.NOAA/.NCDC/.GDCN/lat%286.666659%29%2836.66666%29masknotrange/SELECT/lat/%28-40.08333%29%28-11.66666%29masknotrange/SELECT/.lat/dods
In the course of debugging my OPeNDAP-enabled client, I find that the
error noted above occurs in the following calling sequence:
ncvarget
  ---> nc_get_vara
  ---> DODvario
  ---> Connect::request_data
  ---> HTTPConnect::fetch_url
  ---> HTTPConnect::plain_fetch_url
  ---> HTTPConnect::read_url
The url being read at the time of failure is:
http://http://iridl.ldeo.columbia.edu/expert/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods.dods?ISTA%5b0%3a1%3a32856%5d
It appears that the OPeNDAP library code is attempting to read all
32857 values for the variable ISTA (0 --> 32856) at once; this
generates the error noted above.
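For anyone decoding the URL above: the query string is just the percent-encoded form of a constraint expression. A minimal Python check using only the standard library (nothing here contacts the server):

```python
from urllib.parse import quote, unquote

encoded = "ISTA%5b0%3a1%3a32856%5d"   # query string from the failing request
decoded = unquote(encoded)
print(decoded)                         # ISTA[0:1:32856]

# Re-encoding gives the same escapes; quote() emits upper-case hex digits,
# so compare case-insensitively.
assert quote(decoded, safe="").lower() == encoded.lower()
```

So the failing request is for ISTA[0:1:32856], i.e. every element of the variable in a single response.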
Interestingly, dncdump successfully reads the entire file (first URL
noted above) every time. Note that dncdump does not try to read all
values for the variable ISTA at once; rather, it breaks up its
retrieval into 1000-element chunks. As an example, here's a URL that
dncdump is using (at the same end point (HTTPConnect::read_url) in
the calling sequence noted above):
http://http://iridl.ldeo.columbia.edu/expert/SOURCES/.NOAA/.NCDC/.GDCN/.lat/dods.dods?ISTA%5b2000%3a1%3a2999%5d
(reading elements 2000 -- 2999)
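The chunked retrieval described above can be sketched as follows. This is only an illustration of the idea in Python, not dncdump's actual code; the function name chunked_constraints and its parameters are assumptions based on the 1000-element requests observed above.

```python
def chunked_constraints(var, total, chunk=1000):
    """Yield constraint expressions covering indices 0..total-1 in chunks."""
    for start in range(0, total, chunk):
        stop = min(start + chunk, total) - 1   # stop index is inclusive
        yield f"{var}[{start}:1:{stop}]"


requests = list(chunked_constraints("ISTA", 32857))
print(len(requests))    # 33 requests in total
print(requests[2])      # ISTA[2000:1:2999]
print(requests[-1])     # ISTA[32000:1:32856]
```

The third request is the one shown in the URL above, and the final request is shorter because 32857 is not a multiple of 1000.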
It would seem that there's an upper limit on how much data can be
retrieved at one time via the OPeNDAP libraries. True? I base this
assumption on the fact that an OPeNDAP-enabled client can't read a
relatively large chunk of data at once, whereas dncdump, because it
breaks up the retrieval of the data into "gulps" (1000 elements at a
time), always succeeds.
Note that this error is consistently repeatable; I can reproduce it
every time I run my client.
-Rick.
----
Rick Grubin NCAR/CISL/SCD/VETS
Visualization + Enabling Technologies
grubin@xxxxxxxx 303.497.1832
--
Peter Cornillon
Graduate School of Oceanography - Telephone: (401) 874-6283
University of Rhode Island - Fax: (401) 874-6728
Narragansett, RI 02882 - E-mail: pcornillon@xxxxxxxxxxx