
[THREDDS #OWH-501524]: Bug between Matlab and Thredds for ncfiles with many variables



It turns out that your example demonstrates
multiple problems.
Since you are using matlab, you are indirectly
using the netcdf-c library.
Some experimentation shows that your example
surfaced at least the following problems in that library.
1. A piece of code that parses the DAS turns out to
   have an O(n-cubed) running time.
2. The netcdf-c library by default attempts to prefetch
   all variables under a certain size and in a single request
   to the server. The reason is that these tend to be coordinate
   variables (== Grid maps). It does this by creating a request
   of the form "http://...?v1,v2,...vn", where the vi are the names
   of the variables to prefetch.  If the set of vi is very long
   (somewhere over 200 variables), then the request gets rejected
   by the server. Whether it is the thredds server or Apache or Tomcat
   that is rejecting the request, I cannot yet tell.
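
To illustrate problem #2, here is a minimal Python sketch of how a
prefetch request of the form "http://...?v1,v2,...vn" grows with the
number of variables. The function name prefetch_url and the variable
names (var0000, var0001, ...) are made up for illustration; the real
names in test3000.nc and the exact code in netcdf-c will differ.

```python
def prefetch_url(base_url, names):
    """Build a single request URL listing every variable after '?'."""
    return base_url + "?" + ",".join(names)


BASE = "http://opendap.deltares.nl/opendap/test/testNcWithManyVariables/test3000.nc"

# Hypothetical variable names; the actual names in the file may differ.
names = [f"var{i:04d}" for i in range(3000)]

# Print the total URL length for a few variable counts.
for count in (10, 200, 3000):
    url = prefetch_url(BASE, names[:count])
    print(count, "variables ->", len(url), "characters")
```

The length grows linearly with the number of variables, so a file
with thousands of small variables produces a very long request line.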

For #1, I am changing this to be O(n-squared), which appears to be adequate
for your 3000 variable (and 3000 attribute) example. If the problem continues,
then I will have to rethink this again. Of course, my fix will do you
no good unless you can rebuild the netcdf-c library and get matlab to use it
(sorry).
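
As a rough sense of scale for #1, the following Python lines compare
the two growth rates for n = 3000. This only counts abstract steps and
ignores constant factors; it is not a measurement of the actual
parsing code.

```python
n = 3000  # number of variables in the example file

cubic = n ** 3      # step count if the work grows as n-cubed
quadratic = n ** 2  # step count if the work grows as n-squared

print(f"n^3 = {cubic:,}")
print(f"n^2 = {quadratic:,}")
print(f"ratio = {cubic // quadratic}")
```

For n = 3000 this is 27,000,000,000 versus 9,000,000 steps, a factor
of 3000.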

Problem #2 is difficult. I do not know if the request limit is
specific to that server or to thredds generically. I will have
to do some testing to see. Frankly, I do not have a good idea
how to fix this at the moment.
There is a possible workaround, but it may not work with matlab.
If you change the url you are using
(http://opendap.deltares.nl/opendap/test/testNcWithManyVariables/test3000.nc)
by prefixing it with "[noprefetch]" so you get
"[noprefetch]http://opendap.deltares.nl/opendap/test/testNcWithManyVariables/test3000.nc",
then no prefetching will be done. The price is that small variables will
be fetched individually as needed and so there will be a performance penalty.
The size of the penalty depends on your access pattern.
WARNING: matlab may not take the above "[noprefetch]http://..."
form since it does not look like a normal url. If it does not take it,
then try this alternate form:
http://opendap.deltares.nl/opendap/test/testNcWithManyVariables/test3000.nc#noprefetch
This may work if the version of the netcdf library used by matlab
is sufficiently recent.
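
The two url forms above can be built with a couple of small helpers.
This is only a Python sketch of the string manipulation; the function
names are made up, and whether either form is accepted still depends
on matlab and on the netcdf-c version it uses.

```python
def noprefetch_prefix(url):
    """Return the '[noprefetch]' prefixed form of a url."""
    return "[noprefetch]" + url


def noprefetch_fragment(url):
    """Return the alternate form with '#noprefetch' appended."""
    return url + "#noprefetch"


URL = "http://opendap.deltares.nl/opendap/test/testNcWithManyVariables/test3000.nc"

print(noprefetch_prefix(URL))
print(noprefetch_fragment(URL))
```

Try the prefixed form first and fall back to the "#noprefetch" form if
the first one is rejected.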

Again, thanks for providing this example. It is turning out to be very useful.

=Dennis Heimbigner
  Unidata


Ticket Details
===================
Ticket ID: OWH-501524
Department: Support THREDDS
Priority: Normal
Status: Closed