Unidata - To provide the data services, tools, and cyberinfrastructure leadership that advance Earth system science, enhance educational opportunities, and broaden participation. Unidata
         

Re: Structures in Dapper dds & Loaddap



Hi All,

Just a bit of clarification on the Dapper attributes structures... Many in-situ datasets have two classes of attributes -- those that are a function of a given profile or time series (e.g. who the PI was, who made the measurement, what instrument was used for the measurement) and those that are also a function of a given profile but are also associated with a variable (e.g. the valid data range for, say, the pressure). The first class is represented by the 'attributes' structure and the second by the 'variable_attributes' structure. The reason that a nested structure is used for the 'variable_attributes' structure:

       Structure {
           Structure {
               Float32 valid_range[2];
           } PRES;
       } variable_attributes;

is that the attribute needs to be associated with a variable. If the structure wasn't nested, then some kind of naming convention would be required:

       Structure {
           Float32 valid_range[2];
       } PRES_ATTRIBUTE;

which IMHO would make life harder for clients attempting to recover the variable attributes.
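
To make the trade-off concrete, here is a minimal Python sketch that uses plain dictionaries as a stand-in for a decoded response. The dictionary layout, the helper names, the "_ATTRIBUTE" suffix handling and the range values are all assumptions for illustration; this is not the API of any particular client.

```python
# Hypothetical decoded structures; the range values are made up.
nested = {
    "variable_attributes": {
        "PRES": {"valid_range": [0.0, 12000.0]},
    }
}

# The flat alternative relies on a naming convention such as "<VAR>_ATTRIBUTE".
flat = {
    "PRES_ATTRIBUTE": {"valid_range": [0.0, 12000.0]},
}


def attrs_nested(dataset, variable):
    """Look up a variable's attributes by walking the nested structure."""
    return dataset["variable_attributes"].get(variable, {})


def attrs_flat(dataset, variable, suffix="_ATTRIBUTE"):
    """Look up a variable's attributes by rebuilding the conventional name.

    The client must know the suffix in advance.
    """
    return dataset.get(variable + suffix, {})


def variables_with_attrs_flat(dataset, suffix="_ATTRIBUTE"):
    """Recover the variable names by parsing structure names."""
    return [name[: -len(suffix)] for name in dataset if name.endswith(suffix)]


print(attrs_nested(nested, "PRES")["valid_range"])
print(attrs_flat(flat, "PRES")["valid_range"])
print(list(nested["variable_attributes"]))  # variable names come for free
print(variables_with_attrs_flat(flat))      # variable names need string parsing
```

With the nested form the variable name is a field name the client can read directly; with the flat form the client has to know and parse the naming convention.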

- Joe


Thomas LOUBRIEU wrote:
Hi Dan,
Thanks very much for your quick response.
It will take us some time to take full account of the information you gave us, but I am already very happy to read why the 'attributes' structure exists in Dapper (to keep its fields separate from the usual profile dimensions (x,y,z,t)).
Actually, I hadn't understood that by myself. The 'attributes' structure used to look strange to me, but now I think it's a wonderful idea.


Bye,

Thomas




Daniel Holloway wrote:
Hi Arnaud,

There are several issues here; the primary culprit is the handling of complex sequences, not structures. I'll provide some input inline below.

On May 4, 2006, at 10:09 AM, Arnaud FOREST wrote:

Hi all,

I work on an OPeNDAP server for in-situ vertical profiles stored in an Oracle DBMS.
Our DDS structure is copied from the 'argo' Dapper interface (http://dapper.pmel.noaa.gov/dapper/argo/argo_all.cdp.dds), but I notice that the Matlab struct tools (libdap: 3.6.2, loaddap: 3.5.2, Matlab: 7) don't completely handle the in-situ profile Dapper interface.


Structures in particular are not handled well by the Matlab client:

The problem is with handling nested sequences for this particular data source. The following data source is a complex structure that loads fine with loaddap(v-3.5.2)


    http://test.opendap.org:8080/dods/dts/complex_structs.03.dds

    ---------
>> loaddap('http://test.opendap.org:8080/dods/dts/complex_structs.03')
>> whos
  Name            Size                    Bytes  Class

  Outermost       1x1                      1472  struct array

Grand total is 68 elements using 1472 bytes

>> Outermost

Outermost =

    SimpleStructure: [1x1 struct]

>> Outermost.SimpleStructure

ans =

    Innermost: [1x1 struct]

>> Outermost.SimpleStructure.Innermost

ans =

     i32: [10x1 double]
    ui32: [10x1 double]
     i16: [10x1 double]
    ui16: [10x1 double]
     f32: [10x1 double]
     f64: [10x1 double]

>> Outermost.SimpleStructure.Innermost.i32

ans =

           0
        2048
        4096
        6144
        8192
       10240
       12288
       14336
       16384
       18432

>>

---------

Place those structs inside a Sequence and baboom.... So yes, there is a bug in loaddap with respect to these nested sequences. I didn't realize it was there as I used loaddap recently to load another Dapper data source and it worked fine.

---------
>> clear
>> loaddap('http://las.pfeg.noaa.gov/dods/ndbcMet/all_noaa_time_series.cdp?LAT,LON,WSPD1&LAT>39.58&LAT<43&TIME<=1105315200000&TIME>=1104537600000')


>> whos
  Name           Size                    Bytes  Class

  location       1x1                     11348  struct array

Grand total is 1328 elements using 11348 bytes

>> location

location =

    profile: [6x1 struct]
        LON: [6x1 double]
        LAT: [6x1 double]

>> location.LAT(2)

ans =

   42.7500

>> location.LON(2)

ans =

  235.1500


>> location.profile(2).WSPD1(1:5)

ans =

    5.1000
    5.8000
    5.9000
    7.2000
    4.6000

>>
---------------
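
The request above has the shape of a DAP constraint expression: a comma-separated projection (LAT,LON,WSPD1) followed by selection clauses, each introduced by '&'. A minimal Python sketch of how such a URL could be assembled follows. The helper name build_dap_url is hypothetical, and the sketch does no percent-encoding, which some HTTP stacks may require for '<' and '>'.

```python
def build_dap_url(base, projection, selections):
    """Join a projection list and selection clauses into one request URL."""
    query = ",".join(projection)
    for clause in selections:
        query += "&" + clause
    return base + "?" + query


url = build_dap_url(
    "http://las.pfeg.noaa.gov/dods/ndbcMet/all_noaa_time_series.cdp",
    ["LAT", "LON", "WSPD1"],
    ["LAT>39.58", "LAT<43", "TIME<=1105315200000", "TIME>=1104537600000"],
)
print(url)
```

The printed string is the same URL that was passed to loaddap in the session above.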

So, I was not aware that Dapper output would cause loaddap to fail, so I'll look into what the problem might be that this particular data source is causing the client.

    Nested Sequences can be a bear to deal with...


1) "Loaddap" doesn't support structures of structures; a fatal error occurs (the Matlab crash dump is at the end of the message).




Is there a known bug, or are updates planned, regarding structure handling in the loaddap Matlab toolbox?

There is now, at least for this particular data source.

Why has the Dapper interface been defined with so many structures in it?


I shouldn't try to explain the rationale behind the Dapper use of Sequences but IMO it's a pretty good representation of the underlying relationships.

Basically Dapper provides the following form for all of its data sources:

      Dataset {

           Sequence {  ...  } location;

           Structure {  independent dims } constrained_ranges;

      }  data source;


Each data source has a series of location data and a single structure listing the extent of the independent dimensions (x,y,z,t) for the data source itself (as constrained by the request).


------------

To represent the 'location' series, Dapper uses a nested sequence that captures the relationship between the parts of the series. Values that do not change within a particular series, in this case the lat/lon/juld variables, are stored in the outer sequence element. The variables that change most frequently, whether in a time series, a vertical profile, etc., are stored in the inner sequence, in this case the variable 'profile'.

-----------
Dataset {
    Sequence {
        Float64 JULD;
        Float32 LONGITUDE;
        Float32 LATITUDE;
        Int32 _id;
        Sequence {
            Float32 PSAL_QC;
            Float32 CNDC_ADJUSTED_QC;
            Float32 TEMP_ADJUSTED_ERROR;
            ...
       } profile;
       ...
     } location;
 }

------------

Logically, you have a set of locations, each of which has a JULD, LAT and LON, and for each of these you have a series of observations (profile).
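
A minimal Python sketch of that relationship, with the decoded nested sequence modeled as a list of dictionaries. The sample values and the flatten helper are assumptions for illustration; only the field names come from the DDS above.

```python
# Hypothetical decoded 'location' sequence; all values are made up.
locations = [
    {
        "JULD": 1143930000000.0,
        "LONGITUDE": -30.5,
        "LATITUDE": 41.25,
        "_id": 1,
        "profile": [
            {"PSAL_QC": 1.0, "TEMP_ADJUSTED_ERROR": 0.002},
            {"PSAL_QC": 1.0, "TEMP_ADJUSTED_ERROR": 0.004},
        ],
    },
    {
        "JULD": 1144000000000.0,
        "LONGITUDE": -12.0,
        "LATITUDE": 35.5,
        "_id": 2,
        "profile": [
            {"PSAL_QC": 2.0, "TEMP_ADJUSTED_ERROR": 0.002},
        ],
    },
]


def flatten(locations, inner="profile"):
    """Repeat the outer fields once per inner observation, giving one row each."""
    rows = []
    for location in locations:
        outer = {k: v for k, v in location.items() if k != inner}
        for observation in location[inner]:
            rows.append({**outer, **observation})
    return rows


rows = flatten(locations)
for row in rows:
    print(row["_id"], row["LATITUDE"], row["LONGITUDE"], row["PSAL_QC"])
```

The outer values are stored once per location in the nested form and only get repeated when a client chooses to flatten the result.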

The potentially confusing part is that, for this data source, Dapper includes two additional structure variables. It's important to note that these structure variables reside in the outermost sequence element of 'location'. That means that the values in the 'attributes' structure and the 'variable_attributes' structure apply to the outermost relationship. Long story short, the values in 'attributes', like 'PLATFORM_NUMBER', don't change as a function of the 'profile' series. The nested structure 'variable_attributes' lists the extent or range of the independent dimension implicit within the nested sequence 'profile', which in this example is PRES (pressure).

Extending the above, you have a set of locations, each of which has a JULD, LAT, LON and attributes like PLATFORM_NUMBER, etc. The range of the independent dimension (PRES) for the series 'profile' is contained in the structure 'variable_attributes.PRES.valid_range[0:1]', and all the observations recorded at that location are stored in the profile series, which happens to use PRES as the independent dimension between observations.
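
As a rough Python sketch of how a client might use this, the range is read once per location and then applied to every observation in that location's profile. The sample values, the placeholder platform number and the split_by_range helper are assumptions, and the sketch treats the two range values as inclusive lower and upper bounds.

```python
# One hypothetical element of the outer 'location' sequence; values are made up.
location = {
    "JULD": 1143930000000.0,
    "LATITUDE": 41.25,
    "LONGITUDE": -30.5,
    "attributes": {"PLATFORM_NUMBER": "PLACEHOLDER"},
    "variable_attributes": {"PRES": {"valid_range": [0.0, 2000.0]}},
    "profile": [{"PRES": 5.0}, {"PRES": 1000.0}, {"PRES": 2500.0}],
}


def split_by_range(location, variable):
    """Split a location's observations by the per-location range of a variable."""
    low, high = location["variable_attributes"][variable]["valid_range"]
    inside, outside = [], []
    for observation in location["profile"]:
        if low <= observation[variable] <= high:
            inside.append(observation)
        else:
            outside.append(observation)
    return inside, outside


inside, outside = split_by_range(location, "PRES")
print(len(inside), len(outside))
```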

------------

OK, I've probably butchered that explanation... There are a couple of issues at work here that you should be aware of:

1: I'm not sure why they use a nested structure for 'variable_attributes'. Maybe they envision supporting more than 2 levels of nesting to represent a complicated relationship, but I don't think you need nested structures for this particular variable.

2: If you write a server that uses nested sequences, the server should serialize any constructor variables (e.g., structures) before the inner nested sequence itself. I believe this is documented in a recent RFC on the DAP, but I'll have to double-check. Logically, from a client's perspective there's no difference between the orderings, but from the standpoint of the existing implementations there is a big difference. I'm not sure whether that's the reason the client is failing for this particular data source, but I will look into it.
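
A minimal Python sketch of the ordering in point 2, assuming a server keeps the members of the outer sequence as a simple list of type/name records. That representation, the starting order of the members and the helper name are all hypothetical; a real server would apply the same idea to its own variable objects.

```python
def order_for_serialization(fields):
    """Move nested sequences to the end, keeping the relative order otherwise."""
    others = [f for f in fields if f["type"] != "Sequence"]
    sequences = [f for f in fields if f["type"] == "Sequence"]
    return others + sequences


# Hypothetical member list for the outer 'location' sequence.
location_fields = [
    {"type": "Float64", "name": "JULD"},
    {"type": "Float32", "name": "LONGITUDE"},
    {"type": "Float32", "name": "LATITUDE"},
    {"type": "Int32", "name": "_id"},
    {"type": "Sequence", "name": "profile"},
    {"type": "Structure", "name": "attributes"},
    {"type": "Structure", "name": "variable_attributes"},
]

ordered = order_for_serialization(location_fields)
print([f["name"] for f in ordered])
```

After reordering, the structures 'attributes' and 'variable_attributes' come before the inner sequence 'profile', which is the ordering suggested above.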


Does anyone know which OPeNDAP clients are fully compliant with the Dapper output (structures of structures, sequences of sequences, sequences of structures...), and what is planned to improve compliance between the Dapper interface and OPeNDAP clients (Matlab, Ferret, NCO, OPeNDAP Data Connector, pyDAP, GrADS...)? I guess the C++ and Java APIs are fine.



I can't speak for other client developers, but we will make every effort to ensure that our supported clients can read any valid DAP response. We support the Matlab, IDL and ODC clients, as well as the API implementations we distribute. I doubt that every client application will be able to support reading Dapper responses; they won't easily map into some of the underlying client APIs.


   Dan


Thanks a lot,

Arnaud and Thomas

--------------------- Matlab Crash Dump ---------------------


>>loaddap('http://dapper.pmel.noaa.gov/dapper/argo/argo_all.cdp?&location.JULD>1143929418000&location.JULD<1144447818000&location.LATITUDE>30&location.LATITUDE<50&location.LONGITUDE>-51&location.LONGITUDE<-5')

------------------------------------------------------------------------

Segmentation violation detected at Thu May 4 15:06:08 2006
------------------------------------------------------------------------


Configuration:
MATLAB Version: 7.0.1.24704 (R14) Service Pack 1
MATLAB License: 122551
Operating System: Linux 2.6.9-11.EL #1 Wed Jun 8 16:59:52 CDT 2005 i686
Window System: Hummingbird Communications Ltd. (7000), display br144-122:0.0
Current Visual: 0x23 (class 4, depth 24)
Processor ID: x86 Family 15 Model 0 Stepping 10, GenuineIntel
Virtual Machine: Java is not enabled
Default Charset: UTF-8
Register State:
eax = 00000000 ebx = 00cbc148
ecx = 02d09560 edx = 0896bd40
esi = 025250d0 edi = 00000003
ebp = bfff8bc8 esp = bfff8bc8
eip = 00cb737b flg = 00210286
Stack Trace:
[0] loaddap.mexglx:vfprintf~(0x0896bd40, 0x025250d0, 0xbfffafb0 "location", 0x00cb8d64) + 13179 bytes
[1] loaddap.mexglx:0x00cb8e6c(0x08642338, 0xbfffb20c, 0x00cba7c6, 1)
[2] loaddap.mexglx:0x00cb97ba(0x08642338, 0x00cba429, 1, 0xbfffbe30)
[3] loaddap.mexglx:mexFunction~(0, 0xbfffbdd0, 1, 0xbfffbe30) + 304 bytes
[4] libmex.so:mexRunMexFile(0, 0xbfffbdd0, 1, 0xbfffbe30) + 93 bytes
[5] libmex.so:Mfh_mex::dispatch_file(int, mxArray_tag**, int, mxArray_tag**)(0x02b40fb0, 0, 0xbfffbdd0, 1) + 537 bytes
[6] libmwm_dispatcher.so:Mfh_file::dispatch_fh(int, mxArray_tag**, int, mxArray_tag**)(0x02b40fb0, 0, 0xbfffbdd0, 1) + 262 bytes
[7] libmwm_interpreter.so:inDispatchFromStack(455, 0x0869ad20 "loaddap", 0, 1) + 1240 bytes
[8] libmwm_interpreter.so:inDispatchCall(char const*, int, int, int, int*, int*)(0x0869ad20 "loaddap", 455, 0, 1) + 112 bytes
[9] libmwm_interpreter.so:.L924(2, 0, 0, 0) + 165 bytes
[10] libmwm_interpreter.so:inInterPcodeSJ(inDebugCheck, int, int, opcodes, inPcodeNest_tag*)(2, 0, 0, 0) + 315 bytes
[11] libmwm_interpreter.so:inInterPcode(2, 0, 0xbfffc3f8, 0x0095e39b) + 93 bytes
[12] libmwm_interpreter.so:in_local_call_eval_function(int*, _pcodeheader*, int*, mxArray_tag**, inDebugCheck)(0, 0xbfffcdf0, 0xbfffce7c, 0xbfffcea8) + 163 bytes
[13] libmwm_interpreter.so:inEvalStringWithIsVarFcn(_memory_context*, char const*, EvalType, int, mxArray_tag**, inDebugCheck, _pcodeheader*, int*, bool (*)(void*, char const*), void*)(0x008dd468, 0x08744bd0 "loaddap('http://dapper.pmel.noaa..", 0, 0) + 2358 bytes
[14] libmwm_interpreter.so:inEvalCmdNoEnd(0x08744bd0 "loaddap('http://dapper.pmel.noaa..", 0x08744bd0 "loaddap('http://dapper.pmel.noaa..", 0xbfffd048 ", 0x00de8c27) + 85 bytes
[15] libmwbridge.so:mnParser(0x00cd21e8 "@@@", 0x00cd22d8 "mnParser", 1, 0xbfffd0a4) + 471 bytes
[16] libmwmcr.so:mcrInstance::mnParser()(0x080a01c0, 0, 0xbffff398, 0x0804a902) + 96 bytes
[17] MATLAB:mcrMain(int, char**)(2, 0xbffff444, 0x0804ad1c, 0xbffff3b8) + 308 bytes
[18] MATLAB:main(2, 0xbffff444, 0xbffff450, 0x0047ebe6) + 23 bytes
[19] libc.so.6:__libc_start_main~(0x0804a7c4, 2, 0xbffff444, 0x0804a3d8) + 211 bytes
This error was detected while a MEX-file was running. If the MEX-file
is not an official MathWorks function, please examine its source code
for errors. Please consult the External Interfaces Guide for information
on debugging MEX-files.

FOREST Arnaud



 
 