Re: [thredds] data won't return from

To: Spicer Bak <spicer.bak.frf@xxxxxxxxx>
Subject: Re: [thredds] data won't return from
From: Sean Arms <sarms@xxxxxxxx>
Date: Fri, 31 Jan 2020 15:42:41 -0700

Greetings Spicer,

I think there is an issue with your new project variable. In previous
files, it's a float,

https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/FRF_20191206_1179_FRF_NAVD88_LARC_GPS_UTC_v20191209_grid_latlon.nc.ascii?project%5B0:1:0%5D

Dataset {
    Float64 project[time = 1];
}
frf/geomorphology/DEMs/surveyDEM/FRF_20191206_1179_FRF_NAVD88_LARC_GPS_UTC_v20191209_grid_latlon.nc;
---------------------------------------------
project[1]
-999.0

but in the latest file, it's a string:

https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/FRF_20200110_1180_FRF_NAVD88_LARC_GPS_UTC_v20200113_grid_latlon.nc.ascii?project

Dataset {
    String project;
}
frf/geomorphology/DEMs/surveyDEM/FRF_20200110_1180_FRF_NAVD88_LARC_GPS_UTC_v20200113_grid_latlon.nc;
---------------------------------------------
project, "F"

That might cause new kinds of issues for a full variable read.
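
A minimal sketch of how this kind of mismatch could be caught before a file joins the aggregation. It compares the declarations in two DDS-style responses like the ones quoted above. The function names are made up for illustration, and the parser only handles flat declarations (not Grid or Structure blocks), so treat it as a starting point rather than a complete DDS reader.

```python
import re

# Matches one flat DDS declaration, e.g. "Float64 project[time = 1];"
# or "String project;". Nested Grid/Structure blocks are not handled.
_DECL = re.compile(r"^\s*(\w+)\s+(\w+)((?:\[[^\]]*\])*)\s*;\s*$")
_DIM = re.compile(r"\[\s*(?:(\w+)\s*=\s*)?(\d+)\s*\]")


def parse_declarations(dds_text):
    """Return {variable name: (type, ((dim name, size), ...))}."""
    found = {}
    for line in dds_text.splitlines():
        match = _DECL.match(line)
        if match:
            dtype, name, dims = match.groups()
            found[name] = (dtype, tuple((d, int(n)) for d, n in _DIM.findall(dims)))
    return found


def mismatches(reference_dds, candidate_dds):
    """List variables of the reference that are missing or declared differently."""
    ref = parse_declarations(reference_dds)
    cand = parse_declarations(candidate_dds)
    problems = []
    for name, (dtype, dims) in ref.items():
        if name not in cand:
            problems.append((name, "missing"))
            continue
        cand_type, cand_dims = cand[name]
        if cand_type != dtype:
            problems.append((name, "type {} -> {}".format(dtype, cand_type)))
        # Compare dimension names in order, so a transposed variable is flagged too.
        if [d for d, _ in cand_dims] != [d for d, _ in dims]:
            problems.append((name, "dimensions differ"))
    return problems


# Declarations copied from the two responses above (dataset names shortened).
old = "Dataset {\n    Float64 project[time = 1];\n} old.nc;"
new = "Dataset {\n    String project;\n} new.nc;"
print(mismatches(old, new))
```

Because the dimension names are compared in order, the same check would also flag a variable whose dimensions were swapped between files.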

Cheers!

Sean


On Fri, Jan 31, 2020 at 3:33 PM Spicer Bak <spicer.bak.frf@xxxxxxxxx> wrote:

> Hey Sean,
> Glad we were able to help find that bug, but I don't think the "project"
> variable (or lack of it) is the root of our problem: I chose your option 3
> (my mistake, this was supposed to be the same after the last one) and I
> get a similar response.  Good news: when I add the #noprefetch option, it
> seems to fix it.  Hopefully this helps provide answers.  The code below
> demonstrates it.
>
> # Failure with Python (MATLAB as well); retry with #noprefetch
> import netCDF4 as nc
> # `urls` is not defined in this message; it holds the dataset URLs to test
> for url in urls:
>     print(nc.Dataset(url)['time'])
>     variables = nc.Dataset(url).variables.keys()
>     for var in variables:
>         try:
>             nc.Dataset(url)[var][0]
>             print('Success! {} from {}'.format(var, url))
>         except IndexError:
>             print("won't load variable {} from {}".format(var, url))
>             # Retry with prefetching disabled; use a separate name so the
>             # fragment is not appended again on the next variable.
>             retry_url = url + "#noprefetch"
>             try:
>                 nc.Dataset(retry_url)[var][0]
>                 print('Success! {} from {}'.format(var, retry_url))
>             except IndexError as e:
>                 print("FAIL: load variable {} from {}".format(var, retry_url))
>                 print('    {}'.format(e))
>
>
>
> On Fri, Jan 31, 2020 at 4:25 PM Sean Arms <sarms@xxxxxxxx> wrote:
>
>> I thought it was working for me, but for the wrong reasons. Sorry about
>> that. But, now I have it.
>>
>> The error message from the server is...well...garbage. That's something
>> we need to look into. The underlying problem here is that the new data file
>> does not include the variable project. Because variable "project" exists in
>> the other files (or more specifically, the first file of the aggregation),
>> and specifically because it has time as its outer dimension, currently it
>> must exist in all files of the aggregation. So, we can make an ascii
>> request for that variable specifically, and request everything up to the
>> last value, and it works [0:1:424]:
>>
>>
>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml.ascii?project%5B0:1:424%5D
>>
>>
>> However, once we ask for the last value, it bombs [0:1:425]:
>>
>>
>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml.ascii?project%5B0:1:425%5D
>>
>> It's only like this for the full read code path, though. If I slice it
>> from [1:1:425] (skip the first value), it works:
>>
>>
>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml.ascii?project%5B1:1:425%5D
>>
>> ...and the missing value gets filled with a zero, because why not (well,
>> ok, I can think of a few reasons). So, from the TDS side, and more
>> specifically netCDF-Java, this is a bug (well, bugs, because returning zero
>> isn't quite the thing to do here either).
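
The three requests above differ only in the index range of the constraint expression. As a rough sketch, they could be generated with a small helper like the one below. The function name and its parameters are illustrative only; the base URL is the one from this thread. It builds the strings and does not contact the server.

```python
from urllib.parse import quote

# Aggregation URL taken from this thread.
BASE = ("https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/"
        "DEMs/surveyDEM/surveyDEM.ncml")


def hyperslab_url(base_url, variable, start, stop, stride=1, response="ascii"):
    """Build a request for variable[start:stride:stop]; stop is inclusive."""
    constraint = "{}[{}:{}:{}]".format(variable, start, stride, stop)
    # Percent-encode the square brackets but keep the colons readable.
    return "{}.{}?{}".format(base_url, response, quote(constraint, safe=":"))


print(hyperslab_url(BASE, "project", 0, 424))  # everything but the last value
print(hyperslab_url(BASE, "project", 0, 425))  # full read
print(hyperslab_url(BASE, "project", 1, 425))  # skip the first value
```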
>>
>> From what I can understand, netCDF-C tries to preload some data from
>> remote servers (if the variable is considered "small enough"), and because
>> the variable "project" is "small enough", the library tries to grab it all
>> and the request bombs out. You can tell the C library to not do that by
>> adding "#noprefetch" at the end of the url when opening the dataset, but
>> that just delays things (unless you or the underlying library you are using
>> never tries to fully read the variable "project"). So, in your sample
>> Python code, pass netCDF4-python's Dataset the following URL and give it a
>> spin:
>>
>>
>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml#noprefetch
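
A small sketch of appending the flag programmatically, so it is not added twice when a URL already carries it. The helper names are hypothetical. Joining multiple client parameters with "&" in the fragment is an assumption about netCDF-C's URL syntax worth checking against its documentation; for a URL with no fragment the result is exactly the form shown above.

```python
def with_noprefetch(url):
    """Append the #noprefetch client flag unless it is already present."""
    base, _, fragment = url.partition("#")
    flags = [flag for flag in fragment.split("&") if flag]
    if "noprefetch" not in flags:
        flags.append("noprefetch")
    return base + "#" + "&".join(flags)


def open_without_prefetch(url):
    """Open a remote dataset with prefetching disabled (needs netCDF4)."""
    import netCDF4  # imported lazily so with_noprefetch has no dependencies
    return netCDF4.Dataset(with_noprefetch(url))


print(with_noprefetch(
    "https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/"
    "DEMs/surveyDEM/surveyDEM.ncml"))
```

As noted above, this only delays the failure if anything later reads the whole "project" variable.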
>>
>> Cool...so what to do until I can get this fixed. Three things off the top
>> of my head:
>>
>> 1. Add "#noprefetch" to your URL (and anyone else using the dataset).
>> Barf.
>> 2. Use NcML in the aggregation to remove the variable "project"
>> altogether, as it seems to be -999.0 when it does exist anyway.
>> 3. Add the variable "project" to the latest file (NcML or by rewriting
>> it).
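
For option 2, a hedged sketch of what the wrapper NcML might look like, generated here with Python's standard library. The remove element is standard NcML; the aggregation type (joinExisting), the scan location, and the function name are assumptions, since the server's actual surveyDEM.ncml is not shown in this thread.

```python
import xml.etree.ElementTree as ET

NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def aggregation_without(variable, scan_location, dim_name="time", suffix=".nc"):
    """Return NcML text for a time aggregation that drops one variable."""
    ET.register_namespace("", NCML_NS)
    root = ET.Element("{%s}netcdf" % NCML_NS)
    # <remove name="..." type="variable"/> hides the variable from clients.
    ET.SubElement(root, "{%s}remove" % NCML_NS,
                  {"name": variable, "type": "variable"})
    # joinExisting along the time dimension is assumed, not confirmed.
    agg = ET.SubElement(root, "{%s}aggregation" % NCML_NS,
                        {"dimName": dim_name, "type": "joinExisting"})
    ET.SubElement(agg, "{%s}scan" % NCML_NS,
                  {"location": scan_location, "suffix": suffix})
    return ET.tostring(root, encoding="unicode")


# The scan location below is a hypothetical path, not the real server layout.
print(aggregation_without("project", "/hypothetical/path/to/surveyDEM/"))
```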
>>
>> Sorry this took so long to debug. There will be a fix, it will just take
>> some time (maybe early next week since I believe I know exactly what needs
>> to be done and where).
>>
>> Cheers,
>>
>> Sean
>>
>>
>> On Fri, Jan 31, 2020 at 12:27 PM Spicer Bak <spicer.bak.frf@xxxxxxxxx>
>> wrote:
>>
>>> Hey Sean,
>>> Thanks for getting back to me.  I was still getting the same symptoms
>>> earlier today.  Did it work on second inquiry for you?
>>>
>>> I did make a change to that last file so the dim's match up with
>>> previous.  After I did that, checked again to make sure that wasn't causing
>>> the problem.  Seems i still am getting the same IndexError i was getting,
>>> so this (as you mentioned) is not the root of the problem.
>>>
>>> On Fri, Jan 31, 2020 at 11:03 AM Sean Arms <sarms@xxxxxxxx> wrote:
>>>
>>>> Greetings Spicer,
>>>>
>>>> It looks like you have solved the issue? I was having problems the
>>>> other day as well. I downloaded the files locally to see if I could
>>>> reproduce the issue, and I am unable to do so now. However, I do see
>>>> something that might cause an issue down the road. The dimension order on
>>>> some of the variables in the latest file does not match what is in other
>>>> files. For example, if we compare the latest file:
>>>>
>>>>
>>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/FRF_20200110_1180_FRF_NAVD88_LARC_GPS_UTC_v20200113_grid_latlon.nc.dds
>>>>
>>>> with next latest file:
>>>>
>>>>
>>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/FRF_20191206_1179_FRF_NAVD88_LARC_GPS_UTC_v20191209_grid_latlon.nc.dds
>>>>
>>>> The dimensionality of latitude, longitude, northing, and easting switch
>>>> from (xFRF, yFRF) (20191206) to (yFRF, xFRF) (20200113). Might not be a
>>>> big issue overall since the use of an NcML file on the server (as opposed
>>>> to NcML directly in the catalog) uses the first file of the aggregation as
>>>> a template.
>>>>
>>>> Cheers,
>>>>
>>>> Sean
>>>>
>>>>
>>>> On Wed, Jan 29, 2020 at 3:03 PM Spicer Bak <spicer.bak.frf@xxxxxxxxx>
>>>> wrote:
>>>>
>>>>> Hello TDS community,
>>>>> I have a problem I'm quite pickled on.  We had a dataset that was
>>>>> working fine until (I think) we pushed the latest file.  Our server is
>>>>> continually updated with new files and all of the other datasets seem to
>>>>> work fine.
>>>>>
>>>>> I'm able to get the data from the individual file URLs, but not from the
>>>>> time-concatenated NcML file.  I can display the file or its variables, but
>>>>> cannot obtain any of the data through OPeNDAP, even though I can see the
>>>>> data returned via the "get ASCII" button on the OPeNDAP page.  The Python
>>>>> script below demonstrates the problem:
>>>>>
>>>>> # success with ncml:
>>>>>
>>>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml.ascii?latitude%5B0:1:0%5D%5B0:1:0%5D,time%5B0:1:425%5D,elevation%5B0:1:0%5D%5B0:1:0%5D%5B0:1:0%5D
>>>>>
>>>>> # Failure with Python (MATLAB as well)
>>>>> import netCDF4 as nc
>>>>> # `urls` is not defined in this message; it holds the dataset URLs to test
>>>>> for url in urls:
>>>>>     print(nc.Dataset(url)['time'])
>>>>>     variables = nc.Dataset(url).variables.keys()
>>>>>     for var in variables:
>>>>>         try:
>>>>>             nc.Dataset(url)[var][0]
>>>>>             print('Success! {} from {}'.format(var, url))
>>>>>         except IndexError as e:
>>>>>             print("won't load variable {} from {}".format(var, url))
>>>>>             print('    {}'.format(e))
>>>>>
>>>>> Any help would be much appreciated!
>>>>>
>>>>> --
>>>>> +++++++++++++++++++++++++++
>>>>> Spicer Bak, PhD
>>>>> USACE CHL Field Research Facility
>>>>> 252-305-9975
>>>>
>>>
>>
>