Greetings Spicer,
I dug into this more over the weekend, and it turns out two files are
missing the project variable:
FRF_20161116_1128_FRF_NAVD88_LARC_GPS_UTC_v20170320_grid_latlon.nc
FRF_20200110_1180_FRF_NAVD88_LARC_GPS_UTC_v20200113_grid_latlon.nc
If you add a project variable to those files, the aggregation works (tested
locally with your files, ncml, and original python code).
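For anyone wanting to repeat that check, here is a minimal sketch, assuming you have some way to list the variable names in each file. The helper name `files_missing_variable` and the `list_variables` callable are illustrative only and not part of any library. With netCDF4-python one could pass something like `lambda path: nc.Dataset(path).variables.keys()`.

```python
from typing import Callable, Iterable, List


def files_missing_variable(
    paths: Iterable[str],
    list_variables: Callable[[str], Iterable[str]],
    required: str = "project",
) -> List[str]:
    """Return the paths whose variable listing does not include `required`."""
    missing = []
    for path in paths:
        if required not in set(list_variables(path)):
            missing.append(path)
    return missing


# Stand-in listing so the sketch runs without the real files (hypothetical data).
fake_listing = {
    "a.nc": ["time", "elevation", "project"],
    "b.nc": ["time", "elevation"],
}
result = files_missing_variable(fake_listing, fake_listing.__getitem__)
print(result)
```

Passing the listing function in as an argument keeps the check independent of how the files are opened (local path or OPeNDAP URL).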
One thing I noticed: several files have the same time value, so
in the aggregation you end up with duplicate time values and no way for
users to tell which file (i.e. which version) each one came from. A list of
those files is at the end of this message.
Cheers!
Sean
Files with the same time:
FRF_19950420_0743_FRF_NAVD88_CRAB_Geodimeter_UTC_v20151115_grid_latlon.nc,
FRF_19950420_0743_FRF_NAVD88_CRAB_Geodimeter_UTC_v20190326_grid_latlon.nc
FRF_20150429_1100_FRF_NAVD88_LARC_GPS_UTC_v20160323_grid_latlon.nc,
FRF_20150429_1100_FRF_NAVD88_LARC_GPS_UTC_v20190326_grid_latlon.nc
FRF_20150618_1102_FRF_NAVD88_LARC_GPS_UTC_v20170328_grid_latlon.nc,
FRF_20150618_1102_FRF_NAVD88_LARC_GPS_UTC_v20190326_grid_latlon.nc
FRF_20151014_1108_FRF_NAVD88_LARC_GPS_UTC_v20170328_grid_latlon.nc,
FRF_20151014_1108_FRF_NAVD88_LARC_GPS_UTC_v20190330_grid_latlon.nc
FRF_20151221_1115_FRF_NAVD88_LARC_GPS_UTC_v20170320_grid_latlon.nc,
FRF_20151221_1115_FRF_NAVD88_LARC_GPS_UTC_v20190326_grid_latlon.nc
FRF_20160817_1122_FRF_NAVD88_LARC_GPS_UTC_v20170320_grid_latlon.nc,
FRF_20160817_1122_FRF_NAVD88_LARC_GPS_UTC_v20190326_grid_latlon.nc
FRF_20160926_1124_FRF_NAVD88_LARC_GPS_UTC_v20170320_grid_latlon.nc,
FRF_20160926_1124_FRF_NAVD88_LARC_GPS_UTC_v20190330_grid_latlon.nc
FRF_20161003_1125_FRF_NAVD88_LARC_GPS_UTC_v20170320_grid_latlon.nc,
FRF_20161003_1125_FRF_NAVD88_LARC_GPS_UTC_v20190330_grid_latlon.nc
FRF_20161020_1126_FRF_NAVD88_LARC_GPS_UTC_v20170320_grid_latlon.nc,
FRF_20161020_1126_FRF_NAVD88_LARC_GPS_UTC_v20190330_grid_latlon.nc
FRF_20161116_1128_FRF_NAVD88_LARC_GPS_UTC_v20170320_grid_latlon.nc (also
missing project variable),
FRF_20161116_1128_FRF_NAVD88_LARC_GPS_UTC_v20190330_grid_latlon.nc
FRF_20170105_1129_FRF_NAVD88_LARC_GPS_UTC_v20170320_grid_latlon.nc,
FRF_20170105_1129_FRF_NAVD88_LARC_GPS_UTC_v20190326_grid_latlon.nc
FRF_20171011_1142_FRF_NAVD88_LARC_GPS_UTC_v20171012_grid_latlon.nc,
FRF_20171011_1143_FRF_NAVD88_LARC_GPS_UTC_v20171221_grid_latlon.nc
FRF_20171121_1143_FRF_NAVD88_LARC_GPS_UTC_v20171129_grid_latlon.nc,
FRF_20171121_1144_FRF_NAVD88_LARC_GPS_UTC_v20171221_grid_latlon.nc,
FRF_20171121_1144_FRF_NAVD88_LARC_GPS_UTC_v20180130_grid_latlon.nc
FRF_20180418_1149_FRF_NAVD88_LARC_GPS_UTC_v20180427_grid_latlon.nc,
FRF_20180418_1149_FRF_NAVD88_LARC_GPS_UTC_v20190326_grid_latlon.nc
FRF_20190917_1170_FRF_NAVD88_CRAB_GPS_UTC_v20190919_grid_latlon.nc,
FRF_20190917_1170_FRF_NAVD88_CRAB_GPS_UTC_v20191029_grid_latlon.nc
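One possible way to sort out the duplicates above, sketched under assumptions: it groups files by the date token in the filename and keeps the one with the highest version stamp. The filename pattern and the function name `newest_per_date` are my own guesses for illustration. The real duplicates are in the time variable, so the file contents should be checked too, and whether the newest version is the one to keep is your call.

```python
import re
from collections import defaultdict

# Assumed filename pattern: FRF_<date>_<survey>_..._v<version>_grid_latlon.nc
NAME_RE = re.compile(r"^FRF_(\d{8})_(\d{4})_.*_v(\d{8})_grid_latlon\.nc$")


def newest_per_date(filenames):
    """Group files by survey date and keep the highest version stamp for each."""
    by_date = defaultdict(list)
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip names that do not follow the assumed pattern
        date, _survey, version = match.groups()
        by_date[date].append((version, name))
    # YYYYMMDD strings sort chronologically, so max() picks the newest version
    return {date: max(entries)[1] for date, entries in by_date.items()}


files = [
    "FRF_19950420_0743_FRF_NAVD88_CRAB_Geodimeter_UTC_v20151115_grid_latlon.nc",
    "FRF_19950420_0743_FRF_NAVD88_CRAB_Geodimeter_UTC_v20190326_grid_latlon.nc",
    "FRF_20171011_1142_FRF_NAVD88_LARC_GPS_UTC_v20171012_grid_latlon.nc",
    "FRF_20171011_1143_FRF_NAVD88_LARC_GPS_UTC_v20171221_grid_latlon.nc",
]
keep = newest_per_date(files)
```

Grouping on the date alone (ignoring the survey number) is deliberate here, since some of the pairs above share a date but differ in survey number.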
On Fri, Jan 31, 2020 at 3:42 PM Sean Arms <sarms@xxxxxxxx> wrote:
> Greetings Spicer,
>
> I think there is an issue with your new project variable. In previous
> files, it's a float,
>
>
> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/FRF_20191206_1179_FRF_NAVD88_LARC_GPS_UTC_v20191209_grid_latlon.nc.ascii?project%5B0:1:0%5D
>
> Dataset {
> Float64 project[time = 1];
> }
> frf/geomorphology/DEMs/surveyDEM/FRF_20191206_1179_FRF_NAVD88_LARC_GPS_UTC_v20191209_grid_latlon.nc;
> ---------------------------------------------
> project[1]
> -999.0
>
> but in the latest file, it's a string:
>
>
> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/FRF_20200110_1180_FRF_NAVD88_LARC_GPS_UTC_v20200113_grid_latlon.nc.ascii?project
>
> Dataset {
> String project;
> }
> frf/geomorphology/DEMs/surveyDEM/FRF_20200110_1180_FRF_NAVD88_LARC_GPS_UTC_v20200113_grid_latlon.nc;
> ---------------------------------------------
> project, "F"
>
> That might cause new kinds of issues for a full variable read.
>
> Cheers!
>
> Sean
>
>
> On Fri, Jan 31, 2020 at 3:33 PM Spicer Bak <spicer.bak.frf@xxxxxxxxx>
> wrote:
>
>> Hey Sean,
>> Glad we were able to help find that bug, but I don't think the "project"
>> variable (or lack of it) is the root of our problem, as I chose your option 3
>> (my mistake, this was supposed to be the same after the last one) and I
>> get a similar response. The good news: when I add the #noprefetch option, it
>> seems to fix it. Hopefully this helps provide answers, as demonstrated by
>> the code below.
>>
>> # Failure with python (matlab as well)
>> import netCDF4 as nc
>>
>> for url in urls:  # urls: list of OPeNDAP dataset URLs to test
>>     print(nc.Dataset(url)['time'])
>>     variables = nc.Dataset(url).variables.keys()
>>     for var in variables:
>>         try:
>>             nc.Dataset(url)[var][0]
>>             print('Success! {} from {}'.format(var, url))
>>         except IndexError:
>>             print("won't load variable {} from {}".format(var, url))
>>             # retry the same variable with prefetching disabled
>>             retry_url = url + "#noprefetch"
>>             try:
>>                 nc.Dataset(retry_url)[var][0]
>>                 print('Success! {} from {}'.format(var, retry_url))
>>             except IndexError as e:
>>                 print("FAIL: load variable {} from {}".format(var, retry_url))
>>                 print(' {}'.format(e))
>>
>>
>>
>> On Fri, Jan 31, 2020 at 4:25 PM Sean Arms <sarms@xxxxxxxx> wrote:
>>
>>> I thought it was working for me, but for the wrong reasons. Sorry about
>>> that. But, now I have it.
>>>
>>> The error message from the server is...well...garbage. That's something
>>> we need to look into. The underlying problem here is that the new data file
>>> does not include the variable project. Because variable "project" exists in
>>> the other files (or more specifically, the first file of the aggregation),
>>> and specifically because it has time as its outer dimension, currently it
>>> must exist in all files of the aggregation. So, we can make an ascii
>>> request for that variable specifically, and request everything up to the
>>> last value, and it works [0:1:424]:
>>>
>>>
>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml.ascii?project%5B0:1:424%5D
>>>
>>>
>>> However, once we ask for the last value, it bombs [0:1:425]:
>>>
>>>
>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml.ascii?project%5B0:1:425%5D
>>>
>>> It's only like this for the full read code path, though. If I slice it
>>> from [1:1:425] (skip the first value), it works:
>>>
>>>
>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml.ascii?project%5B1:1:425%5D
>>>
>>> ...and the missing value gets filled with a zero, because why not (well,
>>> ok, I can think of a few reasons). So, from the TDS side, and more
>>> specifically netCDF-Java, this is a bug (well, bugs, because returning zero
>>> isn't quite the thing to do here either).
>>>
>>> From what I can understand, netCDF-C tries to preload some data from
>>> remote servers (if the variable is considered "small enough"), and because
>>> the variable "project" is "small enough", the library tries to grab it all
>>> and the request bombs out. You can tell the C library to not do that by
>>> adding "#noprefetch" at the end of the url when opening the dataset, but
>>> that just delays things (unless you or the underlying library you are using
>>> never tries to fully read the variable "project"). So, in your sample
>>> python code, pass netCDF4-python's Dataset the following URL and give it a
>>> spin:
>>>
>>>
>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml#noprefetch
>>>
>>> Cool...so what to do until I can get this fixed. Three things off the
>>> top of my head:
>>>
>>> 1. Add "#noprefetch" to your URL (and anyone else using the dataset).
>>> Barf.
>>> 2. Use NcML in the aggregation to remove the variable "project"
>>> altogether, as it seems to be -999.0 when it does exist anyway.
>>> 3. Add the variable "project" to the latest file (NcML or by rewriting
>>> it).
>>>
>>> Sorry this took so long to debug. There will be a fix, it will just take
>>> some time (maybe early next week since I believe I know exactly what needs
>>> done and where).
>>>
>>> Cheers,
>>>
>>> Sean
>>>
>>>
>>> On Fri, Jan 31, 2020 at 12:27 PM Spicer Bak <spicer.bak.frf@xxxxxxxxx>
>>> wrote:
>>>
>>>> Hey Sean,
>>>> Thanks for getting back to me. I was still getting the same symptoms
>>>> earlier today. Did it work on second inquiry for you?
>>>>
>>>> I did make a change to that last file so the dims match up with the
>>>> previous ones. After I did that, I checked again to make sure that wasn't
>>>> causing the problem. It seems I am still getting the same IndexError I was
>>>> getting, so this (as you mentioned) is not the root of the problem.
>>>>
>>>> On Fri, Jan 31, 2020 at 11:03 AM Sean Arms <sarms@xxxxxxxx> wrote:
>>>>
>>>>> Greetings Spicer,
>>>>>
>>>>> It looks like you have solved the issue? I was having problems the
>>>>> other day as well. I downloaded the files locally to see if I could
>>>>> reproduce the issue, and I am unable to do so now. However, I do see
>>>>> something that might cause an issue down the road. The dimension order on
>>>>> some of the variables in the latest file does not match what is in other
>>>>> files. For example, if we compare the latest file:
>>>>>
>>>>>
>>>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/FRF_20200110_1180_FRF_NAVD88_LARC_GPS_UTC_v20200113_grid_latlon.nc.dds
>>>>>
>>>>> with next latest file:
>>>>>
>>>>>
>>>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/FRF_20191206_1179_FRF_NAVD88_LARC_GPS_UTC_v20191209_grid_latlon.nc.dds
>>>>>
>>>>> The dimensionality of latitude, longitude, northing, and easting
>>>>> switches from (xFRF, yFRF) (20191206) to (yFRF, xFRF) (20200113). Might
>>>>> not be a big issue overall, since the use of an NcML file on the server
>>>>> (as opposed to NcML directly in the catalog) uses the first file of the
>>>>> aggregation as a template.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Sean
>>>>>
>>>>>
>>>>> On Wed, Jan 29, 2020 at 3:03 PM Spicer Bak <spicer.bak.frf@xxxxxxxxx>
>>>>> wrote:
>>>>>
>>>>>> Hello TDS community,
>>>>>> I have a problem I'm quite pickled on. We had a dataset that was
>>>>>> working fine until (I think) we pushed the latest file. Our server is
>>>>>> continually updated with new files and all of the other datasets seem to
>>>>>> work fine.
>>>>>>
>>>>>> I'm able to get the data from the individual file URLs, but not from
>>>>>> the time-concatenated ncml file. I can display the file or variables,
>>>>>> but not obtain any of the data through OPeNDAP, though I'm able to see
>>>>>> the data returned via the "get ASCII" button on the OPeNDAP page. The
>>>>>> python script below demonstrates the problem:
>>>>>>
>>>>>> # success with ncml:
>>>>>>
>>>>>> https://chldata.erdc.dren.mil/thredds/dodsC/frf/geomorphology/DEMs/surveyDEM/surveyDEM.ncml.ascii?latitude%5B0:1:0%5D%5B0:1:0%5D,time%5B0:1:425%5D,elevation%5B0:1:0%5D%5B0:1:0%5D%5B0:1:0%5D
>>>>>>
>>>>>> # Failure with python (matlab as well)
>>>>>> import netCDF4 as nc
>>>>>>
>>>>>> for url in urls:  # urls: list of OPeNDAP dataset URLs to test
>>>>>>     print(nc.Dataset(url)['time'])
>>>>>>     variables = nc.Dataset(url).variables.keys()
>>>>>>     for var in variables:
>>>>>>         try:
>>>>>>             nc.Dataset(url)[var][0]
>>>>>>             print('Success! {} from {}'.format(var, url))
>>>>>>         except IndexError as e:
>>>>>>             print("won't load variable {} from {}".format(var, url))
>>>>>>             print(' {}'.format(e))
>>>>>>
>>>>>> Any help would be much appreciated!
>>>>>>
>>>>>> --
>>>>>> +++++++++++++++++++++++++++
>>>>>> Spicer Bak, PhD
>>>>>> USACE CHL Field Research Facility
>>>>>> 252-305-9975
>>>>>> _______________________________________________
>>>>>> NOTE: All exchanges posted to Unidata maintained email lists are
>>>>>> recorded in the Unidata inquiry tracking system and made publicly
>>>>>> available through the web. Users who post to any of the lists we
>>>>>> maintain are reminded to remove any personal information that they
>>>>>> do not want to be made public.
>>>>>>
>>>>>>
>>>>>> thredds mailing list
>>>>>> thredds@xxxxxxxxxxxxxxxx
>>>>>> For list information or to unsubscribe, visit:
>>>>>> https://www.unidata.ucar.edu/mailing_lists/
>>>>>>