Re: [thredds] Problem between OPeNDAP and TDS when netCDF file is modified


I've deleted Claude's original post from the cascade below, and neatened up
the Subject: line, which will no doubt screw up threading.  In any case, our
web system administrator tells me that we had such a NetcdfFileCache element
all along, with maxFiles set to 0 (I don't have all of our TDS config files
in front of me, alas).  I also found, in the online documentation at:

the following:

   <maxSize>20 Mb</maxSize>

Eliminating the <dir> and <jvmPercent> elements and setting maxSize to zero
(and, hopefully, putting it in the correct TDS config file.  >SIGH<), we
restarted Tomcat.  The results were initially disheartening: a timestep
added to the final file an hour earlier was not included in the
featureCollection aggregation, but was picked up by the NcML aggregation of
the same time series.
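
One way to make "not included" concrete is to compare the time values in
the final file against those served by each aggregation.  What follows is
only a sketch in Python: the dates are made up to mirror the situation
described in this thread (a daily series where the aggregation stops at
2012-03-28 and the file has five more steps), and missing_timesteps and
daily_range are hypothetical helper names, not part of TDS or any netCDF
library.  Reading the real time values from the file and from the OPeNDAP
endpoint is left out.

```python
from datetime import date, timedelta


def daily_range(start, end):
    """Return every date from start to end, inclusive (hypothetical helper)."""
    return [start + timedelta(days=i) for i in range((end - start).days + 1)]


def missing_timesteps(file_times, aggregation_times):
    """Return time values present in the source file but absent from the aggregation."""
    seen = set(aggregation_times)
    return [t for t in file_times if t not in seen]


# Illustrative values only: the start date is an assumption.
file_times = daily_range(date(2012, 1, 1), date(2012, 4, 2))
agg_times = daily_range(date(2012, 1, 1), date(2012, 3, 28))

missing = missing_timesteps(file_times, agg_times)
print(len(missing), "timesteps missing, from", missing[0], "to", missing[-1])
```

Run once against the featureCollection and once against the NcML
aggregation, a comparison like this would show exactly which steps each one
lacks.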

I just checked again (about an hour later), and it's still not part of the
featureCollection aggregation.  So, we still have no solution AFAIK.  Of
course, I don't know which TDS config file our web system administrator
put the maxSize element in.  The web page above says it should have gone in:


Our web system administrator has gone home for the day, but I've asked him
in e-mail just which config file he put that element in.
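
In the meantime, a quick way to see which cache-related elements a given
config file actually contains is to parse it and list them.  This is a rough
Python sketch, not anything shipped with TDS: the sample XML is invented for
illustration, the threddsConfig root and the FeatureCollection parent for
maxSize are my assumptions (only NetcdfFileCache/maxFiles is quoted in this
thread), and cache_settings is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Invented sample; the element nesting here is an assumption, not a copy
# of any real threddsConfig.xml.
SAMPLE = """<threddsConfig>
  <NetcdfFileCache>
    <maxFiles>0</maxFiles>
  </NetcdfFileCache>
  <FeatureCollection>
    <maxSize>0</maxSize>
  </FeatureCollection>
</threddsConfig>"""


def cache_settings(xml_text, names=("maxFiles", "maxSize")):
    """List (parent tag, element tag, text) for each cache element found."""
    root = ET.fromstring(xml_text)
    found = []
    for parent in root.iter():
        for child in parent:
            if child.tag in names:
                found.append((parent.tag, child.tag, (child.text or "").strip()))
    return found


for parent, tag, value in cache_settings(SAMPLE):
    print(parent + "/" + tag, "=", value)
```

To check a real file, SAMPLE would be replaced by that file's contents; an
empty result would suggest the element went into some other file.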


On 04/04/12 14:51, Hoop wrote:
> Ethan,
> Thanks for responding.  I'm dubious that this will be effective.
> Our web system administrator looked around and found a different
> cache directory for collections.  When he cleared this out, the
> new invocation of Tomcat resulted in the missing timesteps finding
> their way into the featureCollection aggregation.  It thus strikes
> me that this is indeed the cache that needs to be cleaned out and/or
> disregarded by the aggregation-making daemon process.
> Nonetheless, we'll try it and get back to you.
> -Hoop
> ---------------------------- Original message -------------------------------
> Re: [thredds] Pb between OpenDap and THREDDS when netcdf file are modifed
>     * To: thredds@xxxxxxxxxxxxxxxx
>     * Subject: Re: [thredds] Pb between OpenDap and THREDDS when netcdf file
> are modifed
>     * From: Ethan Davis <edavis@xxxxxxxxxxxxxxxx>
>     * Date: Wed, 04 Apr 2012 13:55:46 -0600
> Hi Hoop,
> Try turning off the NetcdfFile caching in your threddsConfig.xml by
> setting NetcdfFileCache/maxFiles to zero:
>   <NetcdfFileCache>
>     <maxFiles>0</maxFiles>
>   </NetcdfFileCache>
> This will turn off the NetcdfFile cache globally but not the aggregation
> caches. There may be some performance issues in turning this off but we
> suspect that OS file caching may make it negligible.
> Let us know what you see. I'll get back to you on the XML checker stuff
> in another email.
> Ethan
> On 04/03/12 11:47, Hoop wrote:
>> Ethan,
>> Additional information:  our web system administrator checked the
>> logs, and found that the software daemon that is supposed to check
>> and rebuild the aggregation if need be was indeed running, but
>> finding nothing to do.  Worse, restarting Tomcat, which with
>> NcML aggregation would pick up the more recent time steps, did not
>> change things.  The time series still ends 2012/03/28, as it did
>> when I first created the featureCollection version of the
>> aggregation, even though the final file has added five time steps.
>> The NcML version of the aggregation did pick up the new time steps
>> when Tomcat was restarted.
>> Hoping for a detailed response,
>> -Hoop
>> On 04/02/12 11:39, Hoop wrote:
>>> Ethan,
>>> Well, that got me just where NcML aggregation got me: an aggregation
>>> that does not notice new timesteps added to the latest file.  It also
>>> created two new time-like variables (time_offset and time_run) and
>>> threw away most of the metadata I had for the time variable.  My only
>>> reason for using "Latest" instead of letting it default to "Penultimate"
>>> was in the forlorn hope of getting my second value of the attribute
>>> time:actual_range picked up.
>>> I am still getting the same error messages from the XML checker
>>> that TDS runs on its configuration files.  I wonder if I'm ever
>>> going to hear back about this difference that makes a difference
>>> between the published XSDs and the online documentation.  Here are
>>> the error messages:
>>> [2012-03-29T19:16:15GMT]
>>> readCatalog(): full path=/usr/share/tomcat5/content/thredds/catalog.xml;
>>> path=catalog.xml
>>> readCatalog(): valid catalog -- ----Catalog Validation version 1.0.01
>>> *** XML parser error (36:14)= cvc-complex-type.2.4.a: Invalid content
>>> was found starting with element 'filter'. One of
>>> '{"":addLatest,
>>> "":addProxies,
>>> "":addDatasetSize,
>>> "":addTimeCoverage}'
>>> is expected.
>>> *** XML parser error (54:50)= cvc-complex-type.2.4.a: Invalid content
>>> was found starting with element 'update'. One of
>>> '{"":fmrcConfig,
>>> "":pointConfig,
>>> "":netcdf}' is
>>> expected.
>>> readCatalog(): full
>>> path=/usr/share/tomcat5/content/thredds/enhancedCatalog.xml;
>>> path=enhancedCatalog.xml
>>> readCatalog(): valid catalog -- ----Catalog Validation version 1.0.01
>>> -Hoop
>>> ------ original message --------------
>>> Hi Hoop,
>>> Try adding the following to your featureCollection element
>>>   <metadata inherited="true">
>>>     <serviceName>all</serviceName>
>>>   </metadata>
>>> Also, since your most recent dataset is the one that is changing, you
>>> might want to change protoDataset@choice from "Latest" to "Penultimate"
>>> (which is the default, so you could just drop protoDataset
>>> altogether). Also, since data files in your dataset don't age off, it
>>> probably isn't too important which dataset is used but probably better
>>> to not use the one that gets updated. The protoDataset is used to
>>> populate the metadata in the feature dataset.
>>> Since your datasets are a simple timeseries rather than a full-blown
>>> FMRC, you will probably want to add
>>>   <fmrcConfig datasetTypes="Best"/>
>>> The fmrcConfig@datasetTypes value tells the featureCollection which
>>> types of FMRC datasets to create. With the value "Best", the forecast
>>> types are left off and only the "Best Time Series" dataset is created.
>>> Not the best dataset name for a simple time series grid (it's not just
>>> the best time series, it's the only one!) but that's what we have for the
>>> moment. If you want to let people see the underlying files, you could
>>> add "Files" to the fmrcConfig@datasetTypes value.
>>> I'm including the link to the FeatureCollection tutorial [1] which I
>>> forgot to point out in an earlier email when I gave you the link to the
>>> reference docs [2].
>>> Hope that helps,
>>> Ethan
>>> [1]
>>> [2]
>>> On 3/26/2012 11:13 AM, Hoop wrote:
>>>> Ethan,
>>>> The catalog is attached.  The filter element is in a datasetScan
>>>> element that we use to generically wrap our NetCDF files, and
>>>> not included within the featureCollection element or any other
>>>> aggregation element.  It is meant to generally apply throughout our
>>>> installation.
>>>> Sample files may be obtained from:
>>>> The files for this year are updated on a daily basis, barring
>>>> problems.
>>>> Let me know what else I can do to help.
>>>> -Hoop
>>>> On 03/24/12 23:02, thredds-request@xxxxxxxxxxxxxxxx wrote:
>>>>> thredds mailing list
>>>>> thredds@xxxxxxxxxxxxxxxx
>>>>> For list information or to unsubscribe,  visit: 
>>>>> Today's Topics:
>>>>>    5. Re: Pb between OpenDap and THREDDS when netcdf file are
>>>>>       modifed (Ethan Davis)
>>>>> ----------------------------------------------------------------------
>>>>> Message: 5
>>>>> Date: Sat, 24 Mar 2012 23:02:53 -0600
>>>>> From: Ethan Davis <edavis@xxxxxxxxxxxxxxxx>
>>>>> To: thredds@xxxxxxxxxxxxxxxx
>>>>> Subject: Re: [thredds] Pb between OpenDap and THREDDS when netcdf file
>>>>>   are modifed
>>>>> Message-ID: <4F6EA6FD.8080906@xxxxxxxxxxxxxxxx>
>>>>> Content-Type: text/plain; charset=ISO-8859-1
>>>>> Hi Hoop,
>>>>> Can you send us (or point us to) a few sample files and send us your
>>>>> full catalog?
>>>>> Is the filter you mention below part of your featureCollection element?
>>>>> Ethan
>>>>> On 3/9/2012 1:59 PM, Hoop wrote:
>>>>>> Ethan,
>>>>>> I don't believe John ever responded as you had requested.
>>>>>> I did my best to try "featureCollection", but I got nowhere.
>>>>>> It doesn't help that the XSDs specify required elements
>>>>>> (for "update" and "filter") that are not mentioned in the
>>>>>> online documentation; the validation process that TDS runs
>>>>>> at start-up informed me of those errors.  I have no clue how
>>>>>> to correct them.  Here is the attempt I made:
>>>>>> <featureCollection name="SST_NOAA_OISST_V2_HighResFC" featureType="FMRC"
>>>>>>  harvest="true" path="Datasets/aggro/">
>>>>>>  <collection
>>>>>>   spec="/Datasets/noaa.oisst.v2.highres/$"
>>>>>>   name="SST_OISST_V2_HighResFC" olderThan="15 min" />
>>>>>>  <protoDataset choice="Latest" change="0 0 7 * * ? *" />
>>>>>>  <update startup="true" rescan="0 0 * * * ? *" />
>>>>>> </featureCollection>
>>>>>> My use of "filter" is as follows:
>>>>>>      <filter>
>>>>>>         <include wildcard="*.nc"/>
>>>>>>         <exclude wildcard="*.data"/>
>>>>>>         <exclude wildcard="*.f"/>
>>>>>>         <exclude wildcard="*.gbx"/>
>>>>>>         <exclude wildcard="*.txt"/>
>>>>>>         <exclude wildcard="README"/>
>>>>>>      </filter>
>>>>>> Someone want to tell me what I did wrong in each case?
>>>>>> Thanks,
>>>>>> -Hoop
>>>>>>> -------- Original Message --------
>>>>>>> Subject:        Re: [thredds] Pb between OpenDap and THREDDS when 
>>>>>>> netcdf file are modifed
>>>>>>> Date:   Thu, 23 Feb 2012 22:03:38 -0700
>>>>>>> From:   Ethan Davis <edavis@xxxxxxxxxxxxxxxx>
>>>>>>> To:     thredds@xxxxxxxxxxxxxxxx
>>>>>>> Hi Hoop,
>>>>>>> The dynamic dataset handling in the NcML aggregation code was designed
>>>>>>> to deal with the appearance of new datasets more than data being
>>>>>>> appended to existing datasets. The NcML aggregations are also limited to
>>>>>>> straightforward aggregations based on homogeneity of dimensions and
>>>>>>> coordinate variables; they don't use any coordinate system or higher
>>>>>>> level feature information that might be available. This makes straight
>>>>>>> NcML aggregation somewhat fragile and hard to generalize to more complex
>>>>>>> situations.
>>>>>>> FeatureCollections are designed to use the CDM's understanding of
>>>>>>> coordinate systems and feature types to both simplify configuration and
>>>>>>> make aggregations more robust and general.
>>>>>>> While the FMRC collection capability was designed for a time series of
>>>>>>> forecast runs, I believe it should handle a simple time series of grids
>>>>>>> as well. (John, can you add more information on this?)
>>>>>>> Ethan
>>>>>>> On 2/23/2012 3:21 PM, Hoop wrote:
>>>>>>>> Ethan,
>>>>>>>> This reminds me of an issue we are having, with version 4.2.7.
>>>>>>>> Here is the relevant snippet from our config:
>>>>>>>> <dataset name="SST NOAA OISST V2 HighRes" ID="SST_OISST_V2_HighRes"
>>>>>>>>     urlPath="Datasets/aggro/" serviceName="odap" dataType="grid">
>>>>>>>>     <netcdf xmlns="">
>>>>>>>>         <aggregation dimName="time" type="joinExisting" recheckEvery="15 min">
>>>>>>>>             <scan location="/Projects/Datasets/noaa.oisst.v2.highres/"
>>>>>>>>                   regExp="sst\.day\.mean\.....\.v2\.nc$" subdirs="false"/>
>>>>>>>>         </aggregation>
>>>>>>>>     </netcdf>
>>>>>>>> </dataset>
>>>>>>>> The behavior we are getting in our time series, which is based on
>>>>>>>> NetCDF files with a year's worth of time steps (or less), is as follows:
>>>>>>>> In between re-boots of Tomcat, new time steps added to the latest file
>>>>>>>> are not added to the aggregation.  However, if the calendar marches along
>>>>>>>> and a new file for a new year is added to our archive without rebooting
>>>>>>>> Tomcat, the timesteps for the new file are added, without the ones that
>>>>>>>> would complete the previous year, resulting in a discontinuity along the
>>>>>>>> time axis.  And someone somewhere may e-mail us complaining that our
>>>>>>>> OPeNDAP object is not CF-compliant because the time steps aren't all of
>>>>>>>> the same size.  %}
>>>>>>>> I looked at the featureCollection documentation link you gave, but since
>>>>>>>> our data are not forecasts, nor point data, nor in GRIB2 format, that
>>>>>>>> didn't seem the right fit.  Maybe I'm wrong; I'm severely sleep-deprived
>>>>>>>> right now....
>>>>>>>> We also have some time series in monthly files (to keep the individual
>>>>>>>> file size under 2 Gbytes).  We have not tried aggregating any of those
>>>>>>>> time series.  Could be an interesting challenge.
>>>>>>>> Thanks for any help.
>>>>>>>> -Hoop
