
[THREDDS #QNQ-559947]: Ncml at reinit



Hi Jonathan,

> > Why do you need to reload the catalogs? Sorry if I'm forgetting details
> > we've already discussed.
> > Are you aggregating datasets? Is the problem something that the
> > "recheckEvery" attribute on the "aggregation"
> > element won't handle?
> 
> Because our models run at different frequencies: some of them produce output
> every day, others every 3 hours ...
> 
> Actually, we don't use the scan capabilities because we need to generate the
> intermediate NcML parts of the catalogs for
> use in another project. That's why we will generate only static catalogs
> (all files explicitly defined / no scan).
> 
> Our needs
> 
> We have different products (model outputs) we would like to put into
> THREDDS catalogs. For each of these products,
> we want to access them through different views:
> - A "run view" presenting each file independently, sorted by run time =>
> one run == one dataset
> - A "5D view" presenting all the files as one huge dataset with a newly
> created dimension: the run time (aggregation type joinNew)
> - A "best estimate view" made with aggregation type joinExisting:
> For example, we have some netCDF files containing 24 hours of analysis and
> 48 hours of forecast, and
> we want to aggregate only the first 24 hours (analysis time) of each file
> and the entire last run file (analysis + forecast).
> 
> We succeeded in producing these three views, but for one last view, the
> "forecast offset date" view, we don't know how to proceed because
> it does not seem possible to extract specified data from a netCDF file. I
> mean, for example, taking only the values corresponding
> to a specified time value (selecting all variables for time=1,
> time=24, time=48 ... to present the data per forecast hour).

That is right. You can't select one coordinate point from a dataset in an 
aggregation.

> I know that there is the datasetFmrc capability, which in fact generates the
> needed views, but unfortunately, as I wrote above,
> we need static catalogs with as much metadata as possible, and we are not
> able to extract the forecast time from the file names
> for some of our products (because their names do not contain the
> datetime marker, or the run or forecast time is placed in the
> directory name ...).

I don't understand your metadata restriction. Metadata can be inherited by
lower level collections. So, you can add metadata to the datasetFmrc and it
will be in all the sub-collections. Is there a lot of metadata unique to the
FMRC sub-collections? Supporting metadata specific to individual
sub-collections is something we have talked about doing, but it is not
there yet.
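
As a rough sketch of the inheritance described above (not a tested
configuration): metadata wrapped in a metadata element with inherited="true"
on the datasetFmrc should show up in each generated sub-collection. The
dataset name, path, service name, directory, file suffix, and dateFormatMark
below are all hypothetical placeholders, and the referenced service would
need to be defined elsewhere in the catalog.

```xml
<!-- Hypothetical fragment of a TDS configuration catalog -->
<datasetFmrc name="Example Model FMRC" path="fmrc/example/model">

  <!-- inherited="true": applies to this dataset and to every
       sub-collection generated beneath it -->
  <metadata inherited="true">
    <serviceName>exampleService</serviceName>
    <dataType>Grid</dataType>
    <documentation type="summary">
      Example model output, analysis plus forecast.
    </documentation>
  </metadata>

  <netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
    <aggregation dimName="run" type="forecastModelRunCollection"
                 recheckEvery="15 min">
      <!-- placeholder directory and file-name date pattern -->
      <scan location="/data/example/model/" suffix=".nc"
            dateFormatMark="model_#yyyyMMdd_HHmm"/>
    </aggregation>
  </netcdf>
</datasetFmrc>
```

Note that this form relies on a scan with a date pattern in the file name, so
it only fits products whose file names carry the run time.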
 
 
> > We have seen similar issues having to do with the JVM PermGen memory. Are
> > you getting "java.lang.OutOfMemoryError: PermGen space" messages? If so,
> > try increasing the PermGen space with the "-XX:MaxPermSize".
> 
> Ok, I will try that.
> 
> For the moment (as long as the NcML parts are not reloaded when using
> thredds/debug/reinit), we will restart THREDDS from the Tomcat Manager
> interface. I wonder whether someone sending a request to the server during the
> restart would get an error because THREDDS is turned off and then on.

Yes, I believe that if someone sends a request while the TDS is shut down,
they will get an error.
 
> Some of the questions I asked in the previous mail are repeated in this
> mail, especially the "selection of parts of netCDF files".
> Do you have any ideas about that?
> (One solution would be to write all the variables, dimensions and values
> directly into the catalog, but
> I think that is a really bad solution because of the file sizes, which can be
> greater than 100 MB.)

I don't see a solution for this at the moment, other than writing new netCDF
files with just the desired data in them. But that kind of defeats the purpose
of aggregation.

Ethan

Ticket Details
===================
Ticket ID: QNQ-559947
Department: Support THREDDS
Priority: Normal
Status: Open