This will be a challenge for sure. I think the FMRC will probably solve it. However, a 75,000 file aggregation will be a challenge. I'm actually pretty sure we can solve it (with enough server memory!) but it does worry me that with a single DODS call, someone could make a request that requires opening 75,000 files to satisfy. OTOH, if that's the service you want to provide, it sure is a lot better doing it on the server!!! Any thoughts?
The NARR, for example, will be an aggregation of ~75000 grib files.
Stored in a basic ./YYYYMM/YYYYMMDD tree. The recursive datasetScan
tag added recently helps a ton with this. Some of our datasets have
forecast hours, some don't. Doing a forecast hour aggregation across
the 00hr will help tremendously with all of them, however.
While it works wonderfully for NetCDF, I cannot see the NcML agg.
working with this set of data,
mainly due to the changing reference times.
Do you store each hour separately, or are all the forecast hours for a run in the same file?
According to NCEP, our NAM & GFS will soon be forced into GRIB2. But the NCDC-NOMADS NWP is currently entirely a GRIB-1 archive. Only recently have home-grown NCDC datasets been created in NetCDF.
For NAM & GFS, we have about 6 months online, which comes out to
about 700 files when stripped to a single forecast time
(say 00hr) aggregation. But there are 61 forecast times for GFS, and 21
Especially for GRIB files, you likely need this new "Forecast Model Run Collection Aggregation" capability. We have been working with our IDD NCEP GRIB files, and there are some complications, especially non-homogeneity due to missing records and variable time and vertical dimensions, that can't really be solved by the current (index based) aggregation. Ethan and I will work closely with you guys to get this working. I'd like to understand what you have in more detail: number and types of files, how they are stored, etc. Can you or someone summarize?
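One possible way to produce such a summary is a short script run against the archive root. The sketch below is only an illustration: it assumes the ./YYYYMM/YYYYMMDD layout described earlier in this thread, and the function name summarize_archive and the use of the file extension as a stand-in for "file type" are hypothetical choices, not anything from the TDS or from the NCDC setup.

```python
import os
import re
from collections import Counter

# Assumed layout (from this thread): <root>/YYYYMM/YYYYMMDD/<data files>
DAY_DIR = re.compile(r"^\d{8}$")


def summarize_archive(root):
    """Count files per (month, extension) under a YYYYMM/YYYYMMDD tree.

    Hypothetical helper: the extension is used as a rough proxy for the
    file type (e.g. GRIB vs. NetCDF); real archives may need another rule.
    """
    counts = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        day = os.path.basename(dirpath)
        if not DAY_DIR.match(day):
            continue  # skip the root and the YYYYMM level
        month = day[:6]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            counts[(month, ext)] += 1
    return counts
```

Calling summarize_archive on the archive root and printing the sorted items gives one line per month and extension, which should be enough to see roughly how many files of each kind an aggregation would have to cover.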
NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.