
Re: Dealing with large archives

Tennessee Leeuwenburg wrote:

Secondly:

I am trying to work out how to structure my data by date. I will have a number of data sets (NWP Models) which will get updated daily, or even multiple times per day. Quite quickly I will reach the point where I will have hundreds of data sets published. Even a week's worth of data at 2 per day across 3 sources is 42 data sets.

I have two tasks - one would be to automate the updating of the configuration files so that new data sets get incorporated as they become available, and the other would be structuring the data pages in a sensible way for users to access.

I was wondering what practices people have adopted or found successful in the past with regard to handling large amounts of data? Have people typically arranged archive data as aggregations, or linked to archive catalogs from the top-level catalog? What have people found best?


We do both: our top-level catalogs link to lower-level catalogs, and we aggregate as much as we can and are working towards aggregating more. There is very little you can do that is more helpful to the user than aggregating many datasets into one. Of course, whatever technology you choose to do the aggregation needs to serve the data quickly.
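As a rough illustration of the "top level links to lower levels" layout, here is a minimal Python sketch that groups dataset files by source and then by day, which is the shape a generated top-level catalog could follow. The file naming pattern (`<source>_<YYYYMMDDHH>.nc`), the source names, and the functions `parse_name` and `build_index` are all assumptions made for the example, not anything from a particular catalog server.

```python
from collections import defaultdict
from datetime import datetime


def parse_name(name):
    """Split a hypothetical '<source>_<YYYYMMDDHH>.nc' file name
    into (source, run time). The naming scheme is an assumption."""
    stem = name.rsplit(".", 1)[0]
    source, stamp = stem.rsplit("_", 1)
    return source, datetime.strptime(stamp, "%Y%m%d%H")


def build_index(names):
    """Group file names as {source: {ISO date: [files]}}.

    Each source would become a lower-level catalog, and each date
    a group of runs that could be aggregated into one dataset.
    """
    index = defaultdict(lambda: defaultdict(list))
    for name in names:
        source, run = parse_name(name)
        index[source][run.date().isoformat()].append(name)
    # Return plain, sorted dicts so the output order is stable.
    return {
        source: {day: sorted(files) for day, files in sorted(days.items())}
        for source, days in sorted(index.items())
    }


if __name__ == "__main__":
    # One week, 2 runs per day, 3 hypothetical sources: 42 data sets.
    names = [
        f"{source}_200601{day:02d}{hour:02d}.nc"
        for source in ("modelA", "modelB", "modelC")
        for day in range(1, 8)
        for hour in (0, 12)
    ]
    index = build_index(names)
    for source, days in index.items():
        print(source, len(days), "days")
```

Rerunning something like this whenever new files arrive, and writing the catalog configuration from the resulting index, is one way the updates could be automated. How each day's group gets aggregated depends on the server you use.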


--
Dr. M. Benno Blumenthal    address@hidden
International Research Institute for climate prediction
The Earth Institute at Columbia University
Lamont Campus, Palisades NY 10964-8000    (845) 680-4450

NOTE: All email exchanges with Unidata User Support are recorded in the Unidata inquiry tracking system and then made publicly available through the web. If you do not want to have your interactions made available in this way, you must let us know in each email you send to us.