Welcome back to AWIPS Tips! It has been a while since we talked about the backend portion of AWIPS: EDEX. For a brief introduction, please check out this previous entry of AWIPS Tips about EDEX. Today we are going to cover the important topic of data retention. Being mindful of data retention matters because data can take up a significant amount of space on your server. Sometimes that is expected and desired, as when you want a long archive (longer than 5 days, perhaps). But if you are unprepared for the amount of data that will be kept, it can cause all kinds of problems, and it may not be immediately obvious that the root cause is your machine running out of disk space.
There are two kinds of data that are stored and can pile up on your EDEX machine: raw data and processed data. Each has its own purging mechanism that can be tweaked to your preference.
Raw data is what comes directly from the LDM and gets stored on the machine before EDEX has ever “touched” it. This data is stored in the following directory:
There can be many subdirectories in this folder, depending on what data you are pulling from the LDM. Raw data retention is regulated by the following file:
This file has one defaultRetentionHours flag and many selectedRetentionHours flags. The data in data_store can be broken up into categories, and each category has a selectedRetentionHours flag that applies specifically to that data. For example, the default RAW_DATA.xml file we distribute with our EDEX installation defines a model category. The defaultRetentionHours flag applies to all data that does not fall under a specific category.
Additional purging of the raw data is done by an LDM utility called scour. Scour frequency can only be defined down to the day (not the hour), and sometimes that is still not frequent enough to keep a disk from filling up. Scour runs from a crontab entry, so it still runs even when EDEX is down, which can be helpful if LDM is running by itself.
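The fallback between the category-specific and default retention values can be sketched in a few lines of Python. This is only an illustration of the lookup logic described above, not EDEX code: the hour values, the "radar" category, and the function names are hypothetical, and the real values live in RAW_DATA.xml.

```python
from datetime import datetime, timedelta
from typing import Dict

# Hypothetical values for illustration only; real settings come from RAW_DATA.xml.
DEFAULT_RETENTION_HOURS = 24
SELECTED_RETENTION_HOURS: Dict[str, int] = {
    "model": 12,  # category named in the post; the hour value is made up
    "radar": 6,   # hypothetical category
}


def retention_hours(category: str) -> int:
    """Return the category-specific retention, or the default if none is set."""
    return SELECTED_RETENTION_HOURS.get(category, DEFAULT_RETENTION_HOURS)


def is_expired(category: str, modified: datetime, now: datetime) -> bool:
    """True when a raw file is older than the retention window for its category."""
    return now - modified > timedelta(hours=retention_hours(category))
```

With these made-up numbers, a 13-hour-old file in the model category would be eligible for purging, while a file of the same age in an uncategorized directory would be kept until it passes the 24-hour default.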
NOTE: If you have a distributed EDEX installation, you may also need to set up additional raw data purging mechanisms to avoid overflow.
Processed data is data that has been decoded and ingested by EDEX, and is stored on the server for access by CAVE and Python-AWIPS. For this data, the method of retention and purging is managed by xml files in the following directory:
As with the raw data, there are mechanisms for default retention and for retention of specific data types. There is a defaultPurgeRules.xml file that applies to all data that does not have a specific rule established. All other files are for specific data types, such as gridPurgeRules.xml for model data and satellitePurgeRules.xml for GOES and other satellite imagery.
Each of the individual purge rules xml files can have a default rule, as well as additional rules for more specific data, designated by matching key and keyValue entries. Retention for processed data can be defined with time-based or frame-based rules.
Time-based rules are defined with a DD-HH:MM:SS period, which means you can decide to keep data around for a set number of days or hours. They can look like this:
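To make the DD-HH:MM:SS format concrete, here is a small Python sketch that turns such a period string into a duration. It is not part of EDEX; the function name parse_period and the exact validation are assumptions made for illustration.

```python
import re
from datetime import timedelta

# Days, then hours:minutes:seconds, e.g. "05-00:00:00" for five days.
_PERIOD = re.compile(r"^(\d+)-(\d{1,2}):(\d{2}):(\d{2})$")


def parse_period(text: str) -> timedelta:
    """Convert a DD-HH:MM:SS string into a timedelta (hypothetical helper)."""
    match = _PERIOD.match(text.strip())
    if match is None:
        raise ValueError(f"not a DD-HH:MM:SS period: {text!r}")
    days, hours, minutes, seconds = (int(part) for part in match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
```

Under this reading, "05-00:00:00" means five days and "00-12:00:00" means twelve hours.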
Frame-based rules are defined with a number of versionsToKeep, meaning you can decide to keep, for example, the last 30 model runs. They can look like this:
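The idea behind a versionsToKeep rule can be illustrated with a short Python sketch: sort the available times newest first, keep the first N, and purge the rest. This is a simplified model of the concept, not the EDEX purger itself, and the function name is hypothetical.

```python
from datetime import datetime
from typing import Iterable, List, Tuple


def split_by_versions(
    times: Iterable[datetime], versions_to_keep: int
) -> Tuple[List[datetime], List[datetime]]:
    """Return (kept, purged) times, keeping only the newest versions_to_keep."""
    if versions_to_keep < 0:
        raise ValueError("versions_to_keep must be non-negative")
    # Deduplicate, then order newest first so the slice keeps the latest runs.
    ordered = sorted(set(times), reverse=True)
    return ordered[:versions_to_keep], ordered[versions_to_keep:]
```

For four model runs at 00, 06, 12 and 18 UTC with versions_to_keep set to 2, the 12 and 18 UTC runs are kept and the two older runs are purged.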
TIP: If you wish to keep all data for a given amount of time (e.g. 5 days), you can move or remove all of the purge rule xml files and leave just defaultPurgeRules.xml, set to save data for 5 days. We recommend moving the existing purge rule files into a folder in that directory, so that you can reference them if you ever want to change your purge settings.
Adjusting the retention and purging of your EDEX server can be critical to having a system that runs efficiently while still providing the data your users need. We hope you found this installment of AWIPS Tips helpful. Be sure to check back in two weeks for the next blog post, about NUCAPS Soundings in CAVE.
To view archived blogs, visit the AWIPS Tips blog tag, and get notified of the latest updates from the AWIPS team by signing up for the AWIPS mailing list. Questions or suggestions for the team on future topics? Let us know at email@example.com