Web Scraping Unidata Data Servers

Scripts and other tools that automate the retrieval of our website’s content can have adverse side effects on our servers and on other users. We may therefore block your access to our servers if we detect such automation. We do this to ensure all users have the best experience possible.

Please note: Unidata is funded by the National Science Foundation to provide data and services to the geoscience communities for education and research. If your usage is personal, commercial, or outside Unidata's community, your access to data may be restricted. Please see our Guidelines for Data Use and Participation Policy for more information.

If your research requires you to programmatically scrape content from the web, please review the following guidelines, suggestions, and alternatives.

Rate Limits

The volume and frequency of queries you make should not burden our servers or interfere with their normal operations.

Therefore, we ask you to limit your request rate and use appropriate time intervals. If a product updates once an hour, there’s no need to check every minute.
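As a minimal sketch of this guideline, the Python helper below enforces a minimum delay between successive requests. The class name Throttle and its parameters are hypothetical and chosen for illustration only; pick an interval that matches how often the product you need is actually updated.

```python
import time


class Throttle:
    """Enforce a minimum delay between successive requests (illustrative sketch)."""

    def __init__(self, min_interval_seconds, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the logic can be tested without waiting.
        self.min_interval = min_interval_seconds
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until min_interval has passed since the last call; return seconds slept."""
        slept = 0.0
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
                now = self._clock()
        self._last = now
        return slept
```

A script would create one instance, for example Throttle(3600) for a product that updates hourly, and call wait() immediately before each request.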

Use HTTPS Instead Of HTTP

We redirect all HTTP requests to HTTPS. Making your requests directly via HTTPS will cut down on unnecessary server requests.
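One way to avoid the extra redirect is to normalize URLs before requesting them. The sketch below uses only the Python standard library; the function name to_https is hypothetical, and the host in the usage note is a placeholder rather than a real data server address.

```python
from urllib.parse import urlsplit, urlunsplit


def to_https(url):
    """Return url with an http scheme upgraded to https; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```

For example, to_https("http://example.com/data/file.nc") returns "https://example.com/data/file.nc". This sketch does not rewrite explicit port numbers, so a URL that names port 80 would still need to be corrected by hand.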

Pay Attention To Server Response Codes

Check on your script regularly, and handle errors for requests that return unexpected response codes or results. Products and data change locations from time to time, and you’ll need to update your script to reflect those changes.
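The following Python sketch shows one possible way for a script to react to HTTP status codes. The function name and the action labels are hypothetical, and the mapping is an illustrative assumption rather than a statement of how our servers respond; adapt it to what you observe in practice.

```python
def action_for_status(status):
    """Map an HTTP status code to a suggested action for a polling script."""
    if 200 <= status < 300:
        return "process"          # success: use the response body
    if status in (301, 308):
        return "update-url"       # permanent redirect: the resource has moved
    if status in (403, 404, 410):
        return "stop-and-review"  # blocked or missing: do not keep retrying
    if status in (429, 503):
        return "back-off"         # server is busy: wait longer before retrying
    return "log-and-review"       # anything else: record it and inspect by hand
```

The main design choice is that nothing except a successful response leads to an immediate retry, so a broken or outdated script does not keep sending requests.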

Identify Yourself Via The User Agent

If possible, please specify your identity and contact information in the User-Agent HTTP Request Header. Example:

                 Company or Institution Name (admin.bob@example.com)

We regularly examine our log files for activity that may be affecting the performance of our servers or web services. If we encounter a problem with your client and can identify you via the User-Agent information, we will contact you to resolve the matter instead of blocking you.
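As a sketch of how a client might set this header, the Python standard library's urllib.request.Request accepts a headers dictionary. The function name build_request is hypothetical, and the URL, institution name, and email address in the usage note are placeholders to be replaced with your own details.

```python
import urllib.request


def build_request(url, organization, contact_email):
    """Build a request whose User-Agent names the requester and a contact address."""
    user_agent = f"{organization} ({contact_email})"
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

For example, build_request("https://example.com/data", "Example University", "admin.bob@example.com") creates a request that can then be passed to urllib.request.urlopen. Building the request does not send anything over the network.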

Alternate Sources Of Data

If you require access to large volumes of data from multiple datasets at a high frequency, web scraping is not the recommended approach for acquiring these data. In such a case, you may consider having your institution become a node for the Internet Data Distribution (IDD) system. For more information, please contact support@unidata.ucar.edu.

Questions And Getting Assistance

We monitor our web usage statistics on a regular basis. To maintain the performance of our servers and to ensure compliance with the above rules, we may block violators of any of the above terms from accessing our servers.

If you believe you may have been locked out of our website, please contact us at support@unidata.ucar.edu and we will work to restore your access.

Likewise, please reach out to us if you have any questions about optimizing your client’s requests.

We hope you understand and comply with these rules, which are designed to allow everyone to continue enjoying our data servers!