NEXRAD Archive data available on Amazon S3

Hurricane Katrina

The Big Data Project (BDP) is an initiative undertaken by the National Oceanic and Atmospheric Administration (NOAA) to increase public availability of large volumes of environmental data collected and generated by the agency. As part of the Big Data Project, Unidata is working in collaboration with Amazon Web Services (AWS) on a demonstration project to provide access to more than twenty years of archived NEXRAD Level II radar data (augmented continuously with new, real-time data) stored in Amazon's Simple Storage Service (S3) environment. In addition to assisting AWS with ingesting new data flowing from the NEXRAD sites, Unidata Program Center staff have set up a THREDDS Data Server in the AWS environment to provide services allowing community access to the stored data.

About the Big Data Project

According to NOAA's BDP web page, “the Big Data Project is an innovative approach to publishing NOAA's vast data resources and positioning them near cost-efficient high performance computing, analytic, and storage services provided by the private sector.” In practice, this means that NOAA is making selected data assets available for five “Infrastructure as a Service” (IaaS) providers to upload to their cloud systems if they choose: Amazon Web Services (AWS), Google, IBM, Microsoft, and the Open Cloud Consortium. NOAA will continue to provide public access to the data via its traditional mechanisms as well.

What Data are Available

The project data collection consists of NEXRAD Level II radar data collected between 1991 and 2015, stored at NOAA's National Centers for Environmental Information (formerly the National Climatic Data Center). The data set consists of more than 250 TB of compressed data (1 petabyte uncompressed), approximately half of which was stored on magnetic tape. The complete archive is now available on AWS; transfers to some of the other IaaS providers are still in progress.

In addition to the archive data, new Level II data are being added to the collection in near real time. NEXRAD Level II scans are performed continuously at 160 radar sites in North America. At each radar site, as each “chunk” (100 radial degrees, 1 tilt) of a scan is completed, the data are distributed via Unidata's Local Data Manager (LDM) technology to subscribing sites. As part of this project, the individual chunks are delivered to AWS and stored temporarily in an S3 bucket, awaiting the remaining chunks that make up the full 3-dimensional volume scan. Once all of the chunks that make up one scan are determined to be present, the chunks are combined into an aggregate volume dataset and stored permanently in the collection S3 bucket.
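
The "all chunks present" check can be illustrated with a short Python sketch. This post does not say how completeness is actually determined, so everything here is an assumption made for illustration: that each chunk carries a sequence number and a type label, that the labels are "S" (start), "I" (intermediate), and "E" (end), and the function name volume_is_complete itself.

```python
from typing import Iterable, Tuple


def volume_is_complete(chunks: Iterable[Tuple[int, str]]) -> bool:
    """Return True if the chunks form an unbroken run from a start to an end chunk.

    Each chunk is a (number, type) pair. The "S"/"I"/"E" type labels are an
    assumption for this sketch, not something stated in the post.
    """
    by_number = {number: kind for number, kind in chunks}
    if not by_number:
        return False
    first, last = min(by_number), max(by_number)
    # The lowest-numbered chunk must open the scan and the highest must close it.
    if by_number[first] != "S" or by_number[last] != "E":
        return False
    # Every chunk number between the first and last must have arrived.
    return all(n in by_number for n in range(first, last + 1))
```

A scan with chunks (1, "S"), (2, "I"), (3, "E") would be treated as complete, while the same scan missing chunk 2 would keep waiting.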

Accessing the Data via TDS

Members of Unidata's university community can access the collection via this THREDDS Data Server: http://thredds-aws.unidata.ucar.edu/thredds/catalog.html
(To connect using the IDV, substitute catalog.xml for catalog.html when entering the URL in the Data Chooser.)

We encourage community members to experiment with accessing the collection via the TDS. Note, however, that because this is a demonstration project, we cannot guarantee long-term access to the server. Similarly, because Unidata has limited resources available for this demonstration, access to this particular TDS is restricted to connections from .edu domains.
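
For clients other than the IDV, the catalog.xml form of the URL returns a THREDDS catalog document that can be walked programmatically. The following Python sketch parses such a document with the standard library only. The sample catalog text and the function name list_catalog_refs are made up for illustration; the real catalog on this server may be organized differently.

```python
import xml.etree.ElementTree as ET

# XML namespaces used by THREDDS catalog documents.
THREDDS_NS = "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical catalog content, standing in for a real server response.
SAMPLE_CATALOG = f"""<catalog xmlns="{THREDDS_NS}" xmlns:xlink="{XLINK_NS}">
  <dataset name="Sample radar collection">
    <catalogRef xlink:title="2005" xlink:href="2005/catalog.xml" name=""/>
    <catalogRef xlink:title="2006" xlink:href="2006/catalog.xml" name=""/>
  </dataset>
</catalog>"""


def list_catalog_refs(catalog_xml: str) -> list:
    """Return (title, href) pairs for every nested catalog the document links to."""
    root = ET.fromstring(catalog_xml)
    refs = []
    for ref in root.iter(f"{{{THREDDS_NS}}}catalogRef"):
        title = ref.get(f"{{{XLINK_NS}}}title")
        href = ref.get(f"{{{XLINK_NS}}}href")
        refs.append((title, href))
    return refs
```

From a machine on a .edu domain, the text passed to list_catalog_refs could be fetched with urllib.request.urlopen on the catalog.xml URL above, for as long as the demonstration server remains available.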

Accessing the Data in the Amazon S3 Environment

For those comfortable with the AWS environment, access to the collection S3 bucket is unrestricted. If you have an appropriate client, you can connect to the S3 bucket using this URL: http://noaa-nexrad-level2.s3.amazonaws.com/
Inside the S3 bucket, data are stored in the format described in this document.
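
If no dedicated S3 client is at hand, the bucket can also be queried over plain HTTP, because S3 returns an XML listing when the bucket URL is requested with a prefix query parameter. The Python sketch below builds such a request URL and extracts the object keys from a listing. It assumes keys follow a Year/Month/Day/Station/filename layout; the function names and the file names in the sample listing are placeholders, not real archive entries.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BUCKET_URL = "http://noaa-nexrad-level2.s3.amazonaws.com/"
S3_NS = "http://s3.amazonaws.com/doc/2006-03-01/"  # namespace of S3 listing XML

# Hypothetical listing, standing in for a real response from the bucket.
SAMPLE_LISTING = f"""<ListBucketResult xmlns="{S3_NS}">
  <Name>noaa-nexrad-level2</Name>
  <Prefix>2016/08/17/KRLX/</Prefix>
  <Contents><Key>2016/08/17/KRLX/placeholder-file-1</Key></Contents>
  <Contents><Key>2016/08/17/KRLX/placeholder-file-2</Key></Contents>
</ListBucketResult>"""


def listing_url(year: int, month: int, day: int, station: str) -> str:
    """Build a URL that asks S3 to list the keys for one station on one day."""
    prefix = f"{year:04d}/{month:02d}/{day:02d}/{station}/"
    return BUCKET_URL + "?" + urlencode({"prefix": prefix})


def keys_from_listing(listing_xml: str) -> list:
    """Extract the object keys from an S3 bucket listing document."""
    root = ET.fromstring(listing_xml)
    return [element.text for element in root.iter(f"{{{S3_NS}}}Key")]
```

Appending one of the returned keys to the bucket URL gives the download address for that file. S3 returns listings in pages, so a day with many files may need follow-up requests; this sketch reads a single page only.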

Those who can create an AWS EC2 instance in the US East AWS zone can mount the archive S3 bucket directly as described in the Amazon EC2 documentation for S3.

Additionally, those who are interested in the fastest access to the chunked data before it is aggregated into a 3D volume scan can connect to this URL:
http://unidata-nexrad-level2-chunks.s3.amazonaws.com/
or mount the temporary S3 bucket directly as described in the Amazon EC2 documentation for S3. Inside the S3 bucket, data are stored in the format described in this document. Note that the chunked radar data persist in this S3 bucket for a maximum of 24 hours before being scrubbed.
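
Because chunks are short-lived, a client following the chunk bucket needs to work out each chunk's scan time and age. The Python sketch below shows one way to do that. It assumes chunk file names consist of a date, a time, a chunk number, and a chunk type separated by hyphens, that the times are in UTC, and that the 24-hour limit can be approximated from the scan time; all of these, along with the names Chunk, parse_chunk_name, and may_be_scrubbed, are assumptions for illustration and should be checked against the format document.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class Chunk(NamedTuple):
    scan_time: datetime  # start of the volume scan (assumed UTC)
    number: int          # position of the chunk within the scan
    kind: str            # chunk type label, kept as an opaque string


def parse_chunk_name(name: str) -> Chunk:
    """Split an assumed YYYYMMDD-HHMMSS-CHUNKNUM-CHUNKTYPE name into its fields."""
    date_part, time_part, number, kind = name.split("-")
    scan_time = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    return Chunk(scan_time.replace(tzinfo=timezone.utc), int(number), kind)


def may_be_scrubbed(chunk: Chunk, now: datetime,
                    max_age: timedelta = timedelta(hours=24)) -> bool:
    """Return True if the chunk's scan time is older than the retention window."""
    return now - chunk.scan_time > max_age
```

For example, parse_chunk_name("20160817-123456-003-I") would yield chunk number 3 of a scan that started at 12:34:56 on 17 August 2016.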

Unidata community members who run into issues accessing the AWS NEXRAD archive are encouraged to contact Unidata support for assistance. Additional details regarding this AWS Public Data Set, including links to several tutorials on accessing the data, are available in this post on the Amazon Web Services blog.

Access using Python

Unidata developer Ryan May has created a Jupyter (formerly IPython) notebook to demonstrate how to access the THREDDS Data Server (TDS) instance that is serving archived NEXRAD Level II data hosted on Amazon S3. Check out Using Python to Access NCEI Archived NEXRAD Level 2 Data for details.

Comments:

Something is badly broken. Once you "visit" a year, you can't return to it later. The "icon 2006/" becomes just "2006". Clicking on it returns you to a download page.

Anything that has something under "Last Modified", including "index.html", has this problem.

OS is Linux/CentOS
Browser is FireFox

Posted by Kevin Thomas on October 28, 2015 at 03:32 PM MDT #

Thanks for the report! There was a bug in THREDDS' S3 code, which is now resolved. Please let us know if you find any more problems.

Posted by Ryan May on October 28, 2015 at 05:28 PM MDT #

Will others beyond the Members of Unidata's university community be able to access the collection via this THREDDS Data Server?

Posted by George Percivall on October 29, 2015 at 08:19 AM MDT #

Short answer: not in the near term.

The initial configuration of the THREDDS Data Server in the AWS cloud does not require the data user to have an AWS account. As a result, charges for data egress via the TDS are billed to Unidata's AWS account. While Amazon has generously provided Unidata with account credit to cover these costs during the demonstration, Unidata does not currently have funding to cover the data retrieval costs in a general way.

There are several possibilities for changes to this situation in the future. One would be to develop a mechanism whereby data users cover their own data retrieval costs. Alternately, Unidata or some other organization could secure funding to provide an open-access TDS server for the NEXRAD data without direct charges to the data user. From Unidata's perspective, an important part of this demonstration project is an investigation into usage patterns within our core university community and the economics of providing public access to data in a commercial cloud environment.

Posted by Unidatanews on October 29, 2015 at 08:40 AM MDT #

Ryan...

I can no longer replicate the problem. Thanks for the quick fix!

Posted by Kevin Thomas on November 03, 2015 at 02:43 PM MST #

There are several possibilities for changes to this situation in the future.

Posted by alfalah12345 on December 29, 2015 at 03:44 AM MST #

In this pdf, it says everything about how to get data, but does not specify just where to find the values for the variables.

For example, to get a 3D volume scan you go here: http://noaa-nexrad-level2.s3.amazonaws.com/Year/Month/Day/NEXRAD Station/filename

All values are self-explanatory except filename, which is the name of the file containing the data. Where do you get the filename? Without it, you get this:

<Error>
<Code>NoSuchKey</Code>
<Message>The specified key does not exist.</Message>
<Key>2016/08/17/KRLX/</Key>
<RequestId>1DA700E4BF35F5F7</RequestId>
<HostId>
LapWfi97TaSiF6SWVlL5bc0JqTwVeY/miTJjlq023AqkqeUxuZT95u8qvJHpMoihICkJclw0Y50=
</HostId>
</Error>

The same question goes for the chunk files. Where do you get the values for the following?

YYYYMMDD is the date of the volume scan
HHMMSS is the time of the volume scan
CHUNKNUM is the chunk number
CHUNKTYPE is the chunk type

Thanks!

Posted by Jonathan on August 18, 2016 at 12:26 AM MDT #

Hi Jonathan,

Yes, it would be quite difficult to know the chunk numbers, chunk types, or seconds of the volume scan in advance :)

AWS has provided a very nice service to browse the bucket and download files. The NEXRAD files are stored in year/month/day directories by station and are easily navigated using the tool found here:

https://s3.amazonaws.com/noaa-nexrad-level2/index.html

or if from a .edu domain, you can use our THREDDS server at:

http://thredds-aws.unidata.ucar.edu/thredds/catalog.html (.xml for clients)

I hope this makes accessing the NEXRAD files easier for you.

Jeff

Posted by Jeff Weber on August 19, 2016 at 10:42 AM MDT #

Hi Jeff,

I did stumble upon that; however, I was hoping to set up a real-time feed to the latest data and I'm not on a .edu domain. I can't find anywhere that spits out this information programmatically, for example, accessing via a python script.

Posted by Jonathan on August 22, 2016 at 06:58 AM MDT #
