The increasing use of machine learning tools in various industries and their potential impact on society has led to increased interest and coverage in the news. There have been significant advancements in machine learning technology over the past few years, which have resulted in the development of powerful and sophisticated tools. These tools have the ability to learn from large amounts of data and make predictions or decisions based on that data. Large language models — a type of artificial intelligence that is capable of understanding and generating human language — use deep learning techniques to analyze vast amounts of text data and learn the underlying patterns and structures of language.
Scientists can use machine learning techniques to analyze data and generate insights from the data that they have collected. The technology can help identify patterns or trends in data sets that may be difficult to detect using traditional statistical methods.
The preceding paragraphs, generated by the ChatGPT system, illustrate some of the capabilities of the artificial intelligence/machine learning (AI/ML) systems that are engendering so much excitement today. Not to be left behind, the Unidata Program Center (UPC) software development team has been searching for ways to leverage the technologies behind popular packages like ChatGPT and Stable Diffusion to help Earth Systems Sciences researchers, educators, and students reach scientific conclusions more quickly and with less effort.
“We asked ourselves, what if we could build machine learning algorithms right into our data visualization and analysis software packages?” says Unidata AI/ML developer Thomas Martin. “Exposing the power of these techniques for everyday use by scientists, that just seems like a great way to help them optimize their scientific efforts and get on to other projects!”
Unidata's version of the National Weather Service's Advanced Weather Interactive Processing System (AWIPS) was an obvious candidate for initial tests of the technique. Using ChatGPT-like technology, developers Tiffany Meyer and Shay Carter augmented the Unidata AWIPS distribution to generate NWS-style Watches and Warnings based entirely on eyewitness descriptions of hyperlocal conditions. For example, a human observer remarking "it's raining cats and dogs out there" produces this Warning:
A HEAVY DOWNPOUR OF CATS AND DOGS MAY CAUSE LOCALIZED FLOODING AND OBSTRUCTED VISIBILITY, POSING A RISK TO MOTORISTS AND PEDESTRIANS.
THIS TYPE OF PRECIPITATION CAN ALSO BE HARMFUL TO ANIMALS. PET OWNERS ARE ADVISED TO KEEP THEIR PETS INDOORS AND AWAY FROM WINDOWS DURING THE DURATION OF THE EVENT.
THE NATIONAL WEATHER SERVICE WILL CONTINUE TO MONITOR THE SITUATION CLOSELY AND ISSUE UPDATES AS NEEDED.
Augmenting Unidata's Integrated Data Viewer (IDV) to produce 3D renderings of hypothetical atmospheric conditions opens the door to new exploratory learning opportunities. While the effects of atmospheric rivers are first-hand knowledge for many students on the West Coast of the United States, students in other regions are not so lucky. Using Stable Diffusion-like image generation in the IDV lets users in any area visualize the effects of an atmospheric river in their home states.
Unidata Director Mohan Ramamurthy is enthusiastic about building AI/ML tools directly into Unidata software packages. “By incorporating machine learning techniques into Unidata software, we can catapult our visualization tools into new and exciting realms,” he says. “These advances, coupled with the GPU capabilities of the NSF's Jetstream2 cloud computing environment, can help our community reduce its 'Time to Science' to something approaching zero!”
The machine learning revolution at Unidata won't be limited to visualization software, either. Alongside the new software packages, the UPC will be debuting ChatUNIDATA to take the place of the existing eSupport technical support system. Now, rather than having to correspond with human software developers who have direct experience with Unidata packages, community members will get instant automated answers generated by the UPC's custom large language model, which has been trained using more than thirty years of archived support e-mail messages.
While Unidata is committed to bringing the exciting new benefits of machine learning to the community, not every aspect of our operation is changing. “We thought about relocating the Unidata Data Hallway to the Jetstream cloud, but they just don't have the sorts of hardware we're used to working with,” says Unidata systems administrator Mike Schmidt. “For now, anyway, we'll be sticking with the computing solutions that have served us for so long.”
Public versions of Unidata's AI/ML-enabled software tools are slated for release first thing on April Fools' Day, 2023.