Why are GPUs Exciting for Machine Learning Research?

Figure: A high-end GPU (NVIDIA A100 graphics card).

Machine Learning systems are often configured around Graphics Processing Units (GPUs) rather than Central Processing Units (CPUs). Why should this be the case, in an era when CPUs are powerful and (relatively) inexpensive?

First, let's discuss what a GPU is. A GPU is a specialized piece of hardware that was originally designed to process visual information and drive large, high-resolution monitors. Historically, an upgraded GPU was needed for 3D game rendering, engineering and architectural modeling, or data visualization. Now we can use this specialized hardware to perform certain types of math, chiefly the large matrix and vector operations at the heart of machine learning, very efficiently and at scale. In previous roles, I would fill up large CPU clusters without any GPUs to do subsurface geophysical modeling; now we can do similar modeling at scale on a handful of GPUs. Today, manufacturers are building GPUs specifically for machine learning and modeling rather than for visual rendering.

Figure: Relative speedups for different GPUs.

For large datasets, the speed increase provided by even a relatively inexpensive GPU over a beefy server CPU can be impressive. A technique called gradient boosting is particularly useful for predictive models that analyze ordered (continuous) data. The figure above compares the speedup over a powerful Intel Xeon E5-2660v4 CPU-based server (with 56 logical cores and 512GB of RAM) provided by different GPUs, ranging from the 300 USD NVIDIA K40 to the roughly 14000 USD NVIDIA V100. You can read the blog post from NVIDIA for more details, but the speed improvements for machine learning tasks are significant.
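To measure this kind of speedup on your own hardware, you could time the same gradient boosting fit on a CPU and on a GPU. The sketch below is one way to do that, under assumptions that do not come from this post: it uses the XGBoost library (version 2.0 or later, where `device="cuda"` selects the GPU), NumPy, and synthetic data. The helper names `time_call`, `speedup`, `fit_gradient_boosting`, and `compare_devices` are hypothetical.

```python
import time


def time_call(fn, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def fit_gradient_boosting(device, n_rows=200_000, n_features=20):
    """Fit a gradient-boosted regressor on synthetic data.

    Assumption: numpy and xgboost >= 2.0 are installed. They are imported
    here so the timing helpers above work without them.
    """
    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(n_rows, n_features))
    y = 2.0 * X[:, 0] + rng.normal(size=n_rows)
    # device is "cpu" or "cuda"; "hist" is the histogram-based tree builder.
    model = xgb.XGBRegressor(n_estimators=200, tree_method="hist", device=device)
    model.fit(X, y)
    return model


def compare_devices():
    """Print the GPU-over-CPU speedup; needs a CUDA-capable GPU to run."""
    cpu_seconds = time_call(lambda: fit_gradient_boosting("cpu"))
    gpu_seconds = time_call(lambda: fit_gradient_boosting("cuda"))
    print(f"CPU {cpu_seconds:.2f}s, GPU {gpu_seconds:.2f}s, "
          f"speedup {speedup(cpu_seconds, gpu_seconds):.1f}x")
```

Calling `compare_devices()` on a machine with a CUDA-capable GPU prints the measured ratio. The result depends heavily on dataset size and hardware, and small datasets may show little or no benefit.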

Can you use any GPU for machine learning?

Short answer: No. Most machine learning software is designed to take advantage of relatively new NVIDIA products. Installing NVIDIA/CUDA drivers is not for the faint of heart; this is one reason (besides cost!) to use a cloud service. A high-end NVIDIA GPU can cost around 1000 USD, but a year of Google Colab (see below) is 120 USD. Relatively new Apple products have GPUs that can be accessed with special software, but these workflows and use cases are far from ubiquitous.
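A quick way to check whether a machine has a usable NVIDIA GPU and driver is to ask the `nvidia-smi` command-line tool, which is installed alongside the NVIDIA driver. The sketch below assumes that tool is the right indicator for your setup; the function name `list_nvidia_gpus` is hypothetical. It returns an empty list when the tool is missing or fails, which usually means no NVIDIA GPU or no working driver.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return one string per NVIDIA GPU (name and total memory), or [] if none found."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # nvidia-smi is not on PATH: no NVIDIA driver, or not an NVIDIA machine.
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (subprocess.SubprocessError, OSError):
        # The tool exists but could not talk to a driver or GPU.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
if gpus:
    for gpu in gpus:
        print("Found GPU:", gpu)
else:
    print("No NVIDIA GPU detected; a cloud service may be the easier route.")
```

This only detects NVIDIA hardware. Apple GPUs and other vendors need their own checks through whichever framework you plan to use.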

Does every project need GPUs?

For many projects with fewer than a million data points and scikit-learn-style models, there is no need to use a GPU. Larger or image-based datasets could see a significant improvement in runtime or usability when using a GPU. Using Google Colab or other platforms, anyone can learn how to use a GPU for smaller problems.

Python Packages that use GPU capabilities

While the landscape is changing fast, the main packages that I use that support using a GPU are below:

Access to Cloud GPUs

If you do not have access to local hardware with a modern GPU, various commercial cloud-hosted products can provide access. Google Colaboratory (Colab) is the service I recommend. It has a free version suitable for many tasks; its least expensive paid tier (Colab Pro) costs 10 USD per month. Paperspace, Amazon SageMaker, and Azure ML are similar products that use a Jupyter Notebook interface.
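To put the cost comparison in concrete terms, the sketch below works out how long a cloud subscription would have to run before it costs as much as buying a GPU. It uses the figures quoted in this post (about 1000 USD for a high-end GPU, 10 USD per month for Colab Pro). It is a simplification that assumes prices stay fixed and ignores electricity, the rest of the workstation, and cloud usage limits. The function name `breakeven_months` is hypothetical.

```python
def breakeven_months(gpu_price_usd, cloud_monthly_usd):
    """Months of cloud subscription that add up to the GPU purchase price."""
    if cloud_monthly_usd <= 0:
        raise ValueError("cloud_monthly_usd must be positive")
    return gpu_price_usd / cloud_monthly_usd


months = breakeven_months(1000, 10)
print(f"{months:.0f} months (about {months / 12:.1f} years)")
# prints "100 months (about 8.3 years)"
```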

Through Unidata's Science Gateway, hosted on the NSF-funded Jetstream2 Cloud, Earth Systems Science professionals, educators, and students can access a GPU-enabled JupyterHub server. For more information on how to request a free dedicated server, please contact Unidata Science Gateway staff at support-gateway@unidata.ucar.edu. We’d love to hear about your exciting new projects!

Other Resources:

Thomas Martin is an AI/ML Software Engineer at the Unidata Program Center. Have questions? Contact support-ml@unidata.ucar.edu or book an office hours meeting with Thomas on his Calendar.

News@Unidata
News and information from the Unidata Program Center