Accelerating End-to-End Data Science Workflows (NAEEDSW-OD)

Whether you work at a software company that needs to improve customer retention, a financial services company that needs to mitigate risk, or a retail company interested in predicting customer purchasing behavior, your organization is tasked with preparing, managing, and gleaning insights from large volumes of data without wasting critical resources. Traditional CPU-driven data science workflows can be cumbersome, but with the power of GPUs, your teams can make sense of data quickly to drive business decisions.


In this Deep Learning Institute (DLI) course, developers will learn how to build and execute end-to-end GPU-accelerated data science workflows that enable them to quickly explore, iterate, and get their work into production. Using the RAPIDS accelerated data science libraries, developers will apply a wide variety of GPU-accelerated machine learning algorithms, including XGBoost, cuGraph’s single-source shortest path, and cuML’s KNN, DBSCAN, and logistic regression, to perform data analysis at scale.


All enrolled students get access to fully configured, GPU-accelerated servers in the cloud.


Learning Objectives

By participating in this workshop, you’ll learn how to:

  • Implement GPU-accelerated data preparation and feature extraction using cuDF and Apache Arrow data frames.
  • Apply a broad spectrum of GPU-accelerated machine learning tasks using XGBoost and a variety of cuML algorithms.
  • Execute GPU-accelerated graph analysis with cuGraph to complete massive-scale analytics in a short amount of time.
  • Build beautiful data visualizations with the GPU-accelerated cuxfilter library.


Prerequisites

  • Experience with the Python programming language.
  • Familiarity with pandas DataFrames.
  • Familiarity with the scikit-learn machine learning library.


Suggested Resources to Satisfy Prerequisites


Tools, Libraries, and Frameworks Used

  • RAPIDS
  • cuDF
  • cuML
  • cuGraph
  • Apache Arrow