Scaling Workloads Across Multiple GPUs with CUDA C++ (NSWAMGC-OD)

Writing CUDA C++ applications that efficiently and correctly use all available GPUs on a node can substantially improve performance over single-GPU code and makes the most cost-effective use of multi-GPU compute nodes. In this workshop, you will learn to use multiple GPUs on a single node by:

Learning how to launch kernels on multiple GPUs, each working on a subsection of the required work
Learning how to use concurrent CUDA Streams to overlap memory copy with computation on multiple GPUs

Upon completion, you will be able to build robust and efficient CUDA C++ applications that can leverage all available GPUs on a single node.
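
As a rough illustration of the two techniques listed above, the following minimal sketch splits an array across every visible GPU, gives each GPU its own non-default stream, and enqueues copy-in, kernel, and copy-out asynchronously so that work on different devices can overlap. It is not taken from the workshop materials: the kernel `double_elements`, the helper `chunk_size`, the `CUDA_CHECK` macro, and the problem size are all hypothetical names and values chosen for this example.

```cuda
#include <algorithm>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Hypothetical workload: double every element using a grid-stride loop.
__global__ void double_elements(float *data, size_t n) {
    size_t stride = (size_t)gridDim.x * blockDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        data[i] *= 2.0f;
}

// Ceiling division: elements per GPU so that all n elements are covered.
inline size_t chunk_size(size_t n, size_t parts) { return (n + parts - 1) / parts; }

int main() {
    int num_gpus = 0;
    CUDA_CHECK(cudaGetDeviceCount(&num_gpus));
    if (num_gpus == 0) { std::fprintf(stderr, "No CUDA devices found\n"); return 1; }

    const size_t n = (size_t)1 << 24;  // assumed problem size
    const size_t chunk = chunk_size(n, (size_t)num_gpus);

    // Pinned host memory is required for copies to be truly asynchronous.
    float *host = nullptr;
    CUDA_CHECK(cudaMallocHost((void **)&host, n * sizeof(float)));
    for (size_t i = 0; i < n; ++i) host[i] = 1.0f;

    std::vector<float *> dev(num_gpus, nullptr);
    std::vector<cudaStream_t> streams(num_gpus);

    // Enqueue copy-in, kernel, and copy-out on each GPU without blocking the host.
    for (int g = 0; g < num_gpus; ++g) {
        CUDA_CHECK(cudaSetDevice(g));
        CUDA_CHECK(cudaStreamCreate(&streams[g]));
        size_t lo  = std::min<size_t>((size_t)g * chunk, n);
        size_t len = std::min<size_t>(chunk, n - lo);
        if (len == 0) continue;  // more GPUs than chunks of work
        CUDA_CHECK(cudaMalloc((void **)&dev[g], len * sizeof(float)));
        CUDA_CHECK(cudaMemcpyAsync(dev[g], host + lo, len * sizeof(float),
                                   cudaMemcpyHostToDevice, streams[g]));
        double_elements<<<256, 256, 0, streams[g]>>>(dev[g], len);
        CUDA_CHECK(cudaGetLastError());
        CUDA_CHECK(cudaMemcpyAsync(host + lo, dev[g], len * sizeof(float),
                                   cudaMemcpyDeviceToHost, streams[g]));
    }

    // Wait for every GPU to finish, then release its resources.
    for (int g = 0; g < num_gpus; ++g) {
        CUDA_CHECK(cudaSetDevice(g));
        CUDA_CHECK(cudaStreamSynchronize(streams[g]));
        if (dev[g]) CUDA_CHECK(cudaFree(dev[g]));
        CUDA_CHECK(cudaStreamDestroy(streams[g]));
    }

    // Every element started at 1.0f and should now be 2.0f.
    bool ok = true;
    for (size_t i = 0; i < n; ++i) ok = ok && (host[i] == 2.0f);
    std::printf("%d GPU(s), result %s\n", num_gpus, ok ? "correct" : "WRONG");

    CUDA_CHECK(cudaFreeHost(host));
    return ok ? 0 : 1;
}
```

In this sketch each GPU uses a single stream, so the overlap happens between devices rather than within one. Splitting each GPU's chunk across several streams would, in principle, also let transfers and kernels overlap on the same device; a profiler such as Nsight Systems can confirm whether that overlap actually occurs.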

Prerequisites

Professional experience programming CUDA C/C++ applications, including the use of the nvcc compiler, kernel launches, grid-stride loops, host-to-device and device-to-host memory transfers, CUDA Streams, copy/compute overlap, and CUDA error handling.
Familiarity with the Linux command line.
Experience using Makefiles to compile C/C++ code.

Suggested Resources to Satisfy Prerequisites

Fundamentals of Accelerated Computing with CUDA C/C++.
Accelerating CUDA C++ Applications with Concurrent Streams.
Ubuntu Command Line for Beginners (sections 1 through 5).
Makefile Tutorial (through Simple Examples).

Tools, Libraries, and Frameworks Used

CUDA C++
nvcc
Nsight Systems

Duration: 4 hours