Fundamentals of Accelerated Computing with CUDA Python
(NFACWCP-OD)
This course explores how to use Numba, the just-in-time, type-specializing Python function compiler, to accelerate Python programs by running them on massively parallel NVIDIA GPUs. You’ll learn how to:
- Use Numba to compile CUDA kernels from NumPy universal functions (ufuncs).
- Use Numba to create and launch custom CUDA kernels.
- Apply key GPU memory management techniques.
Upon completion, you’ll be able to use Numba to compile and launch CUDA kernels to accelerate your Python applications on NVIDIA GPUs.
Prerequisites
- Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations.
- NumPy competency, including the use of ndarrays and ufuncs.
- No previous knowledge of CUDA programming is required.
Suggested Resources to Satisfy Prerequisites
Tools, Libraries, and Frameworks Used
- Numba