GPU Programming with OpenMP
Lawrence Meadows and Tim Mattson (Intel)
OpenMP 1.0 was released in 1997, when the primary concern was symmetric multiprocessors. Hardware has evolved since then: more complex memory hierarchies forced us to embrace NUMA machines, and the vector units on CPUs led to SIMD constructs. Current hardware trends bring co-processors such as GPUs into the fold. A modern platform is often a heterogeneous system with CPU cores, GPU cores, and eventually other specialized accelerators. OpenMP has responded by adding directives that map code and data onto a device. We refer to this family of directives as the target directives.
In this tutorial, we will explore these directives as they apply to programming GPUs. We will very briefly review the fundamentals of OpenMP and then move directly to the target directives and their use in complex application programs. We will run the tutorial in what we call “demo mode”, in which we write code together to explore OpenMP. In addition to using OpenMP to write programs, we will also explore how the technology works; that is, we will look under the hood to understand how the OpenMP compiler and runtime system work together to map code onto a GPU.
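To give a flavor of what mapping code and data onto a device looks like, here is a minimal sketch in C. It is not taken from the tutorial material: the function name `vadd` and its parameters are hypothetical, chosen only for illustration. The sketch assumes a compiler with OpenMP offload support (OpenMP 4.5 or later); without such support the pragma is ignored and the loop simply runs on the host.

```c
#include <stddef.h>

/* Hypothetical example (not from the tutorial): element-wise c = a + b.
 * With an offload-capable OpenMP compiler, the loop is a candidate for
 * execution on the default device, typically a GPU. */
void vadd(const float *a, const float *b, float *c, int n)
{
    /* target               : run the following region on the device
     * map(to: ...)         : copy the inputs a and b to the device
     * map(from: ...)       : copy the result c back to the host
     * teams distribute
     * parallel for         : spread the loop iterations across teams
     *                        of threads on the device */
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

A caller passes ordinary host arrays, for example `vadd(a, b, c, n)`. The `map` clauses describe the data movement, so the calling code contains no explicit device memory management. If no device is available at run time, an OpenMP implementation by default falls back to executing the region on the host.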