High Performance Parallel Programming
This course looks at parallel programming models, efficient programming methodologies and performance tools with the objective of developing highly efficient parallel programs.
The course consists of a set of lectures and laboratory sessions.
The lectures start with an overview of parallel computer architectures and parallel programming models and paradigms. An important part of the discussion are mechanisms for synchronization and data exchange. Next, performance analysis of parallel programs is covered.
The course proceeds with a discussion of tools and techniques for developing parallel programs in shared address spaces. This section covers popular programming environments such as pthreads and OpenMP.
Next the course discusses the development of parallel programs for distributed address space. The focus in this part is on the Message Passing Interface (MPI). Finally, we discuss programming approaches for executing applications on accelerators such as GPUs. This part introduces the CUDA (Compute Unified Device Architecture) programming environment.
The lectures are complemented with a set of laboratory sessions in which participants explore the topics introduced in the lectures. During the lab sessions, participants parallelize sample programs over a variety of parallel architectures, and use performance analysis tools to detect and remove bottlenecks in the parallel implementations of the programs.