Guest lecture by Cosmin Oancea on "Demonstrating Locality of Reference on Multi-Core CPU and GPU"

On Wednesday we will have a guest lecture by Cosmin Oancea, associate professor at the University of Copenhagen, on

Demonstrating Locality of Reference on Multi-Core CPU and GPU

As the title indicates, the main goal of this lecture is to demonstrate several "simple" techniques for optimizing locality of reference on two different hardware platforms: multi-core CPUs and general-purpose graphics processing units (GPUs).

To this end, we will (i) briefly survey the key design ideas that differentiate GPUs from multi-core CPUs, (ii) introduce two parallel programming models, OpenMP and CUDA, aimed at multi-core CPU and GPU execution, respectively, and (iii) present five case studies that demonstrate techniques for optimizing temporal and spatial locality; a generic illustration of this kind of optimization is sketched below.
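
The lecture slides and exercises are the authoritative material; purely as a generic illustration of the kind of locality optimization such case studies target (not code taken from the lecture), the CUDA sketch below contrasts a naive matrix transpose, whose global-memory writes are strided, with a tiled version that stages each block in shared memory so that both reads and writes are coalesced.

#include <cstdio>
#include <cuda_runtime.h>

#define TILE 32

// Naive transpose: the reads from `in` are coalesced, but the writes to
// `out` are strided by n floats, so spatial locality on the write side is poor.
__global__ void transposeNaive(const float *in, float *out, int n) {
    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < n && y < n)
        out[x * n + y] = in[y * n + x];
}

// Tiled transpose: a TILE x TILE block is staged in shared memory, so both
// the global-memory reads and the writes are coalesced; the "+1" padding
// avoids shared-memory bank conflicts.
__global__ void transposeTiled(const float *in, float *out, int n) {
    __shared__ float tile[TILE][TILE + 1];

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < n && y < n)
        tile[threadIdx.y][threadIdx.x] = in[y * n + x];
    __syncthreads();

    // Swap the block indices so the write is coalesced as well.
    x = blockIdx.y * TILE + threadIdx.x;
    y = blockIdx.x * TILE + threadIdx.y;
    if (x < n && y < n)
        out[y * n + x] = tile[threadIdx.x][threadIdx.y];
}

int main(void) {
    const int n = 1024;
    const size_t bytes = (size_t)n * n * sizeof(float);
    float *in, *out_naive, *out_tiled;
    cudaMallocManaged(&in, bytes);
    cudaMallocManaged(&out_naive, bytes);
    cudaMallocManaged(&out_tiled, bytes);
    for (int i = 0; i < n * n; ++i) in[i] = (float)i;

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    transposeNaive<<<grid, block>>>(in, out_naive, n);
    transposeTiled<<<grid, block>>>(in, out_tiled, n);
    cudaDeviceSynchronize();

    // Both kernels compute the same result; they differ only in memory behaviour.
    int errors = 0;
    for (int i = 0; i < n * n; ++i)
        if (out_naive[i] != out_tiled[i]) ++errors;
    printf("mismatches: %d\n", errors);

    cudaFree(in); cudaFree(out_naive); cudaFree(out_tiled);
    return 0;
}

A file like this can be compiled with, e.g., nvcc -O3 transpose.cu; the tiled version typically runs noticeably faster than the naive one for large n, with the exact gain depending on the device.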

The lecture is intended to present the high-level rationale and key ideas used to optimize locality in the five cases, to provide a road map of the optimization recipes, and to demonstrate the impact of these optimizations in practice. For those interested, the theory can be put into practice (in your own time) by solving four fill-in-the-blanks CUDA exercises. These require implementing key parts of the code according to the instructions (see the lecture slides and GitHub repo). They are meant to make CUDA easy to digest by pattern-matching against existing code, and to provide (hopefully) the satisfying experience of seeing your code validate and achieve significant performance gains.


 
