
Bryce Adelstein Lelbach

Principal Engineer at NVIDIA and Lead of the CUDA C++ Core Libraries

Bryce Adelstein Lelbach has spent over a decade developing programming languages, compilers, and libraries. He is passionate about parallel programming and strives to make it more accessible for everyone. Bryce is a Principal Engineer at NVIDIA, where he founded the Core C++ Compute Libraries team and now leads the Vanguard Programming group that drives NVIDIA's roadmap for programming languages, compilers, and core libraries. He is a leader of the systems programming language community, having served as chair of the C++ Library Evolution group and of the US programming language standards committee. He has been an organizer and program chair for many conferences over the years. On the C++ committee, he has worked on concurrency primitives, parallel algorithms, senders, and multidimensional arrays. He previously worked at Lawrence Berkeley National Laboratory and Louisiana State University. He is one of the founding developers of the HPX parallel runtime system. Outside of work, Bryce is passionate about airplanes and watches. He lives in Midtown Manhattan with his girlfriend and dog.

Topic

GPU Programming with Tiling

Parallel programming can be intimidating, but it doesn't need to be! There's a new paradigm for parallel programming that's newcomer-friendly, highly productive, and performant: tile-based programming models. In this example-driven talk, we'll introduce you to tile-based programming in Python, C++, and Rust. We'll present [cuTile](https://github.com/NVIDIA/cutile-python), NVIDIA's new tile programming stack, and [Tile IR](https://github.com/NVIDIA/cuda-tile), the new compiler stack on which it is built. You'll learn all about recently announced features of CUDA Tile, including multi-GPU communication, interoperability with traditional CUDA SIMT, and support for more diverse kernels such as convolutions and stencils.

We'll compare and contrast tile-based models with traditional parallel programming models. You'll see examples from a variety of domains, including HPC stencils, a sparse matrix-vector (SpMV) product and conjugate gradient (CG) solver, and AI models from [TileGym](https://github.com/NVIDIA/TileGym). Tile programming has its roots in HPC libraries such as [NWChem's TCE](https://nwchemgit.github.io/TCE.html), [BLIS](https://github.com/flame/blis), and [ATLAS](https://math-atlas.sourceforge.net/). In recent years, this paradigm has grown in popularity for GPU programming in languages such as [Triton](https://openai.com/index/triton/), [JAX/Pallas](https://docs.jax.dev/en/latest/pallas/index.html), and [Warp](https://nvidia.github.io/warp/modules/tiles.html).

In this session, you'll:

- Learn the best practices for writing tile parallel applications for GPUs.
- Gain insight into the performance of tile code and how it actually gets executed.
- Discover how to reason about and debug tile applications.
- Understand the differences between tile and traditional parallel programming and when each paradigm should be used.
- See how tile programming makes your software portable in light of recent hardware trends.
By the end of the session, you'll understand how tile programming enables more intuitive, portable, and efficient development of high-performance, data-parallel applications, for HPC, data science, and machine learning.
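To give a flavor of the paradigm, here is a minimal NumPy sketch of the tile-based style: rather than writing per-element (SIMT-style) code, each logical "program" computes one whole tile of the output with array-level operations. This is an illustration of the idea only, not cuTile's or Triton's actual API; the `TILE` size and function names are hypothetical choices for the example.

```python
import numpy as np

TILE = 4  # illustrative tile edge length; real stacks tune this per GPU

def tiled_matmul(A, B):
    """Blocked matrix multiply: each (i, j) output tile is the sum of
    tile-level products A[i, k] @ B[k, j] over the shared dimension k."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2 and M % TILE == 0 and N % TILE == 0 and K % TILE == 0
    C = np.zeros((M, N), dtype=A.dtype)
    # Each (i, j) iteration plays the role of one tile "program";
    # on a GPU these would run in parallel across the grid.
    for i in range(0, M, TILE):
        for j in range(0, N, TILE):
            acc = np.zeros((TILE, TILE), dtype=A.dtype)
            for k in range(0, K, TILE):
                # Whole-tile loads and a tile-level matmul,
                # instead of per-element loads and scalar arithmetic.
                acc += A[i:i+TILE, k:k+TILE] @ B[k:k+TILE, j:j+TILE]
            C[i:i+TILE, j:j+TILE] = acc
    return C
```

The point of the style is that the programmer reasons about whole tiles and tile-level operations, leaving the per-element mapping (vectorization, memory coalescing, tensor-core use) to the compiler, which is what stacks like cuTile and Tile IR automate on real hardware.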

© boolan.com Boolan. All rights reserved.
