May 10-12, 2023
Vancouver, British Columbia, Canada + Virtual
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for Open Source Summit North America 2023 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in Pacific Daylight Time (UTC-7). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date."

IMPORTANT NOTE: Timing of sessions and room locations are subject to change.

Monday, May 1 • 7:00am - 7:40am
(Virtual) CUTLASS: A CUDA C++ Template Library for Accelerating Deep Learning Computations - Aniket Shivam & Vijay Thakkar, NVIDIA

At the core of machine and deep learning lie different flavors of linear algebra computations, such as matrix multiplications and convolutions. Over the last decade, GPU computing solutions from NVIDIA have accelerated AI compute by an overall factor of 50X to 200X through architectural innovations. While this has helped applications like ChatGPT and GitHub Copilot become a reality, developers must still learn to optimally utilize and customize GPU compute for their applications. In this talk, we present CUTLASS, an open-source, header-only CUDA C++ template library that has been helping programmers since 2017 implement high-performance CUDA kernels across generations of NVIDIA GPU architectures. CUTLASS contains optimized, production-quality implementations of AI computations and has been the go-to source for Tensor Core programming details. It provides modular abstractions and building blocks for CUDA programmers who want to write their own CUDA C++ kernels for deep learning computations such as matrix multiplication and convolution. We expect audience members to gain actionable knowledge and insights about Tensor Core programming and about developing custom CUDA C++ kernels with CUTLASS that push the limits of performance on NVIDIA GPUs.
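
For context, the sketch below shows roughly how CUTLASS's device-level GEMM interface is used, modeled on the library's public basic_gemm sample. Template parameters, tile shapes, and argument layouts vary between CUTLASS versions, so treat it as an illustrative sketch rather than the exact code covered in the talk.

  // Minimal single-precision GEMM (D = alpha * A * B + beta * C) using the
  // CUTLASS device-level API, in the spirit of the basic_gemm sample.
  #include <cutlass/gemm/device/gemm.h>

  using Gemm = cutlass::gemm::device::Gemm<
      float, cutlass::layout::ColumnMajor,   // element and layout of A
      float, cutlass::layout::ColumnMajor,   // element and layout of B
      float, cutlass::layout::ColumnMajor>;  // element and layout of C/D

  cutlass::Status run_sgemm(int M, int N, int K,
                            float alpha, float const *A, int lda,
                            float const *B, int ldb,
                            float beta, float *C, int ldc) {
    Gemm gemm_op;
    // Arguments: problem size, tensor refs for A, B, C (source) and
    // D (destination), followed by the epilogue scalars {alpha, beta}.
    return gemm_op({{M, N, K},
                    {A, lda},
                    {B, ldb},
                    {C, ldc},
                    {C, ldc},
                    {alpha, beta}});
  }

Tensor Core kernels follow the same pattern, with half-precision or other element types and additional template parameters selecting the architecture and operator class; the kernel itself is composed from the thread-block-, warp-, and instruction-level building blocks the talk describes.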

Speakers
Aniket Shivam

Deep Learning Library Engineer, NVIDIA
I am currently working as a Deep Learning Library Engineer at NVIDIA. My work focuses on the implementation and optimization of math and deep learning libraries such as CUTLASS. I graduated with a PhD in Computer Science from the University of California, Irvine (UCI). My research...
Vijay Thakkar

Compute Architect, NVIDIA
Currently, I work at NVIDIA full time while I finish my PhD. At NVIDIA, I collaborate closely with Cris Cecka from NVR to lead the design of next-generation linear algebra libraries, namely CUTLASS 3.0, a project I have been working on since its inception. I also work on the exposure...


Monday May 1, 2023 7:00am - 7:40am PDT
Virtual