GitHub - mryab/efficient-dl-systems: Efficient Deep Learning Systems course materials (HSE, YSDA)

Efficient Deep Learning Systems

This repository contains materials for the Efficient Deep Learning Systems course, taught at the Faculty of Computer Science of HSE University and Yandex School of Data Analysis.

This branch corresponds to the 2026 iteration of the course. For the full materials from previous years, see the "Past versions" section.

Syllabus

  • Week 1: Introduction
    • Lecture: Course overview and logistics. Core concepts of GPU architecture and the CUDA API.
    • Seminar: CUDA operations in PyTorch. Introduction to benchmarking.
  • Week 2: General training optimizations, profiling deep learning code
  • Week 3: Data-parallel training and All-Reduce
  • Week 4: Methods for training large models
  • Week 5: Sharded data-parallel training, distributed training optimizations
  • Week 6: Deep learning performance from first principles
  • Week 7: Basics of web service deployment
  • Week 8: Systems optimizations for inference
  • Week 9: Algorithmic optimizations for inference
  • Week 10: Guest lecture

Grading

There will be several home assignments (spread over multiple weeks) on the following topics:

  • Training pipelines and code profiling
  • Distributed and memory-efficient training
  • Deploying and optimizing models for production

The final grade is a weighted sum of per-assignment grades. Please refer to the course page of your institution for details.
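As a rough illustration of the weighted sum described above, here is a minimal Python sketch. The function name `final_grade`, the number of assignments, the 10-point scale, and the weights are all hypothetical and chosen only for the example; the actual weights are set by your institution.

```python
def final_grade(grades, weights):
    """Return the weighted sum of per-assignment grades."""
    if len(grades) != len(weights):
        raise ValueError("grades and weights must have the same length")
    return sum(w * g for w, g in zip(weights, grades))


# Hypothetical example: three assignments on an assumed 10-point scale,
# with made-up weights that sum to 1. These are NOT the course's real weights.
example_grades = [8.0, 6.0, 10.0]
example_weights = [0.3, 0.3, 0.4]
print(final_grade(example_grades, example_weights))
```

If the weights sum to 1, the result stays on the same scale as the individual assignment grades.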

Staff

Past versions