Deep Learning

These notes follow Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola's Dive into Deep Learning and adopt the book's central style: concepts, mathematics, and runnable code presented together. The path starts with tensors, data preprocessing, linear algebra, calculus, probability, and automatic differentiation, then builds complete training loops for regression and classification before moving on to modern architectures.

The later pages cover the main deep learning families: multilayer perceptrons, convolutional networks, recurrent networks, attention, transformers, NLP applications, computer vision systems, recommender systems, GANs, reinforcement learning, Gaussian processes, and hyperparameter optimization. Code examples use PyTorch for portability. For classical context, compare these notes with machine learning; for prerequisites, see linear algebra and probability.
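As a taste of that style, the following minimal sketch shows the three ingredients the notes pair on every page: a tensor, a simple function of it, and the gradient computed by automatic differentiation. It assumes only that PyTorch is installed; the specific function `y = 2 * x·x` is an illustrative choice, not a fixed example from these notes.

```python
import torch

# A tensor with gradient tracking enabled: [0., 1., 2., 3.]
x = torch.arange(4.0, requires_grad=True)

# A scalar function of x: y = 2 * sum(x_i^2)
y = 2 * torch.dot(x, x)

# Backpropagation populates x.grad with dy/dx = 4x
y.backward()
print(x.grad)  # tensor([ 0.,  4.,  8., 12.])
```

Every later topic in the outline below, from linear regression to transformers, follows this same pattern: define tensors, express a model and loss, and let autograd supply the gradients for training.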

  1. Tensors and Data Preprocessing
  2. Math for Deep Learning
  3. Linear Regression and Training Loops
  4. Softmax Classification and Generalization
  5. Multilayer Perceptrons and Regularization
  6. PyTorch Builders' Guide
  7. Convolutional Neural Networks
  8. Modern CNNs
  9. Sequence Modeling and RNNs
  10. Gated RNNs and Sequence-to-Sequence
  11. Attention and Transformers
  12. Pretrained Transformers and BERT
  13. Optimization Algorithms
  14. Computational Performance
  15. Computer Vision Applications
  16. NLP Pretraining and Applications
  17. Generative Adversarial Networks
  18. Recommender Systems
  19. Reinforcement Learning and Bayesian Tuning