Efficient Neural Networks
Update Oct 2024
UC Berkeley has an AI systems course, taught in various versions by Ion Stoica, Joseph Gonzalez, and Matei Zaharia (what a cast 😲). It’s basically a paper reading course that is updated each iteration. I highly recommend starting there - they do a much better job than I do.
Disclaimer
There are better resources than this if you are trying to learn more about efficient neural networks.
- https://github.com/AIoT-MLSys-Lab/Efficient-LLMs-Survey (covers more than just LLMs)
- https://mlsys.stanford.edu/cs528/
- https://pll.harvard.edu/course/fundamentals-tinyml and https://tinyml.seas.harvard.edu/
- Matei Zaharia’s AI-Systems Course
- https://github.com/mosharaf/eecs598/tree/w24-genai and https://github.com/mosharaf/eecs598/tree/w21-ai
- https://sites.google.com/view/efficientml/home?authuser=0
- https://hanlab.mit.edu/courses/2023-fall-65940
- https://dlsyscourse.org/
- https://www.fast.ai/ (meh, but has some okay content)
Papers
The idea is to log, for each category, a few classical papers alongside some newer ones.
I aim to keep updating this list as I read more.
- Quantization and Dynamic Quantization methods (see the binarization sketch after this list)
- BinaryConnect: Training Deep Neural Networks with binary weights during propagations and Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1
- Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution
- DNQ: Dynamic Network Quantization
- CPT: Efficient Deep Neural Network Training via Cyclic Precision
- Pruning of neurons and bits
- Knowledge Distillation (see the distillation-loss sketch after this list)
- Federated Learning (see the FedAvg sketch after this list)
- Federated Learning: Strategies for Improving Communication Efficiency and Communication-Efficient Learning of Deep Networks from Decentralized Data
- Decentralized Federated Averaging
- A Field Guide to Federated Optimization
- Fed-ensemble: Improving Generalization through Model Ensembling in Federated Learning
- Auxo: Efficient Federated Learning via Scalable Client Clustering
- Matrix Operation Optimization (Algorithmic and Hardware; see the block-sparse matmul sketch after this list)
- Improving the speed of neural networks on CPUs
- Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models and Monarch: Expressive Structured Matrices for Efficient and Accurate Training
- Implementing block-sparse matrix multiplication kernels using Triton
- Sputnik (not a paper, but a super cool library to reimplement)
- Efficient Block Approximate Matrix Multiplication and Fixed-sparsity matrix approximation from matrix-vector products
- Model Compression
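Sketches
To make a few of these categories concrete, here are some minimal, self-contained sketches. First, a BinaryConnect-style layer in PyTorch: weights are snapped to ±1 on the forward pass while gradients update the underlying real-valued weights via a straight-through estimator. This is a rough sketch of the core idea, not the paper’s actual code; all names are mine.
```python
import torch
import torch.nn as nn

class BinaryLinear(nn.Module):
    """Linear layer whose forward pass uses sign(W) in {-1, +1}, while
    gradients flow to the underlying real-valued W (straight-through)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.1)

    def forward(self, x):
        w_bin = torch.sign(self.weight)
        # Forward uses w_bin; backward treats sign() as the identity.
        w = self.weight + (w_bin - self.weight).detach()
        return x @ w.t()

# Toy usage: one SGD step on random data.
layer = BinaryLinear(8, 4)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, target = torch.randn(16, 8), torch.randn(16, 4)
loss = ((layer(x) - target) ** 2).mean()
loss.backward()
opt.step()
with torch.no_grad():
    layer.weight.clamp_(-1, 1)  # BinaryConnect keeps the real weights in [-1, 1]
```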
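For Knowledge Distillation, a sketch of the classic softened-softmax loss (Hinton et al.). T and alpha are illustrative hyperparameters, not values from any particular paper.
```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Blend the teacher's soft targets with the hard-label loss.
    The T*T factor keeps gradient magnitudes comparable across temperatures."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits and labels.
s = torch.randn(8, 10, requires_grad=True)
t = torch.randn(8, 10)
loss = distillation_loss(s, t, torch.randint(0, 10, (8,)))
loss.backward()
```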
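For Federated Learning, a minimal FedAvg round in NumPy (in the spirit of McMahan et al.): each client runs local SGD, and the server averages the returned weights, weighted by local dataset size. The linear model and client data are toy stand-ins.
```python
import numpy as np

def client_update(global_w, X, y, lr=0.1, epochs=5):
    """Local SGD on a least-squares objective; returns updated weights."""
    w = global_w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fedavg_round(global_w, clients):
    """One communication round: dataset-size-weighted average of client updates."""
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    updates = [client_update(global_w, X, y) for X, y in clients]
    return np.average(updates, axis=0, weights=sizes / sizes.sum())

rng = np.random.default_rng(0)
true_w = rng.normal(size=3)
clients = []
for n in (50, 120, 80):  # heterogeneous client dataset sizes
    X = rng.normal(size=(n, 3))
    clients.append((X, X @ true_w + 0.01 * rng.normal(size=n)))

w = np.zeros(3)
for _ in range(20):
    w = fedavg_round(w, clients)
print("recovered:", w, "true:", true_w)
```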
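And for the matrix-operation bucket, a NumPy correctness reference for block-sparse matmul with a BSR-like layout (block row pointers plus block column indices). This mirrors what a block-sparse kernel (e.g., in Triton or Sputnik) computes; the layout and names here are mine, and a real kernel would tile and parallelize this rather than loop in Python.
```python
import numpy as np

def block_sparse_matmul(blocks, block_cols, row_ptr, B, block=16):
    """Compute A @ B where A is stored as dense nonzero blocks.

    blocks:     (nnz_blocks, block, block) nonzero blocks of A
    block_cols: block-column index of each nonzero block
    row_ptr:    CSR-style offsets into `blocks` for each block-row
    """
    n_block_rows = len(row_ptr) - 1
    out = np.zeros((n_block_rows * block, B.shape[1]))
    for i in range(n_block_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = block_cols[k]
            out[i*block:(i+1)*block] += blocks[k] @ B[j*block:(j+1)*block]
    return out

# Sanity check against a dense matmul on a random block pattern.
rng = np.random.default_rng(0)
block, nbr, nbc = 16, 4, 4
mask = rng.random((nbr, nbc)) < 0.5  # ~50% of blocks are nonzero
blocks, block_cols, row_ptr = [], [], [0]
A = np.zeros((nbr * block, nbc * block))
for i in range(nbr):
    for j in range(nbc):
        if mask[i, j]:
            b = rng.normal(size=(block, block))
            blocks.append(b)
            block_cols.append(j)
            A[i*block:(i+1)*block, j*block:(j+1)*block] = b
    row_ptr.append(len(blocks))
B = rng.normal(size=(nbc * block, 8))
assert np.allclose(block_sparse_matmul(np.array(blocks), block_cols, row_ptr, B, block), A @ B)
```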