Build a ChatGPT-like LLM in PyTorch from the ground up, step by step
Collaborative speedrun to train a 124M-parameter GPT-2 model in under 90 seconds on 8xH100 GPUs