Chinchilla
DeepMind's 70B-parameter LLM: compute-optimal scaling that outperforms much larger models such as GPT-3 and Gopher by training on 4x more data.
Chinchilla is DeepMind's 70-billion-parameter large language model (LLM), introduced in 2022 to redefine LLM scaling laws (Source: *Training Compute-Optimal Large Language Models*). It challenges the 'bigger is better' trend: using the same compute budget as the 280B-parameter Gopher, Chinchilla achieves superior performance by training a smaller model on 1.4 trillion tokens, four times more data. The compute-optimal model reaches an average accuracy of 67.5% on the MMLU benchmark, and because it is a quarter of Gopher's size, it also drastically cuts inference and fine-tuning costs: a clear win for efficiency.
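The scaling recipe behind this result can be made concrete with a small worked sketch. It assumes the paper's approximation that training compute is roughly C ≈ 6·N·D FLOPs for a model with N parameters trained on D tokens, together with the paper's finding that parameters and tokens should scale in roughly equal proportion, which works out to about 20 training tokens per parameter. The function names below are illustrative, not from any released code.

```python
# Sketch of the Chinchilla compute-optimal scaling heuristic.
# Assumes C ~= 6 * N * D (training FLOPs for N parameters, D tokens)
# and the ~20-tokens-per-parameter rule of thumb implied by the
# paper's roughly equal scaling exponents for N and D.

def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute: C ~= 6 * N * D."""
    return 6.0 * params * tokens

def compute_optimal_split(compute_budget: float, tokens_per_param: float = 20.0):
    """Given a FLOP budget C, return (N_opt, D_opt) with D = k * N.

    Solving C = 6 * N * (k * N) for N gives N = sqrt(C / (6 * k)).
    """
    n_opt = (compute_budget / (6.0 * tokens_per_param)) ** 0.5
    return n_opt, tokens_per_param * n_opt

# Chinchilla itself: 70B parameters trained on 1.4T tokens.
c = training_flops(70e9, 1.4e12)  # ~5.9e23 FLOPs
n, d = compute_optimal_split(c)
print(f"budget ~ {c:.2e} FLOPs -> N_opt ~ {n:.2e} params, D_opt ~ {d:.2e} tokens")
```

Running the sketch recovers Chinchilla's own configuration: a budget of about 5.9e23 FLOPs splits into roughly 7e10 parameters and 1.4e12 tokens, whereas spending the same budget on a 280B-parameter model (as with Gopher) would leave far fewer tokens for training.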