Aussie AI
Scaling Laws in Generative AI
Last Updated 7 December, 2024
by David Spuler, Ph.D.
Research on Scaling Laws
- Sotiris Anagnostidis, Gregor Bachmann, Imanol Schlag, Thomas Hofmann, 2024, Navigating Scaling Laws: Compute Optimality in Adaptive Model Training, https://openreview.net/pdf?id=3KxPo62PYn (Evaluates model properties such as width on vision Transformers from the perspective of scaling laws.)
- Jared Kaplan, Sam McCandlish, Tom Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei, 2020, Scaling Laws for Neural Language Models, CoRR abs/2001.08361, https://arxiv.org/abs/2001.08361
- Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre, 2022, Training Compute-Optimal Large Language Models, CoRR abs/2203.15556, doi: 10.48550/arXiv.2203.15556, https://arxiv.org/abs/2203.15556
- Aidan Clark, Diego de las Casas, Aurelia Guy, Arthur Mensch, Michela Paganini, Jordan Hoffmann, Bogdan Damoc, Blake Hechtman, Trevor Cai, Sebastian Borgeaud, George van den Driessche, Eliza Rutherford, Tom Hennigan, Matthew Johnson, Katie Millican, Albin Cassirer, Chris Jones, Elena Buchatskaya, David Budden, Laurent Sifre, Simon Osindero, Oriol Vinyals, Jack Rae, Erich Elsen, Koray Kavukcuoglu, Karen Simonyan, 9 Feb 2022 (v2), Unified Scaling Laws for Routed Language Models, https://arxiv.org/abs/2202.01169
- Benj Edwards, 16 July, 2024, Microsoft CTO Kevin Scott thinks LLM “scaling laws” will hold despite criticism, https://arstechnica.com/information-technology/2024/07/microsoft-cto-defies-critics-ai-progress-not-slowing-down-its-just-warming-up/
- Nandu Anilal, July 16, 2024, Infrastructure after AI Scaling: Why AI scaling won't last forever (and what comes next) https://nandu.substack.com/p/infrastructure-after-ai-scaling
- Chaofan Tao, Qian Liu, Longxu Dou, Niklas Muennighoff, Zhongwei Wan, Ping Luo, Min Lin, Ngai Wong, 18 Jul 2024, Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies, https://arxiv.org/abs/2407.13623
- Tianyu Ding, Tianyi Chen, Haidong Zhu, Jiachen Jiang, Yiqi Zhong, Jinxin Zhou, Guangzhi Wang, Zhihui Zhu, Ilya Zharkov, Luming Liang, 18 Apr 2024 (v2), The Efficiency Spectrum of Large Language Models: An Algorithmic Survey, https://arxiv.org/abs/2312.00678
- Tiernan Ray, July 24, 2024, 3 ways Meta's Llama 3.1 is an advance for Gen AI, https://www.zdnet.com/article/3-ways-metas-llama-3-1-is-an-advance-for-gen-ai/
- Bradley Brown, Jordan Juravsky, Ryan Ehrlich, Ronald Clark, Quoc V. Le, Christopher Ré, Azalia Mirhoseini, 31 Jul 2024, Large Language Monkeys: Scaling Inference Compute with Repeated Sampling, https://arxiv.org/abs/2407.21787 (Generates multiple answers via repeated inference queries, then uses a verifier to choose the best one; this is shown to greatly increase overall accuracy.)
- Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang, 1 Aug 2024, An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models, https://arxiv.org/abs/2408.00724
- Pablo Villalobos, Anson Ho, Jaime Sevilla, Tamay Besiroglu, Lennart Heim, Marius Hobbhahn, Jun 06, 2024, Will We Run Out of Data? Limits of LLM Scaling Based on Human-Generated Data, Epoch AI, https://epochai.org/blog/will-we-run-out-of-data-limits-of-llm-scaling-based-on-human-generated-data
- Anson Ho, Tamay Besiroglu, Ege Erdil, David Owen, Robi Rahman, Zifan Carl Guo, David Atkinson, Neil Thompson, Jaime Sevilla, 9 Mar 2024, Algorithmic progress in language models, https://arxiv.org/abs/2403.05812
- Nathan Lambert, Sep 05, 2024, OpenAI’s Strawberry, LM self-talk, inference scaling laws, and spending more on inference: Whether or not scaling works, we should spend more on inference, https://www.interconnects.ai/p/openai-strawberry-and-inference-scaling-laws
- Ethan Mollick, Sep 16, 2024, Scaling: The State of Play in AI, https://www.oneusefulthing.org/p/scaling-the-state-of-play-in-ai
- Chuhan Wu, Ruiming Tang, 17 September 2024, Towards a Universal Scaling Law of LLM Training and Inference, DOI: 10.14293/PR2199.001074.v1, https://www.scienceopen.com/document_file/b3ff92f8-76a6-42ca-94d2-48693442bf98/ScienceOpenPreprint/Unified_law_arxiv.pdf
- Elias Frantar, September, 2024, Compressing Large Neural Networks: Algorithms, Systems and Scaling Laws, Ph.D. Thesis, Graduate School, Institute of Science and Technology Austria, https://research-explorer.ista.ac.at/download/17485/17880/frantar_thesis_final.pdf
- Akash Bajwa, Oct 07, 2024, Inference Time Scaling Laws: AI Megacycle Of System 1 And System 2 Applications, https://akashbajwa.substack.com/p/inference-time-scaling-laws
- Tanay Jaipuria, Oct 29, 2024, OpenAI's o-1 and inference-time scaling laws, https://www.tanayj.com/p/openais-o-1-and-inference-time-scaling
- Tanishq Kumar, Zachary Ankner, Benjamin F. Spector, Blake Bordelon, Niklas Muennighoff, Mansheej Paul, Cengiz Pehlevan, Christopher Ré, Aditi Raghunathan, 7 Nov 2024, Scaling Laws for Precision, https://arxiv.org/abs/2411.04330
- Krystal Hu and Anna Tong, November 15, 2024, OpenAI and others seek new path to smarter AI as current methods hit limitations, https://www.reuters.com/technology/artificial-intelligence/openai-rivals-seek-new-path-smarter-ai-current-methods-hit-limitations-2024-11-11/
- Bo Chen, Xiaoyu Li, Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, 12 Nov 2024, Circuit Complexity Bounds for RoPE-based Transformer Architecture, https://arxiv.org/abs/2411.07602
- Xu Ouyang, Tao Ge, Thomas Hartvigsen, Zhisong Zhang, Haitao Mi, Dong Yu, 27 Nov 2024 (v2), Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens, https://arxiv.org/abs/2411.17691
- Gary Marcus, Nov 25, 2024, A new AI scaling law shell game? Scaling laws ain’t what they used to be, https://garymarcus.substack.com/p/a-new-ai-scaling-law-shell-game
- Gary Grossman, Edelman, December 1, 2024, The end of AI scaling may not be nigh: Here’s what’s next, https://venturebeat.com/ai/the-end-of-ai-scaling-may-not-be-nigh-heres-whats-next/
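The repeated-sampling idea noted in the Brown et al. (2024) entry above, generating several answers and letting a verifier pick the best one, can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: `best_of_n`, `coverage`, `toy_sampler`, and `toy_verifier` are hypothetical names, the sampler stands in for an LLM call, and the coverage formula assumes that samples are independent with the same per-sample success probability.

```python
import random
from typing import Callable


def best_of_n(sample: Callable[[], str],
              verify: Callable[[str], float],
              n: int) -> str:
    """Draw n candidate answers and return the one the verifier scores highest."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates = [sample() for _ in range(n)]
    return max(candidates, key=verify)


def coverage(p_single: float, n: int) -> float:
    """Chance that at least one of n samples is correct.

    Assumption: samples are independent and each is correct with
    probability p_single (a simplification for illustration).
    """
    return 1.0 - (1.0 - p_single) ** n


# Hypothetical stand-ins for a model and a verifier (question: "2 + 2 = ?").
_rng = random.Random(0)


def toy_sampler() -> str:
    """Pretend model: returns a digit, only sometimes the right one."""
    return str(_rng.choice([3, 4, 5, 6, 7]))


def toy_verifier(answer: str) -> float:
    """Pretend verifier: scores 1.0 for the correct answer, 0.0 otherwise."""
    return 1.0 if answer == "4" else 0.0


if __name__ == "__main__":
    print("best of 1: ", best_of_n(toy_sampler, toy_verifier, 1))
    print("best of 20:", best_of_n(toy_sampler, toy_verifier, 20))
    print("coverage at p=0.2, n=20:", coverage(0.2, 20))
```

Under these assumptions, the chance of at least one correct sample rises quickly with the number of samples, but the final accuracy still depends on how reliably the verifier recognizes a correct answer.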