Microsoft CTO Kevin Scott thinks LLM “scaling laws” will hold despite criticism
🌈 Abstract
The article examines the debate over progress in large language models (LLMs) and the "scaling laws" which hold that AI capabilities keep improving as model size, training data, and computational power are scaled up. It contrasts the view of Microsoft CTO Kevin Scott, who believes scaling laws will continue to drive AI progress, with the perception among some critics that progress has plateaued.
🙋 Q&A
[01] Microsoft CTO's Perspective on LLM Progress
1. What is Microsoft CTO Kevin Scott's view on the continued progress of LLMs?
- Kevin Scott believes that "scaling laws" will continue to drive progress in AI, despite skepticism from those in the field who believe progress has leveled out.
- He argues that there is still an "exponential" to be ridden in scaling up LLMs, and that each new generation of models will show improvements, particularly in areas where current models struggle (an illustrative sketch of the power-law form behind such scaling claims follows this answer).
- Scott acknowledges that data points are infrequent, since new frontier models often take years to develop, but expresses confidence that future iterations will be cheaper, less fragile, and more capable.
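
For context on what a "scaling law" formally claims, below is a minimal sketch of the power-law form such papers fit, using the constants published in Hoffmann et al.'s 2022 Chinchilla paper. The article quotes no equation, so the formula and numbers here are background assumptions for illustration, not something Scott or the article states.

```python
# Minimal sketch of a Chinchilla-style scaling law (Hoffmann et al., 2022):
#   predicted loss L(N, D) = E + A / N**alpha + B / D**beta
# where N = parameter count and D = training tokens. The constants are the
# paper's published fits; the article itself gives no formula, so this is
# illustrative only.

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7    # irreducible loss and fitted coefficients
    alpha, beta = 0.34, 0.28        # fitted exponents for parameters and data
    return E + A / n_params**alpha + B / n_tokens**beta

# Each doubling of parameters and data still lowers predicted loss, but by a
# shrinking absolute amount: the curve both camps in the debate point at.
for scale in (1, 2, 4, 8):
    n, d = 70e9 * scale, 1.4e12 * scale   # Chinchilla's 70B params / 1.4T tokens
    print(f"{scale}x scale: predicted loss ~= {chinchilla_loss(n, d):.3f}")
```

On these fitted values, predicted loss keeps falling as parameters and data grow, but each doubling buys a smaller absolute improvement; that shape is consistent both with Scott's claim that the curve has not run out and with critics' sense that visible gains per generation are shrinking.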
2. What is Scott's perspective on the perception of a plateau in LLM progress?
- Scott pushes back against the idea that AI progress has stalled, suggesting that tech giants like Microsoft still feel justified in investing heavily in larger AI models, betting on continued breakthroughs rather than hitting a capability plateau.
- He suggests that the perception of a plateau may stem from AI's sudden arrival in the public eye, when in fact LLMs had been developing for years before recent high-profile releases like GPT-4.
[02] Critiques of Continued LLM Progress
1. What are some of the criticisms and skepticism around the continued progress of LLMs?
- Some critics in the AI community, such as Gary Marcus, have argued that progress in LLMs has plateaued around GPT-4-class models, with recent models like Google's Gemini 1.5 Pro and Anthropic's Claude 3 Opus failing to show the dramatic leaps in capability seen between earlier generations.
- The perception of a plateau has been fueled by informal observations and some benchmark results, leading to the belief that LLM development may be approaching diminishing returns.
2. How do critics respond to the argument that tech companies have "something we don't know about" in terms of continued LLM progress?
- Critic Ed Zitron argues that defending continued investment in generative AI on the grounds that OpenAI or other companies hold a "big, sexy, secret technology" that will break the "bones of every hater" is unconvincing.
- Zitron contends there is no evidence that these companies possess secret technology capable of driving endless breakthroughs in LLM capabilities.