
Why Big Tech Wants to Make AI Cost Nothing
Abstract
The article discusses Meta's decision to open-source and freely release the Llama 3.1 large language model (LLM), which is competitive with OpenAI's ChatGPT and Anthropic's Claude. It explores the potential reasons behind the release, including the "commoditize your complement" business strategy, the challenges of training such large models, and the implications for smaller AI startups. The article also touches on the broader trends in the AI infrastructure build-out and its potential impact on various industries.
Q&A
[01] Meta's Release of Llama 3.1
1. What are the key reasons behind Meta's decision to open-source and release Llama 3.1 for free?
- The article suggests that Meta may be employing the "commoditize your complement" business strategy: by making the LLM freely available, it increases demand for complementary products and services, such as server and GPU rentals from cloud providers.
- Another potential reason is to enable more user-generated content on Meta's platforms, which can then be monetized through advertising.
- The article also suggests that Meta may see little value in holding a second-place proprietary general-purpose LLM, and that open-sourcing the model could instead help establish it as an industry standard.
2. How does the scale of Meta's AI infrastructure compare to other companies?
- According to the article, by the end of 2024 Meta will have the compute equivalent of 600,000 H100 GPUs, which could enable it to train roughly 75 GPT-4-scale models every 90 days, or about 300 such models per year.
- This scale of infrastructure dwarfs what other companies like OpenAI and Anthropic currently have, potentially allowing Meta to release even larger and more capable models in the future.
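The arithmetic behind the 75-models-per-quarter figure can be sketched as a back-of-the-envelope calculation. The per-run cost is an assumption, not a number from the article: a GPT-4-scale training run is often publicly estimated at roughly 25,000 A100 GPUs for about 90 days, and treating one H100 as worth roughly three A100s in training throughput gives on the order of 8,000 H100-equivalents per run.

```python
# Back-of-the-envelope check of the "75 GPT-4-scale models every 90 days"
# claim. All per-run figures below are rough public estimates (assumptions),
# not numbers taken from the article itself.

meta_h100_equivalents = 600_000  # Meta's projected fleet, end of 2024

# Assumption: one GPT-4-scale run ~ 25,000 A100s for ~90 days,
# and one H100 ~ 3x an A100 in training throughput.
a100s_per_run = 25_000
h100_per_a100 = 3
h100_equivalents_per_run = a100s_per_run // h100_per_a100  # ~8,333

# How many 90-day runs fit in the fleet at once, and per year.
concurrent_runs = meta_h100_equivalents // h100_equivalents_per_run
runs_per_year = concurrent_runs * (365 // 90)  # ~4 quarters per year

print(concurrent_runs)  # -> 72, close to the article's ~75
print(runs_per_year)    # -> 288, close to the article's ~300
```

The result lands near the article's figures, which suggests the article is working from similar per-run estimates; changing the assumed H100-to-A100 ratio or run size shifts the totals proportionally.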
[02] Impact on Smaller AI Startups
1. How might the open-sourcing of large language models by tech giants like Meta impact smaller AI startups?
- The article suggests that the "big losers" in the commoditization of LLMs may ultimately be the current "hot and disruptive AI startups" like OpenAI, Anthropic, Character.ai, Cohere, and Mistral.
- When the largest tech companies start giving away these startups' core product (LLMs) for free, the startups may find it significantly harder to compete.
2. Is there still a chance for smaller companies to outflank the tech giants in the race towards artificial general intelligence (AGI) or artificial superintelligence (ASI)?
- The article suggests that if these smaller companies have some modeling or R&D edge that doesn't simply involve having a massive number of GPUs, then perhaps there is still a chance they can outflank the tech giants.
- The article notes that OpenAI started with fundamental R&D in areas like DOTA 2 bots, robotics, and reinforcement learning, and that the original GPT model was a side project. A singular focus on LLMs may therefore be a distraction from the fundamental research that could lead to more capable models and new avenues of inquiry.
[03] Broader Implications of AI Infrastructure Build-out
1. How does the current AI infrastructure build-out compare to the infrastructure build-out preceding the dot-com bubble?
- The article draws a parallel between the current AI infrastructure build-out and the telecom infrastructure build-out of the late 1990s that preceded the dot-com bust.
- Just as the laying of fiber-optic cable and broadband infrastructure paved the way for Web 2.0 companies like Facebook and Google, the current AI infrastructure build-out may enable breakthroughs in other areas such as robotics, autonomous vehicles, and drug development.
2. What are the potential implications of the rapid growth in AI infrastructure and model scaling?
- The article suggests that the sheer scale of the current AI infrastructure build-out, with companies like Meta having the equivalent of 600,000 H100 GPUs, gives hope for future breakthroughs.
- However, the article also raises the question of whether the current path of scaling ever larger multimodal transformer models will ultimately lead to artificial general intelligence (AGI) or even artificial superintelligence (ASI).