OpenAI, the company behind the widely used AI chatbot ChatGPT, has warned that the research strategy that produced the model has reached its limits. OpenAI CEO Sam Altman said during an event at the Massachusetts Institute of Technology that future advances will no longer come from making models bigger. ChatGPT has drawn enormous interest and investment in AI since its launch in November 2022, but Altman believes that further progress on transformers, the machine learning architecture behind ChatGPT, lies beyond scaling.
Also Read: OpenAI Co-Founder & Chief Data Scientist On the Potential of AGI
The End of the Era of Giant AI Models
According to Altman, OpenAI’s successful strategy of scaling existing machine-learning algorithms up to previously unimagined sizes has reached its limits. He believes that further progress in AI will come from finding new ways to improve the algorithms rather than simply making them larger. Altman suggested that the days of “giant, giant models” are over. That approach, in which models grow ever larger and are trained on ever more data, has helped fuel AI’s rapid progress over the past decade. Altman’s comments suggest that researchers may need to look in new directions to achieve future advances.
The Costs of Scaling AI Models
One primary reason behind the pivot away from “scaling is all you need” is the high cost of training and running large language models (LLMs) on the powerful graphics processing units (GPUs) they require. Altman noted that training ChatGPT cost over $100 million and reportedly involved more than 10,000 GPUs. And GPUs are not cheap: Nvidia’s latest H100 GPUs, designed specifically for AI and high-performance computing (HPC), can cost as much as $30,603 per unit.
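To put those figures in perspective, here is a rough back-of-envelope sketch in Python. The GPU count and the H100 list price are simply the numbers cited above; in practice, large training runs typically rent cloud capacity rather than buying hardware outright, so this is an illustration of scale, not an actual cost breakdown.

```python
# Back-of-envelope estimate of GPU hardware cost for a large training run.
# Both figures are assumptions taken from the article: roughly 10,000 GPUs
# at the quoted Nvidia H100 list price of $30,603 per unit.
gpu_count = 10_000
h100_unit_price = 30_603  # USD per GPU (quoted list price)

hardware_cost = gpu_count * h100_unit_price
print(f"Hardware alone: ${hardware_cost:,}")  # Hardware alone: $306,030,000
```

Even this crude estimate lands in the hundreds of millions of dollars before accounting for power, networking, and engineering, which helps explain why "just make it bigger" stops being an attractive strategy.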
As the costs of developing and running large AI models have continued to increase, the economics of scale have turned against the “bigger is better” approach. Altman believes that progress will come from improving model architectures, enhancing data efficiency, and advancing algorithmic techniques beyond copy-paste scaling. Although this is a significant shift, researchers like Nick Frosst, co-founder at Cohere, agree that progress beyond scaling is necessary. Frosst says that new AI model designs, architectures, and further tuning based on human feedback are promising directions that many researchers are already exploring.
Elon Musk’s Latest GPU Purchase
Despite the shift away from scaling AI models, access to GPUs remains crucial for training and running large AI models. In a recent Twitter Spaces interview, Elon Musk revealed that his companies, Tesla and Twitter, were buying thousands of GPUs, including for a new AI venture. Musk confirmed that he was purchasing the GPUs but noted that availability might be an issue. Even through major cloud providers like Microsoft, Google, and Amazon, it can take months to reserve GPU access.
Also Read: Elon Musk’s AI Paradox: Investing in AI Research After Calling for Pause
Our Say
While OpenAI’s strategy of making models ever larger has been successful, CEO Sam Altman believes that further progress in AI will come from finding new ways to improve algorithms beyond scaling. Researchers and companies that develop AI models will need to explore new avenues of improvement. Despite the shift away from scaling, access to GPUs remains crucial for developing and running AI models. However, the high costs associated with GPUs may limit access to this essential resource, pushing researchers to look for more affordable alternatives.