By mid-2024, large language models (LLMs) were running into diminishing returns to scale in training data and computational capacity. LLM training began to shift away from costly pre-training towards cheaper fine-tuning and towards allowing LLMs to ‘reason’ for longer before replying to questions.
Fine-tuning uses chain-of-thought (CoT) training data that includes questions and the logical steps to reach correct answers. This increases the efficiency of learning for smaller AI models, such as DeepSeek. CoT data can be extracted from large ‘teacher’ LLMs to train small ‘student’ models.
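To make the teacher/student idea concrete, here is a minimal sketch of how CoT training records might be collected from a teacher model. Everything in it is an assumption for illustration: `hypothetical_teacher` is a canned stand-in rather than a real model API, and the `Step:` / `Answer:` output format and the filter against reference answers are invented here, not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CoTExample:
    """One fine-tuning record: a question, the reasoning steps, and the answer."""
    question: str
    reasoning_steps: List[str]
    answer: str


def hypothetical_teacher(question: str) -> str:
    # Stand-in for a call to a large 'teacher' LLM (assumption: it returns
    # lines prefixed with "Step:" and a final line prefixed with "Answer:").
    canned = {
        "What is 12 * 7?": (
            "Step: 12 * 7 = 10 * 7 + 2 * 7\n"
            "Step: 70 + 14 = 84\n"
            "Answer: 84"
        ),
    }
    return canned[question]


def parse_teacher_output(question: str, raw: str) -> CoTExample:
    # Split the teacher's raw text into reasoning steps and a final answer.
    steps: List[str] = []
    answer = ""
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith("Step:"):
            steps.append(line[len("Step:"):].strip())
        elif line.startswith("Answer:"):
            answer = line[len("Answer:"):].strip()
    return CoTExample(question, steps, answer)


def build_cot_dataset(
    questions: List[str],
    teacher: Callable[[str], str],
    reference_answers: Dict[str, str],
) -> List[CoTExample]:
    # Keep only examples whose final answer matches a known-correct reference,
    # so the student is trained on reasoning that reaches correct answers.
    dataset: List[CoTExample] = []
    for q in questions:
        example = parse_teacher_output(q, teacher(q))
        if example.answer == reference_answers.get(q):
            dataset.append(example)
    return dataset


dataset = build_cot_dataset(
    ["What is 12 * 7?"],
    hypothetical_teacher,
    {"What is 12 * 7?": "84"},
)
```

The resulting records would then be used as supervised fine-tuning data for the smaller ‘student’ model; that training step is not shown here.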
These changes shift the cost structure of AI models from high pre-training costs to lower fine-tuning costs for model developers and higher inference costs for users. While smaller models are cheaper to use, the positive effect of higher AI demand is likely to exceed the negative effect of lower prices. Price competition between models will increase, resulting in tighter margins for AI firms. Specialised models can still fetch premium prices.
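The claim that the demand effect outweighs the price effect can be illustrated with a toy calculation. This sketch assumes a constant-elasticity demand curve, which the abstract does not specify; the function name, the baseline values and the elasticities are all hypothetical. Under that assumption, total spending rises when prices fall only if demand is elastic (elasticity above 1).

```python
def total_spending(
    price: float,
    elasticity: float,
    base_price: float = 1.0,
    base_quantity: float = 100.0,
) -> float:
    """Total spending (price * quantity) under constant-elasticity demand.

    Assumed demand curve: quantity = base_quantity * (price / base_price) ** -elasticity
    """
    quantity = base_quantity * (price / base_price) ** (-elasticity)
    return price * quantity


# Halve the price and compare spending before and after.
elastic_before = total_spending(1.0, elasticity=1.5)
elastic_after = total_spending(0.5, elasticity=1.5)      # demand effect dominates
inelastic_before = total_spending(1.0, elasticity=0.5)
inelastic_after = total_spending(0.5, elasticity=0.5)    # price effect dominates

print(f"elastic demand:   {elastic_before:.1f} -> {elastic_after:.1f}")
print(f"inelastic demand: {inelastic_before:.1f} -> {inelastic_after:.1f}")
```

With elastic demand, spending goes up after the price cut; with inelastic demand, it goes down. The abstract's prediction amounts to saying that demand for AI services falls in the first case.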
Cheaper LLMs are an opportunity for European Union companies to catch up in building smaller AI models and applications on top of LLMs. Increased demand for AI services will require more investment in computing infrastructure, including in the EU. Investing in large LLMs and the corresponding hyperscale infrastructure is riskier, especially as price competition between models increases.
Knowledge extraction between AI models puts pressure on model developers to protect their investments against free-riding by others. It also creates a dilemma for policymakers: should they favour free-riding to promote competition and innovation, or should they clamp down and reinforce monopoly rents to stimulate investment in AI models? Past policy will not be an appropriate response in a world that offers vastly expanded opportunities for knowledge pooling and innovation at lower cost.
The paper contains errors about AI technology and should not be taken at face value. Notably, its understanding of distillation is wrong.
Unfortunately, it also lacks an analysis of EU law, which makes the paper rather useless. The EU almost always goes for monopoly rents in these matters, which does not stimulate anything. The EU has no content industry able to compete with the US’s, no major tech industry, and it is clearly not developing AI companies either.