According to a report by TechCrunch on January 9, Elon Musk said during a live conversation with Stagwell Chairman Mark Penn that AI training has largely exhausted the available real-world data. "We have exhausted the cumulative sum of human knowledge, and this happened last year," he said. Musk's view aligns with that of former OpenAI Chief Scientist Ilya Sutskever, who said at the NeurIPS machine learning conference that the AI industry has reached "peak data" and that the way models are developed may need to change. Musk believes synthetic data will supplement real data, with AI teaching itself by generating data and then grading its own output. Technology giants including Microsoft, Meta, OpenAI, and Anthropic have already adopted this approach, and models such as Microsoft's Phi-4 and Google's Gemma were trained on a mix of real and synthetic data. Gartner predicts that about 60% of the data used in AI and analytics projects will be synthetically generated by 2024. One advantage of synthetic data is lower cost: the AI startup Writer spent only about $700,000 to develop its Palmyra X 004 model, which was trained almost entirely on synthetic data, compared with approximately $4.6 million for a model of similar scale at OpenAI. Synthetic data also carries risks, however, including reduced model creativity, amplified output bias, and potential model collapse. The risk is greatest when the data used to train the generating model is itself biased, because that bias carries over into the synthetic data it produces.