Folding@home promoted COVID research through crowdsourced computing during the pandemic. This article explores how to apply that model to deep learning and analyzes the potential and challenges of decentralized training. It originates from an article written by Jeff Amico, organized and compiled by Shenchao TechFlow.

Folding@home achieved a major milestone during the COVID-19 pandemic. The research project reached 2.4 exaFLOPS of computing power, delivered by 2 million volunteer devices worldwide. That was fifteen times the processing power of the world's largest supercomputers at the time, allowing scientists to simulate COVID protein dynamics at scale. Their work advanced our understanding of the virus and its pathogenesis, especially early in the pandemic.

(Figure: Global distribution of Folding@home users, 2021)

Crowdsourcing computing resources to solve problems

Folding@home builds on a long history of volunteer computing projects that solve large-scale problems through crowdsourced computing resources. The idea gained widespread attention in the 1990s with SETI@home, a project that brought together more than 5 million volunteer computers in the search for extraterrestrial life. The idea has since been applied to a variety of fields, including astrophysics, molecular biology, mathematics, cryptography, and gaming. In each case, the collective strength extended each project's capabilities well beyond what any individual machine could achieve. This drives progress and enables research to be conducted in a more open and collaborative manner.

Can the crowdsourcing model be used for deep learning?

Many people wonder whether we can apply this crowdsourcing model to deep learning.
In other words, can we train a large neural network across the devices of the masses? Frontier model training is one of the most computationally intensive tasks in human history. As with many @home projects, the current costs are beyond the reach of all but the largest players. This could hinder future progress, as we rely on fewer and fewer companies to find new breakthroughs. It also concentrates control of our AI systems in the hands of a few. No matter how you feel about the technology, this is a future worth watching.

Most critics dismiss the idea of decentralized training as incompatible with current training technology. However, this view is increasingly outdated. New techniques have emerged that reduce the need for communication between nodes, allowing efficient training on devices with poor network connectivity. These include DiLoCo, SWARM Parallelism, lo-fi, and decentralized training of foundation models in heterogeneous environments. Many of them are fault-tolerant and support heterogeneous computing. There are also new architectures designed specifically for decentralized networks, including DiPaCo and decentralized mixture-of-experts models.

We are also seeing a variety of cryptographic primitives begin to mature, enabling networks to coordinate resources on a global scale. These technologies support application scenarios such as digital currency, cross-border payments, and prediction markets. Unlike earlier volunteer projects, these networks can aggregate staggering amounts of computing power, often orders of magnitude larger than the largest cloud training clusters currently being planned.

Together, these elements form a new model-training paradigm. This paradigm takes advantage of the world's computing resources, including the vast number of edge devices that could be harnessed if wired together. It will reduce the cost of most training workloads by introducing new competition mechanisms.
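The common thread in communication-reducing methods such as DiLoCo is that each node takes many gradient steps locally and synchronizes with the others only occasionally, rather than after every step. A minimal illustrative sketch of that idea follows — this is a toy local-SGD loop on a quadratic problem, not the actual DiLoCo algorithm, and the loss, step counts, and worker setup are all assumptions for illustration:

```python
import numpy as np

# Toy sketch of local updates with infrequent synchronization, the core
# idea behind communication-reducing training methods like DiLoCo.
# Each "worker" k minimizes 0.5 * (w - target_k)^2 on its own data;
# the global optimum is the mean of the targets.

def local_sgd(targets, rounds=50, local_steps=20, lr=0.1):
    """Each round: every worker takes `local_steps` gradient steps
    independently (no network traffic), then all workers communicate
    once to average their parameters."""
    w = 0.0  # shared initial parameter
    for _ in range(rounds):
        local_ws = []
        for t in targets:
            w_k = w
            for _ in range(local_steps):
                grad = w_k - t       # gradient of 0.5 * (w_k - t)^2
                w_k -= lr * grad     # purely local SGD step
            local_ws.append(w_k)
        w = float(np.mean(local_ws))  # the only communication per round
    return w

# Three heterogeneous "devices" holding different local data.
print(local_sgd([1.0, 2.0, 6.0]))  # converges toward mean(targets) = 3.0
```

With synchronous data-parallel training, the workers would exchange gradients every step; here they exchange parameters once per round, cutting communication by a factor of `local_steps` — which is what makes training over poorly connected devices plausible.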
It can also unlock new forms of training, making model development collaborative and modular rather than siloed and monolithic. Models could draw computation and data from the public and learn on the fly. Individuals could own parts of the models they help build. Researchers could also share novel findings publicly again, without needing to monetize them to cover large computing budgets.

This report examines the current state of large-model training and its associated costs. It reviews previous distributed computing efforts — from SETI to Folding to BOINC — for inspiration in exploring an alternative path. The report then discusses the historical challenges of decentralized training and turns to recent breakthroughs that may help overcome them. Finally, it summarizes the opportunities and challenges ahead.

The current state of frontier model training

The cost of frontier model training has become unaffordable for all but the largest players. This trend is not new, but the situation is becoming more acute as frontier labs continue to push scaling assumptions. OpenAI has reportedly spent more than $3 billion on training this year. Anthropic predicts that by 2025 we will begin training $10 billion models, and that $100 billion models are not far behind.

This trend drives industry concentration, as only a handful of companies can afford to participate. It raises a core policy question for the future: can we accept a situation where all leading AI systems are controlled by one or two companies? It also limits the rate of progress, which is evident in the research community, since smaller labs cannot afford the computing resources required to run experiments at scale. Industry leaders have mentioned this many times as well:

Meta's Joe Spisak: To truly understand the capabilities of a [model] architecture, you have to explore it at scale, and I think that's what's missing in the current ecosystem.
If you look at academia — there are a lot of brilliant people in academia, but they lack access to computing resources, and that becomes a problem, because they have these great ideas but no way to implement them at the required scale.

Max Ryabinin, Together: The need for expensive hardware puts a lot of pressure on the research community. Most researchers are unable to participate in large-scale neural network development because the necessary experiments are cost-prohibitive for them, and this will only worsen if we continue to grow models by scaling them up.

Francois Chollet, Google: We know that large language models (LLMs) have not yet achieved artificial general intelligence (AGI). Meanwhile, progress toward AGI has stalled. The limitations we face with large language models are exactly the same limitations we faced five years ago. We need new ideas and breakthroughs. I think the next breakthrough is likely to come from an outside team, while all the big labs are busy training ever-bigger large language models.

Some are skeptical of these concerns, arguing that hardware improvements and cloud computing capital expenditures will solve the problem. But this seems unrealistic. For one thing, by the end of this decade, new generations of Nvidia chips will have significantly more FLOPs, perhaps 10 times as many as today's H100s. This will reduce the price per FLOP by 80–90%. Likewise, total FLOP supply is expected to increase approximately 20-fold over the next decade, along with improvements in networking and related infrastructure. All of this will increase training efficiency per dollar.

Source: SemiAnalysis AI Cloud TCO Model

Meanwhile, total FLOP…