Author: Teng Yan, Chain of Thought; Translation: Jinse Finance Xiaozou
I have a regret that still haunts me. It was the most obvious investment opportunity for anyone paying attention, yet I did not invest a single dollar. No, I'm not talking about the next Solana killer or a meme coin with a funny hat.
But rather... NVIDIA.
In just one year, NVIDIA's market capitalization soared from $1 trillion to $3 trillion. It tripled, outperforming even Bitcoin over the same period.
Of course, the hype around artificial intelligence cannot be overlooked, but a significant part of it is grounded in reality. NVIDIA reported fiscal year 2024 revenue of $60 billion, a remarkable 126% increase over fiscal 2023.
So why did I miss it?
For two years, I was heads-down in crypto, never looking up or paying attention to what was happening in AI. It was a big mistake, and it still bothers me to this day.
But I won’t make the same mistake again.
Today, Crypto AI feels very similar. We are on the verge of an explosion of innovation. It reminds me of the California Gold Rush of the mid-19th century, and it is hard to ignore: industries and towns sprang up overnight, infrastructure advanced at breakneck speed, and fortunes went to those who dared to take the leap.
Just like early NVIDIA, in hindsight, Crypto AI will also be an obvious opportunity.
In the first part of this article, I will explain why Crypto AI is the most exciting opportunity for investors and builders today.
A simple overview is as follows:
Many still think it is a fantasy.
Crypto AI is still in its early stages, and it may be 1-2 years away from the peak of hype.
There is at least a $230 billion growth opportunity in this field.
Essentially, Crypto AI is AI built on crypto infrastructure. This means it is more likely to follow AI's exponential growth trajectory than the broader crypto market's. So to stay ahead, it is crucial to follow the latest AI research on arXiv and talk to founders who believe they are building the next great products and services.
In the second part of this article, I will delve into the four most promising subfields of Crypto AI:
Decentralized Computing: Training, Inference, and the GPU Market
Data Networks
Verifiable AI
On-chain AI Agents
To write this article, I spent several weeks conducting in-depth research and talking to founders and teams in the Crypto AI space, and this article is the culmination of those efforts. It does not delve into every single area in detail; rather, you can think of it as a high-level roadmap designed to spark your curiosity, elevate your research efforts, and guide your investment thinking.
1. Crypto AI Landscape
I depict the decentralized AI stack as a multi-layer ecosystem: it starts with decentralized computing and open data networks, supporting decentralized AI model training.
Cryptography, cryptoeconomic incentives, and evaluation networks are then used to verify each inference, both its inputs and its outputs. These verified outputs feed autonomous on-chain AI agents, as well as consumer and enterprise AI applications that users can genuinely trust.
The coordination network connects everything together, enabling seamless communication and collaboration across the entire ecosystem.
In this vision, anyone building AI can leverage one or multiple layers of this stack according to their specific needs. Whether using decentralized computing for model training or utilizing evaluation networks to ensure high-quality outputs, the stack provides a variety of options.
Due to the inherent composability of blockchain, I believe we will naturally gravitate towards a modular future. Each layer is becoming highly specialized, and protocols are optimized for different functionalities rather than adopting an integrated approach.
A large number of startups have gathered at every layer of the decentralized AI stack, most of which were founded in the past 1-3 years. It is clear that the field is still in its early stages.
The most comprehensive and up-to-date Crypto AI startup map I have seen is maintained by Casey and her team at topology.vc. It is an invaluable resource for anyone tracking this field.
As I delve into the subfields of Crypto AI, I constantly ask myself: how big are the opportunities? I am not interested in small change—I am looking for markets that can scale to hundreds of billions.
(1) Market Size
Let’s first look at the market size. When evaluating a niche, I ask myself: Is it creating a whole new market or disrupting an existing one?
Take decentralized computing as an example. This is a disruptive category whose potential can be evaluated by observing the existing cloud computing market, which is currently valued at around $680 billion and is expected to reach $2.5 trillion by 2032.
Brand-new markets, such as AI agents, are harder to quantify. With no historical data, sizing them requires a mix of educated guesses and judgment about the problems they are actually solving. And it's worth remembering that sometimes what looks like a new market is really just a solution in search of a problem.
(2) Timing
Timing is everything. Over time, technology tends to improve and become less expensive, but the pace of development varies.
How mature is the technology in a specific niche? Is it ready for scalable adoption, or is it still in the research phase, with practical applications requiring several more years? Timing determines whether an industry is worth immediate attention or ‘wait and see’.
Take fully homomorphic encryption (FHE) as an example: its potential is undeniable, but its development is currently too slow to be widely adopted. We may need a few more years to see it gain mainstream acceptance. By first focusing on areas closer to scalability, I can invest my time and energy in fields that are gaining momentum and opportunities.
If I were to map these categories to a scale versus time chart, it would look like this. Keep in mind that this is still a conceptual diagram, not a hard guideline. There are many nuances—for example, in verifiable inference, different methods (like zkML and opML) have different levels of readiness for use.
That said, I believe AI will be so vast that even fields that look 'niche' today may grow into significant markets.
It is also worth noting that technological advancements do not always progress in a straight line—they often leap forward. When sudden breakthroughs occur, my perspectives on timing and market size will change.
With this framework, let’s take a closer look at the various subfields.
2. Field One: Decentralized Computing
Decentralized computing is the pillar of decentralized AI.
The GPU market, decentralized training, and decentralized inference are closely linked.
The supply side usually comes from small to medium-sized data centers and consumer GPUs.
The demand side is still small but growing. Today it comes from price-sensitive, latency-insensitive users and from smaller AI startups.
The biggest challenge facing web3 GPU markets today is simply getting them to work properly.
Coordinating GPUs on decentralized networks requires advanced engineering techniques and a well-designed, reliable network architecture.
2.1 GPU Market/Computing Network
Several Crypto AI teams are building decentralized networks that tap latent computing power around the world to address the gap between GPU supply and demand.
The core value proposition of the GPU market has three aspects:
You can access computing at '90% lower' prices than AWS because there are no intermediaries and the supply side is open. Essentially, these markets allow you to leverage the world's lowest marginal computing costs.
Greater flexibility: no lock-in contracts, no KYC processes, no waiting times.
Censorship resistance
To address the supply-side issues in these markets, the computational power comes from:
Hard-to-source enterprise-grade GPUs (like A100s and H100s) from small and medium-sized data centers, or from Bitcoin miners looking to diversify. I also know of teams working on large government-funded infrastructure projects, where data centers have been built out as part of technology growth programs. These providers are usually incentivized to keep their GPUs on the network, since it helps them offset the GPUs' amortization costs.
Consumer-grade GPUs from millions of gamers and home users who connect their computers to the network in exchange for token rewards.
On the other hand, today’s demand for decentralized computing comes from:
Price-sensitive, latency-insensitive users. This niche prioritizes price over speed. Think of researchers exploring new fields, independent AI developers, and other cost-conscious users who do not need real-time processing. Many of them find the traditional hyperscalers (like AWS or Azure) out of reach on their budgets. Because this group is widely dispersed, targeted marketing is crucial to reach them.
Small AI startups that face challenges in obtaining flexible, scalable computing resources without signing long-term contracts with major cloud providers. Business development is key to attracting this niche as they are actively seeking alternatives to hyperscale lock-in.
Crypto AI startups that build decentralized AI products but lack their own computing supply will need to leverage the resources of one of these networks.
Cloud Gaming: While not directly driven by AI, the demand for GPU resources in cloud gaming is increasing.
The key point to remember is that developers always prioritize cost and reliability.
The real challenge lies in demand, not supply.
Startups in this field often consider the scale of their GPU supply networks as a marker of success. But this is misleading—it is at best a vanity metric.
The real constraint is not supply, but demand. The key metrics to track are not the number of available GPUs but the utilization and the actual number of GPUs rented out.
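To make the point concrete, here is a minimal sketch of how I would compare two networks on the metric that actually matters. All names and figures below are hypothetical, purely for illustration:

```python
from dataclasses import dataclass

@dataclass
class GpuNetwork:
    name: str
    gpus_listed: int        # GPUs advertised on the network (the vanity number)
    hours_rented: float     # GPU-hours actually billed over the period
    hours_available: float  # GPU-hours offered over the period

    @property
    def utilization(self) -> float:
        """Share of offered GPU-hours that someone actually paid for."""
        return self.hours_rented / self.hours_available

# Hypothetical example: the smaller network is the healthier business.
big_supply = GpuNetwork("BigSupplyNet", 100_000, 2.5e6, 7.2e7)
small_supply = GpuNetwork("SmallSupplyNet", 8_000, 3.6e6, 5.8e6)

for net in (big_supply, small_supply):
    print(f"{net.name}: {net.gpus_listed} GPUs listed, "
          f"utilization = {net.utilization:.1%}")
```

A network listing 100,000 GPUs at 3% utilization is in worse shape than one listing 8,000 GPUs at 60%; headline supply tells you almost nothing.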
Tokens excel at guiding supply, creating the incentives needed for rapid scaling. However, they do not inherently solve the demand problem. The real test is to get the product to a good enough state to meet potential demand.
Haseeb Qureshi (Dragonfly) has made this point well.
Making computing networks truly work
Contrary to common belief, the biggest hurdle for web3 distributed GPU markets right now is simply getting them to work properly.
This is not a trivial problem.
Coordinating GPUs in a distributed network is extremely complex, with many challenges—resource allocation, dynamic workload scaling, load balancing between nodes and GPUs, latency management, data transfer, fault tolerance, and dealing with various hardware scattered across different geographical locations. I could go on.
Achieving this requires thoughtful engineering design and a reliable, well-designed network architecture.
To better understand, think of Google's Kubernetes. It is widely regarded as the gold standard for container orchestration, automating processes like load balancing and scaling in distributed environments, which is very similar to the challenges faced by distributed GPU networks. Kubernetes itself was built on over a decade of experience at Google, and even then, it took years of relentless iteration to perform well.
Some GPU computing markets that are already online can handle small-scale workloads, but once they attempt to scale, problems arise. I suspect this is due to poor foundational design in their architecture.
Another challenge/opportunity for decentralized computing networks is ensuring credibility: verifying that each node actually provides the claimed computational power. Currently, this relies on network reputation, and in some cases, computing power providers are ranked based on reputation scores. Blockchain seems well-suited for trustless verification systems. Startups like Gensyn and Spheron are striving to address this issue using a trustless approach.
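As a toy illustration of the reputation-based approach (not how Gensyn, Spheron, or any specific network actually works), a provider's score might blend its historical job reliability with the results of spot-check jobs whose answers are already known. Every name and weight in this sketch is an assumption:

```python
from dataclasses import dataclass

@dataclass
class Provider:
    node_id: str
    jobs_completed: int
    jobs_failed: int
    spot_checks_passed: int   # benchmark jobs with known correct outputs
    spot_checks_total: int

def reputation(p: Provider, w_history: float = 0.6, w_checks: float = 0.4) -> float:
    """Blend long-run job reliability with verified spot-check results."""
    total_jobs = p.jobs_completed + p.jobs_failed
    history = p.jobs_completed / total_jobs if total_jobs else 0.0
    checks = (p.spot_checks_passed / p.spot_checks_total
              if p.spot_checks_total else 0.0)
    return w_history * history + w_checks * checks

providers = [
    Provider("node-a", 940, 60, 48, 50),
    Provider("node-b", 120, 80, 20, 50),
]
# Rank providers so the scheduler prefers trustworthy compute first.
for p in sorted(providers, key=reputation, reverse=True):
    print(p.node_id, round(reputation(p), 3))
```

The trustless versions of this idea replace the reputation score with cryptographic or cryptoeconomic proofs that the work was actually done.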
Today, many web3 teams are still grappling with these challenges, which means the door of opportunity is wide open.
Decentralized Computing Market Size
How big is the decentralized computing network market?
Today, it may capture only a small slice of the cloud computing industry, which sits at around $680 billion and is headed toward $2.5 trillion. But despite the extra friction for users, there will always be some demand as long as costs stay below those of traditional providers.
I believe costs will stay low in the short to medium term thanks to token subsidies and supply unlocked from providers who are not price-sensitive (for example, if I can rent out my gaming laptop for extra cash, I'm happy whether it's $20 or $50 a month).
However, the true growth potential of decentralized computing networks—and the true expansion of their TAM—will emerge in the following situations:
Decentralized training of AI models becomes practical.
The demand for inference is surging, and existing data centers cannot meet it. This situation is already starting to show. Jensen Huang stated that inference demand will grow 'a billion-fold'.
Appropriate service level agreements (SLA) become available, addressing a significant barrier to enterprise adoption. Currently, decentralized computing's performance leaves users experiencing different levels of service quality (e.g., uptime percentages). With SLAs, these networks can offer standardized reliability and performance metrics, making decentralized computing a viable alternative to traditional cloud computing providers.
Decentralized permissionless computing is the foundational layer of the decentralized AI ecosystem—its infrastructure.
Although the GPU supply chain keeps expanding, I believe we are still at the dawn of the intelligence age, and demand for compute will continue to outstrip supply.
Keep an eye out for potential turning points that could trigger a complete re-evaluation of all GPU markets, which may come soon.
Other considerations:
The pure GPU market is crowded, with fierce competition between decentralized platforms and the emergence of new web2 AI cloud services (like the rise of Vast.ai and Lambda).
Demand for small nodes (like 4 x H100s) is limited because of their restricted use cases, but good luck finding anyone selling large clusters; those are still in real demand.
Will one dominant player aggregate the compute supply for decentralized protocols, or will supply stay fragmented across multiple markets? I lean towards the former, since consolidation typically improves infrastructure efficiency. But it will take time, and in the meantime fragmentation and messiness continue.
Developers want to focus on application development, not on deployment and configuration hassles. Markets must abstract these complexities to make computing access as frictionless as possible.
2.2 Decentralized Training
If the scaling law holds, then training the next generation of frontier AI models in a single data center will one day become impossible.
Training AI models requires transferring vast amounts of data between GPUs. The lower data transfer (interconnect) speeds between distributed GPUs are often the biggest hurdle.
Researchers are exploring multiple methods in parallel and making breakthroughs (for example, Open DiLoCo and DisTrO). These advances will build on one another, accelerating progress in the field.
The future of decentralized training may hinge on designing small, dedicated models for niche applications rather than frontier, AGI-centered models.
As the paradigm shifts towards models like OpenAI's o1, demand for inference will soar, creating opportunities for decentralized inference networks.
Imagine this: a massive, world-changing AI model developed not in secretive elite labs but shaped by millions of ordinary people. GPUs that gamers normally use to render cinematic explosions in Call of Duty are lent to something grander: a collectively owned, open-source AI model with no central gatekeepers.
In such a future, foundation-scale models will not be confined to top AI labs.
But let’s root this vision in the current reality. Currently, heavyweight AI training remains concentrated in centralized data centers, which may persist for some time.
Companies like OpenAI are expanding their vast clusters. Elon Musk recently announced that xAI is set to build a data center equivalent to 200,000 H100 GPUs.
But this is not just about raw GPU counts. Model FLOPS utilization (MFU), a metric introduced in Google's 2022 PaLM paper, tracks how much of a GPU's theoretical peak capacity is actually used. Surprisingly, MFU typically hovers around 35-40%.
Why so low? According to Moore’s Law, GPU performance has surged dramatically in recent years, but improvements in networking, memory, and storage have lagged significantly, creating bottlenecks. As a result, GPUs often sit idle, waiting for data.
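To make the MFU idea concrete, here is a back-of-the-envelope sketch. It uses the common approximation of roughly 6 FLOPs per parameter per training token (forward plus backward pass); the model size, throughput, and hardware figures are illustrative assumptions, not measurements:

```python
def model_flops_utilization(params: float,
                            tokens_per_second: float,
                            num_gpus: int,
                            peak_flops_per_gpu: float) -> float:
    """MFU = observed training FLOPs per second / theoretical peak FLOPs per second.

    Uses the ~6 * params FLOPs-per-token approximation for forward + backward.
    """
    observed_flops = 6 * params * tokens_per_second
    peak_flops = num_gpus * peak_flops_per_gpu
    return observed_flops / peak_flops

# Illustrative numbers: a 70B-parameter model on 1,024 H100s
# (assumed ~989 TFLOPS peak dense BF16 each), processing ~800k tokens/s.
mfu = model_flops_utilization(
    params=70e9,
    tokens_per_second=8e5,
    num_gpus=1024,
    peak_flops_per_gpu=989e12,
)
print(f"MFU = {mfu:.0%}")   # lands in the ~30-40% range discussed above
```

In other words, even in a well-run centralized cluster, a large share of the purchased FLOPS is never used, which is exactly the gap the networking, memory, and storage bottlenecks create.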
Today’s AI training remains highly centralized due to one word—efficiency.
Training large models depends on the following technologies:
Data Parallelism: Splitting datasets across multiple GPUs for parallel operations, accelerating the training process.
Model Parallelism: Distributing parts of the model across multiple GPUs, circumventing memory constraints.
These methods require GPUs to constantly exchange data, making interconnect speed—the rate at which data transfers across computers in the network—critically important.
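To see why data parallelism forces constant communication, here is a minimal numpy sketch of one data-parallel step: each simulated worker computes gradients on its shard of the batch, and the gradients are then averaged, which stands in for the all-reduce that interconnect speed constrains. This is a simplified toy, not a real training framework:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model y = X @ w, trained with mean-squared error.
w = np.zeros(8)
X, y = rng.normal(size=(256, 8)), rng.normal(size=256)

def local_gradient(w, X_shard, y_shard):
    """Gradient of the MSE loss on one worker's shard of the batch."""
    err = X_shard @ w - y_shard
    return 2 * X_shard.T @ err / len(y_shard)

num_workers, lr = 4, 0.05
for step in range(100):
    # Split the global batch across workers (data parallelism).
    grads = [
        local_gradient(w, X_shard, y_shard)
        for X_shard, y_shard in zip(np.array_split(X, num_workers),
                                    np.array_split(y, num_workers))
    ]
    # "All-reduce": average gradients across workers every single step.
    # In a real cluster, this is the communication that interconnect
    # bandwidth and latency bottleneck.
    w -= lr * np.mean(grads, axis=0)

print("final loss:", np.mean((X @ w - y) ** 2))
```

The key observation is that the synchronization happens every step; the slower the link between workers, the more time each step spends waiting instead of computing.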
When the costs of training cutting-edge AI models exceed $1 billion, every efficiency gain is crucial.
With high-speed interconnects, centralized data centers can shuttle data between GPUs quickly, saving substantial training time and cost, which decentralized setups cannot match.
Overcoming slow interconnect speeds
If you talk to people working in the AI field, many will tell you that decentralized training simply doesn’t work.
In a decentralized setup, GPU clusters are not physically co-located, thus transferring data between them is much slower and becomes a bottleneck. Training requires GPUs to synchronize and exchange data at each step. The farther apart they are, the higher the latency. Higher latency means slower training speeds and higher costs.
Training that takes days in a centralized data center can stretch to weeks in a decentralized one, at higher cost. That is simply not viable.
But this is about to change.
The good news is that there has been a surge of research interest in distributed training. Researchers are exploring multiple approaches simultaneously, and a wealth of research and published papers attests to this. These advances will build upon each other, accelerating progress in the field.
It also extends to testing in production environments, to see just how far the boundaries can be pushed.
Some decentralized training techniques are already capable of handling smaller models in slow interconnect environments. Currently, cutting-edge research is pushing these methods for application in large models.
For example, Prime Intellect's open-source work on DiLoCo demonstrates a practical approach: GPU 'islands' perform 500 local steps before synchronizing, cutting bandwidth requirements by up to 500 times. What began as Google DeepMind research on small models was scaled within months to training a 10-billion-parameter model, and the whole thing is now fully open source.
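To give a feel for the general local-update-then-sync pattern (this is a toy sketch, not Prime Intellect's actual implementation; the real method applies an outer optimizer to parameter deltas, while here the islands' parameters are simply averaged), each island takes many local steps on its own data and only then communicates:

```python
import numpy as np

rng = np.random.default_rng(1)

def local_grad(w, X, y):
    """MSE gradient on an island's local data."""
    return 2 * X.T @ (X @ w - y) / len(y)

num_islands, local_steps, lr, dim = 4, 500, 0.01, 8
# Each island holds its own shard of data.
shards = [(rng.normal(size=(128, dim)), rng.normal(size=128))
          for _ in range(num_islands)]
w_global = np.zeros(dim)

for outer_round in range(10):
    island_weights = []
    for X, y in shards:
        w = w_global.copy()
        # Many cheap local steps, with no communication at all.
        for _ in range(local_steps):
            w -= lr * local_grad(w, X, y)
        island_weights.append(w)
    # Synchronize only once per round: average island parameters.
    # Communication drops from every step to once per `local_steps` steps.
    w_global = np.mean(island_weights, axis=0)

loss = np.mean([np.mean((X @ w_global - y) ** 2) for X, y in shards])
print("average loss after 10 sync rounds:", round(loss, 4))
```

The bandwidth saving comes directly from the ratio of local steps to synchronization rounds: the islands talk to each other hundreds of times less often.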
Nous Research is raising the bar with their DisTrO framework, which reduces communication requirements between GPUs by a staggering 10,000 times while training a 1.2B parameter model.
And this momentum is only increasing. Last December, Nous announced a pre-training of a 15B parameter model that matched or even exceeded the typical results of centralized training in terms of loss curve (how model error decreases over time) and convergence rate (the speed at which model performance stabilizes). Yes, better than centralized.
SWARM Parallelism and DTFMHE are other distinct methods for training large AI models across different types of devices, even if those devices have varying speeds and connectivity levels.
Managing a heterogeneous range of GPU hardware is another major challenge, including the memory-constrained consumer GPUs typical of decentralized networks. Techniques like model parallelism (splitting model layers across devices) can help here.
The Future of Decentralized Training
Currently, the model scales of decentralized training methods are still far below frontier models (GPT-4 is reportedly close to a trillion parameters, 100 times larger than Prime Intellect’s 10B model). To achieve true scalability, we need breakthroughs in model architecture, better network infrastructure, and smarter cross-device task allocation.
We can dream big. Imagine a world where the GPU computing power gathered through decentralized training exceeds even that of the largest centralized data centers.
Pluralis Research (an elite team focused on decentralized training that is worth watching closely) believes this is not only possible but inevitable. Centralized data centers are limited by physical conditions like space and power availability, while decentralized networks can tap into a truly limitless global resource pool.
Even Jensen Huang of NVIDIA admits that asynchronous decentralized training can unleash the true potential of AI scaling. Distributed training networks are also more fault-tolerant.
Thus, in a possible future world, the most powerful AI models globally will be trained in a decentralized manner.
This is an exciting prospect, but I am not fully convinced yet. We need stronger evidence to demonstrate that decentralized training of the largest models is both technically and economically feasible.
This is where I see the most promise: the sweet spot for decentralized training may be small, specialized open-source models designed for targeted use cases, rather than competing with the ultra-large, AGI-driven frontier models. Certain architectures, especially non-transformer models, are already proving well-suited to decentralized setups.
This puzzle has another part: tokens. Once decentralized training becomes feasible at scale, tokens can play a key role in incentivizing and rewarding contributors, effectively guiding these networks.
The road to realizing this vision is still long, but progress is encouraging. As future model sizes will exceed the capacity of a single data center, advancements in decentralized training will benefit everyone, including large tech companies and top AI research labs.
The future is distributed. When a technology has such broad potential, history shows that it often performs better and faster than anyone expects.
2.3. Decentralized Inference
Currently, most of AI's computing power is concentrated on training large-scale models. Top AI labs are in a race to develop the best foundational models and ultimately achieve AGI.
But my view is that in the coming years, this focus on training will shift to inference. As AI becomes more integrated into the applications we use daily—from healthcare to entertainment—the amount of computational resources required to support inference will be staggering.
This is not just speculation. Inference-time compute scaling is the latest buzzword in AI. OpenAI's recent release of the preview/mini versions of its latest model, o1 (codenamed Strawberry), marks exactly this shift: instead of answering right away, the model takes time to think, first working out what steps are needed to answer the question and then working through them step by step.
This model is designed for more complex tasks that require extensive planning, such as crossword puzzles, as well as deeper reasoning questions. You will notice it slows down, taking more time to generate responses, but the results are more thoughtful and nuanced. Its running costs are also significantly higher (25 times that of GPT-4).
The shift in focus is clear: the next leap in AI performance will not only come from training larger models but from scaling computing applications during inference.
If you want to dig deeper, some research findings include:
Scaling inference compute through resampling can yield significant improvements across a variety of tasks (see the sketch after this list).
There is also a scaling law for inference.
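To illustrate what 'scaling inference compute through resampling' means in practice, here is a minimal best-of-n sketch. The `generate` and `score` functions are placeholders for whatever model and verifier/reward model you plug in; only the control flow is the point:

```python
import random
from typing import Callable

def best_of_n(prompt: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 16) -> str:
    """Spend extra inference compute by sampling n candidates and keeping the best."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score(prompt, c))

# Stand-in model and verifier so the sketch runs end to end.
def toy_generate(prompt: str) -> str:
    return f"answer draft #{random.randint(0, 999)}"

def toy_score(prompt: str, candidate: str) -> float:
    return random.random()  # a real verifier or reward model goes here

print(best_of_n("What is 17 * 24?", toy_generate, toy_score, n=8))
```

Larger n means more compute per query and, with a good verifier, better answers; that is the inference-time scaling trade-off in miniature.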
Once powerful models are trained, their inference tasks—the things the model does—can be transferred to decentralized computing networks. This makes sense because:
The resources required for inference are far smaller than for training. Once trained, models can be compressed and optimized with techniques such as quantization, pruning, or distillation (see the quantization sketch after this list). They can even be split up to run on everyday consumer devices. You don't need high-end GPUs to support inference.
This is already happening. Exo Labs has found a way to run the 405B-parameter Llama 3 model on consumer-grade hardware like MacBooks and Mac Minis. Distributing inference across multiple devices can handle large-scale workloads efficiently and economically.
Better user experience. Running computing closer to users can reduce latency, which is critical for real-time applications like gaming, AR, or autonomous vehicles. Every millisecond counts.
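As a small illustration of the compression step mentioned above, here is a minimal symmetric int8 weight-quantization sketch in numpy. Real inference stacks use more sophisticated schemes (per-channel scales, activation quantization, and so on), so treat this only as the basic idea:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric int8 quantization: store 8-bit ints plus one float scale."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4096, 4096)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

print("memory: %.0f MB -> %.0f MB" % (w.nbytes / 1e6, q.nbytes / 1e6))
print("mean abs error:", float(np.mean(np.abs(w - w_hat))))
```

Cutting each weight from 32 bits to 8 shrinks memory roughly fourfold at a small accuracy cost, which is what lets large models fit on consumer devices in the first place.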
Think of decentralized inference as a CDN (content delivery network) for AI: just as a CDN serves websites quickly by connecting you to nearby servers, decentralized inference taps nearby computing power to deliver AI responses in record time. By embracing decentralized inference, AI applications become more efficient, more responsive, and more reliable.
The trend is clear. Apple’s newly launched M4 Pro chip competes with NVIDIA’s RTX 3070 Ti, which until recently was the domain of hardcore gamers. Our hardware is becoming increasingly capable of handling advanced AI workloads.
The value crypto adds
For decentralized inference networks to succeed, compelling economic incentives are essential. Nodes in the network need to be compensated for their contributions of computational power. This system must ensure rewards are distributed fairly and effectively. Geographic diversity is necessary to reduce latency in inference tasks and improve fault tolerance.
What is the best way to build decentralized networks? Crypto.
Tokens provide a powerful mechanism to coordinate participants' interests, ensuring everyone works towards the same goal: to expand the network and increase token value.
Tokens also accelerate the growth of the network. They help solve the classic chicken-and-egg problem that hinders the development of most networks by rewarding early adopters and driving participation from day one.
The success of Bitcoin and Ethereum proves this point—they have already gathered the largest pools of computing power on Earth.
Decentralized inference networks will be next. Due to geographic diversity, they reduce latency and improve fault tolerance, bringing AI closer to users. Under crypto incentives, they will scale faster and better than traditional networks.
(To be continued, stay tuned)