Author: Teng Yan, Chain of Thought; Translation: Golden Finance xiaozou
One of my biggest regrets, one that still haunts me, is passing on what was undoubtedly the most obvious investment opportunity to anyone paying attention. I didn't put a penny in. No, I'm not talking about the next Solana killer or a dog memecoin wearing a funny hat.
But...NVIDIA.
In just one year, NVDA's market capitalization soared from $1 trillion to $3 trillion, tripling and even outperforming Bitcoin over the same period.
Of course, there is AI hype involved, but a lot of it is grounded in reality. NVIDIA reported revenue of $60 billion for fiscal 2024, an astonishing 126% increase from fiscal 2023.
So why did I miss it?
For two years, I was so focused on crypto that I had tunnel vision and ignored what was happening in AI. It was a huge mistake, and it still eats at me.
But I won't make the same mistake again.
Crypto AI feels very similar today. We are on the verge of an explosion of innovation. The parallels to the mid-1800s California Gold Rush are too strong to ignore: industries and towns sprang up overnight, infrastructure advanced at breakneck speed, and fortunes went to those who dared to think and act.
Like NVIDIA in its early days, Crypto AI will be an opportunity that will be obvious in hindsight.
In the first part of this article, I will explain why Crypto AI is the most exciting underdog opportunity for investors and builders today.
A brief overview is as follows:
Many people still think it is fantasy.
Crypto AI is still in its early stages and may be 1-2 years away from peak hype.
There is at least $230 billion in growth opportunities in this area.
Essentially, Crypto AI is AI built on crypto infrastructure. That means its growth is more likely to track the exponential trajectory of AI than that of the broader crypto market. So to stay ahead, you have to keep up with the latest AI research on arXiv and talk to founders who believe they are building the next great products and services.
In the second part of this article, I will delve deeper into four of the most promising subfields in Crypto AI:
Decentralized computing: training, inference, and the GPU market
Data networks
Verifiable AI
AI agents running on-chain
This article is the culmination of several weeks of in-depth research and conversations with founders and teams across the Crypto AI space. It is not meant to be an exhaustive dive into every area; rather, think of it as a high-level roadmap designed to spark your curiosity, inform your research, and guide your investment thinking.
1. Crypto AI Landscape
I picture the decentralized AI stack as a multi-layered ecosystem: one end starts with decentralized compute and open data networks that support decentralized AI model training.
Every inference, both its inputs and its outputs, is then verified using a combination of cryptography, cryptoeconomic incentives, and evaluation networks. These verified outputs flow to AI agents that can run autonomously on-chain, and to consumer and enterprise AI applications that users can truly trust.
The orchestration network ties everything together, enabling seamless communication and collaboration across the ecosystem.
In this vision, anyone building AI can leverage one or more layers of this stack, depending on their specific needs. Whether leveraging decentralized computing for model training or using evaluation networks to ensure high-quality output, the stack offers a range of options.
Due to the inherent composability of blockchain, I believe we are naturally heading towards a modular future. Each layer is becoming highly specialized, with protocols optimized for different functions, rather than taking an all-in-one integrated approach.
There is a large concentration of startups at each layer of the decentralized AI stack, most of which were founded in the past 1-3 years. It is clear that the field is still in its early stages.
The most comprehensive and up-to-date Crypto AI startup map I’ve seen is maintained by Casey and her team at topology.vc. It’s an invaluable resource for anyone tracking the space.
As I dug into each Crypto AI sub-field, I kept asking myself: how big is the opportunity? I wasn't interested in small niches; I was looking for markets that could reach hundreds of billions of dollars.
(1) Market size
Let’s start with market size. When evaluating a segment, I ask myself: Is it creating an entirely new market or disrupting an existing one?
Take decentralized computing, for example. This is a disruptive category whose potential can be assessed by looking at the existing cloud computing market, which is currently valued at around $680 billion and is expected to reach $2.5 trillion by 2032.
New markets that have never existed before, such as AI agents, are harder to quantify. Without historical data, sizing them requires educated guesses and a judgment about the problem they are solving. It is important to note that sometimes, what looks like a new market is actually a solution in search of a problem.
(2) Timing
Timing is everything. Technology tends to improve and become less expensive over time, but the pace of development varies.
How mature is the technology in a particular niche? Is it ready for mass adoption, or is it still in the research phase, with real-world applications still years away? Timing determines whether an industry deserves immediate attention or a “wait and see” approach.
Take fully homomorphic encryption (FHE): its potential is undeniable, but its current development rate is still too slow to be widely used. We may still need a few years before it sees mainstream adoption. By focusing on areas that are closer to scale first, I can spend my time and energy on areas that are gathering momentum and opportunity.
If I were to map these categories onto a market-size vs. timing chart, it would look something like this. Keep in mind that this is a conceptual sketch, not a hard and fast guide. There is plenty of nuance: within verifiable AI, for instance, different approaches like zkML and opML are at different levels of readiness.
That said, I believe AI will grow so large that even areas that seem “niche” today could evolve into a significant market.
It’s also worth noting that technological progress doesn’t always proceed in a straight line — it often comes in spurts. When there are spurts, my view on timing and market size will change.
With this framework in place, let’s look at each subfield in more detail.
2. Area 1: Decentralized Computing
Decentralized computing is the backbone of decentralized artificial intelligence.
The GPU market, decentralized training, and decentralized inference are closely linked.
The supply side typically comes from small and medium-sized data centers and consumer GPUs.
Demand is small but growing. Today, it comes from price-sensitive, latency-insensitive users and smaller AI startups.
The biggest challenge facing Web3 GPU markets right now is simply getting them to work properly.
Coordinating GPUs on a decentralized network requires advanced engineering and a well-designed, reliable network architecture.
2.1 GPU Market/Computing Network
With GPU supply unable to keep up with demand, several Crypto AI teams are building decentralized networks that harness the world's latent computing power.
The core value proposition of the GPU market has three aspects:
You can access compute at prices “90% lower” than AWS because there are no middlemen and the supply side is open. Essentially, these marketplaces allow you to take advantage of the lowest marginal compute costs around the world.
Greater flexibility: no lock-in contracts, no KYC processes, no waiting times.
Censorship resistance
On the supply side, the compute in these marketplaces comes from:
Enterprise-grade GPUs (e.g. A100s, H100s) from small and medium-sized data centers that struggle to find demand on their own, or from Bitcoin miners looking to diversify. I also know of teams working on large government-funded infrastructure projects where data centers were built as part of technology growth initiatives. These providers are often incentivized to keep their GPUs on the network, which helps them offset the amortized cost of the hardware.
Millions of consumer-grade GPUs from gamers and home users who connect their machines to the network in exchange for token rewards.
On the other hand, today’s demand for decentralized computing comes from:
Price-sensitive, latency-insensitive users. This segment prioritizes price over speed: think researchers exploring new domains, independent AI developers, and other cost-conscious users who don't need real-time processing. Because of budget constraints, traditional hyperscalers such as AWS or Azure may be out of reach for many of them. Since they are widely scattered across the population, targeted marketing is crucial to reach this group.
Small AI startups face the challenge of obtaining flexible, scalable computing resources without long-term contracts with major cloud providers. Business development is critical to attracting this segment as they actively seek alternatives to hyperscale lock-in.
Crypto AI startups that build decentralized AI products but don’t have their own computing power supply will need to tap into the resources of one of these networks.
Cloud gaming: Although not directly driven by AI, cloud gaming is increasingly demanding GPU resources.
The key thing to remember is that developers prioritize reliability over cost.
The real challenge lies in demand, not supply.
Startups in this space often point to the size of their GPU supply network as a sign of success. But this is misleading — it’s a vanity metric at best.
The real constraint is not supply, but demand. The key metric to track is not the number of GPUs available, but rather the utilization and number of GPUs actually rented out.
Tokens are great at bootstrapping supply, creating the incentives needed to scale quickly. However, they don’t inherently solve the demand problem. The true test is getting the product to a state where it’s good enough to realize latent demand.
Haseeb Qureshi (Dragonfly) makes this point well.
Making computing networks actually work
Contrary to popular belief, the biggest hurdle web3 distributed GPU marketplaces face today is simply getting them to work properly.
This is not a trivial issue.
Coordinating GPUs in a distributed network is extremely complex and there are many challenges – resource allocation, dynamic workload scaling, load balancing across nodes and GPUs, latency management, data transfer, fault tolerance, and dealing with a wide variety of hardware spread across geographical locations. I could go on.
Achieving this requires thoughtful engineering and a solid, well-designed network architecture.
For perspective, consider Google's Kubernetes. It is widely regarded as the gold standard for container orchestration, automating processes such as load balancing and scaling in distributed environments, challenges very similar to those faced by distributed GPU networks. Kubernetes was built on more than a decade of Google's experience, and even then it took years of relentless iteration to get it right.
Some of the GPU computing markets that have come online can handle small workloads, but once they try to scale, they have problems. I suspect this is because their architectural foundations are poorly designed.
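To make the orchestration problem concrete, here is a deliberately simplified Python sketch (with hypothetical node and job fields; no real marketplace works exactly like this) of the placement decision such a network has to make constantly: find a node with enough free memory, acceptable latency, and a good track record. Everything hard lives outside this function: preemption, retries, data locality, mid-job failures, and heterogeneous hardware.

    from dataclasses import dataclass

    @dataclass
    class Node:
        node_id: str
        free_vram_gb: float   # GPU memory currently available
        latency_ms: float     # measured round-trip latency to the requester
        reliability: float    # 0..1 score built up from past completed jobs

    @dataclass
    class Job:
        job_id: str
        vram_gb: float        # memory the job needs
        max_latency_ms: float # latency budget the buyer will tolerate

    def pick_node(job: Job, nodes: list[Node]) -> Node | None:
        """Greedy placement: among nodes that satisfy the job's memory and
        latency constraints, prefer the most reliable, then the least loaded."""
        feasible = [n for n in nodes
                    if n.free_vram_gb >= job.vram_gb
                    and n.latency_ms <= job.max_latency_ms]
        if not feasible:
            return None  # nothing fits: queue the job or reject it
        return max(feasible, key=lambda n: (n.reliability, n.free_vram_gb))

    # Hypothetical example
    nodes = [Node("dc-eu-1", free_vram_gb=80, latency_ms=120, reliability=0.97),
             Node("gamer-us-4", free_vram_gb=12, latency_ms=40, reliability=0.82)]
    print(pick_node(Job("finetune-7b", vram_gb=40, max_latency_ms=200), nodes))

A production network has to make this decision under churn, with stale information, and with providers who have an incentive to misreport their capacity, which leads directly to the verification problem below.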
Another challenge/opportunity for decentralized computing networks is ensuring trustworthiness: verifying that each node actually provides the computing power it claims. Currently, this relies on network reputation, and in some cases, computing power providers are ranked based on reputation scores. Blockchains seem well suited to trustless verification systems. Startups like Gensyn and Spheron are seeking to solve this problem using a trustless approach.
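Purely as an illustrative sketch of one verification idea (redundant execution with spot checks; this is not Gensyn's or Spheron's actual protocol, both of which are far more sophisticated), a network can occasionally send the same deterministic task to two independent providers and compare result hashes:

    import hashlib

    def output_hash(output: bytes) -> str:
        return hashlib.sha256(output).hexdigest()

    def spot_check(task_input: bytes, run_on_node_a, run_on_node_b) -> bool:
        """Send one deterministic task to two independently chosen nodes and
        compare output hashes. `run_on_node_*` stand in for hypothetical RPC
        calls that return the raw result bytes from each provider."""
        return output_hash(run_on_node_a(task_input)) == output_hash(run_on_node_b(task_input))

    # Hypothetical usage: agreement boosts both providers' reputations;
    # disagreement triggers a third check or an on-chain dispute before anyone is slashed.

Even this toy version hints at the real difficulty: GPU floating-point results are not always bit-identical across different hardware, so practical schemes need deterministic kernels, tolerance bounds, or cryptographic proofs rather than naive hash comparison.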
Many web3 teams are still tackling these challenges today, which means the door of opportunity is open.
Decentralized computing market size
How big is the market for decentralized computing networks?
Today, it may be just a small part of the $680 billion to $2.5 trillion cloud computing industry. However, despite the added friction for users, there will always be demand as long as the cost is lower than that of traditional providers.
I believe costs will stay low in the short to medium term thanks to token subsidies and supply unlocked from providers who aren't price-sensitive (for example, if I can rent out my gaming laptop for some extra cash, I'm happy whether it's $20 or $50 a month).
But the real growth potential of decentralized computing networks — and the true expansion of their TAM — will occur when:
Decentralized training of AI models becomes practical.
The demand for inference is surging, and existing data centers can no longer keep up with it. This is already starting to show. Jensen Huang said that the demand for inference will grow "a billion times."
Appropriate service level agreements (SLAs) are becoming available, addressing a key barrier to enterprise adoption. Currently, decentralized computing operations provide users with varying levels of service quality (e.g., uptime percentage). With SLAs, these networks can provide standardized reliability and performance metrics, making decentralized computing a viable alternative to traditional cloud computing providers.
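What a standardized SLA would actually contain is still an open question for these networks; purely as a hypothetical illustration, the machine-readable terms an enterprise buyer would expect might look something like this:

    # Hypothetical SLA terms a decentralized compute network could commit to.
    example_sla = {
        "monthly_uptime_pct": 99.5,        # minimum availability per billing month
        "max_job_start_delay_s": 60,       # queue time from submission to execution
        "p95_inference_latency_ms": 250,   # latency target for served requests
        "breach_remedy": "pro-rata refund plus slashing of the provider's stake",
    }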
Decentralized permissionless computing is the foundational layer — the infrastructure — of the decentralized AI ecosystem.
Although the GPU supply chain keeps expanding, I believe we are only at the dawn of the intelligence age, and demand for compute will be insatiable.
It is worth noting that an inflection point, one that could trigger a re-rating of the entire GPU marketplace sector, may arrive soon.
Other notes:
Pure GPU marketplaces are crowded, with fierce competition among decentralized platforms and the rise of emerging web2 AI clouds such as Vast.ai and Lambda.
Small nodes (e.g. 4 x H100) aren't in huge demand because of their limited use, but good luck finding anyone selling large clusters: those remain in solid demand.
Will one dominant player aggregate all the compute supply for decentralized protocols, or will it stay fragmented across multiple marketplaces? I lean toward the former, since consolidation generally improves infrastructure efficiency. But it will take time, and in the meantime fragmentation and messiness will persist.
Developers want to focus on application development rather than dealing with deployment and configuration. Marketplaces must abstract these complexities and make access to compute as frictionless as possible.
2.2 Decentralized Training
If scaling laws hold, training the next generation of cutting-edge AI models in a single data center will one day become impossible.
Training AI models requires transferring large amounts of data between GPUs. The low data transfer (interconnection) speed between distributed GPUs is often the biggest obstacle.
Researchers are exploring multiple approaches simultaneously and are making breakthroughs (e.g. Open DiLoCo, DisTrO). These advances will add up and accelerate progress in the field.
The future of decentralized training may lie in designing small, specialized models for niche applications rather than cutting-edge, AGI-centric models.
With the move to models like OpenAI o1, inference needs will skyrocket, creating opportunities for decentralized inference networks.
Imagine this: a massive, world-changing AI model developed not in secretive elite labs but shaped by millions of ordinary people. Gamers whose GPUs normally render cinematic explosions in Call of Duty are now lending their hardware to something much bigger: an open-source, collectively owned AI model with no central gatekeeper.
In such a future, foundation-scale models are not limited to top AI labs.
But let’s ground this vision in the reality of the present. Right now, the bulk of heavyweight AI training is still concentrated in centralized data centers, and this will likely be the norm for some time.
Companies like OpenAI are expanding their massive clusters, and Elon Musk recently announced that xAI is about to build a data center with the equivalent of 200,000 H100 GPUs.
But it's not just about raw GPU counts. Model FLOPS Utilization (MFU), a metric proposed by Google in its 2022 PaLM research paper, tracks how efficiently a GPU's maximum capacity is being used. Surprisingly, MFU typically hovers between 35-40%.
Why so low? GPU performance has soared in recent years, roughly in line with Moore's Law, but improvements in networking, memory, and storage have lagged far behind, creating a bottleneck. As a result, GPUs frequently sit idle, waiting for data.
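To make the metric concrete, here is a back-of-the-envelope MFU estimate in the spirit of the PaLM paper's definition, using made-up but plausible throughput numbers (the 6-FLOPs-per-parameter-per-token rule of thumb and the A100 peak figure are the only "real" values here):

    # Rough MFU estimate: achieved training FLOP/s divided by theoretical peak.
    params         = 10e9       # 10B-parameter model
    tokens_per_sec = 126_000    # observed cluster-wide training throughput (illustrative)
    num_gpus       = 64
    peak_flops     = 312e12     # ~312 TFLOP/s, A100 bf16 dense peak

    achieved = 6 * params * tokens_per_sec          # ~6 FLOPs per param per token
    mfu = achieved / (num_gpus * peak_flops)
    print(f"MFU ~ {mfu:.0%}")                       # ~38%, right in the 35-40% range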
Today's AI training is still highly centralized because of one word - efficiency.
Training large models depends on the following techniques:
Data parallelism: Split the dataset across multiple GPUs to perform operations in parallel, speeding up the training process.
Model parallelism: Distribute parts of the model across multiple GPUs to bypass memory constraints.
These approaches require GPUs to constantly exchange data, making interconnect speed — the rate at which data can be transferred across computers in a network — critical.
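A back-of-the-envelope calculation shows why (all numbers are illustrative): naively synchronizing the gradients of a 10B-parameter model in fp16 means moving roughly 20 GB per training step, which is trivial over datacenter interconnects and crippling over the public internet.

    # Naive gradient-sync cost per step, ignoring overlap, compression, and topology.
    params            = 10e9
    grad_bytes        = params * 2          # fp16 gradients: ~20 GB per step
    nvlink_bytes_s    = 900e9               # ~900 GB/s, H100-class NVLink
    internet_bytes_s  = 0.125e9             # ~1 Gbit/s home connection

    print(f"inside a data center: ~{grad_bytes / nvlink_bytes_s:.2f} s per step")
    print(f"over the internet:    ~{grad_bytes / internet_bytes_s:.0f} s per step (~2.7 min)")

This is the gap decentralized training research attacks: not by making the pipe faster, but by sending far less data, far less often.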
When the cost of training cutting-edge AI models exceeds $1 billion, every efficiency gain counts.
Through high-speed interconnects, centralized data centers are able to quickly transfer data between GPUs and achieve significant cost savings in training time, which is unmatched by decentralized settings.
Overcoming slow interconnect speeds
If you talk to anyone working in the AI field, many will tell you that decentralized training simply doesn’t work.
In a decentralized setting, GPU clusters are not physically co-located, so transferring data between them is much slower and becomes a bottleneck. Training requires GPUs to synchronize and exchange data at every step. The farther apart they are, the higher the latency. Higher latency means slower training and higher costs.
What might take days in a centralized data center could stretch to two weeks in a decentralized setting, and cost more. That simply isn't viable.
But that’s about to change.
The good news is that there has been a surge of interest in distributed training. Researchers are exploring multiple approaches simultaneously, as evidenced by the large number of studies and published papers. These advances will add up and accelerate progress in the field.
It’s also about testing in production and seeing how far we can push the boundaries.
Some decentralized training techniques have been developed to handle smaller models on slow interconnects, and cutting-edge research is now advancing the application of these methods to larger models.
For example, Prime Intellect's open-source OpenDiLoCo work demonstrates a practical approach in which GPU "islands" perform 500 local steps before syncing, cutting bandwidth requirements by a factor of 500. What started as Google DeepMind research on small models was scaled to a 10-billion-parameter model by November, and is now fully open source.
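To see the communication pattern behind this, here is a toy, runnable sketch on a trivial quadratic problem (this is not Prime Intellect's code, and the real recipe uses AdamW inside each island and Nesterov momentum outside):

    import numpy as np

    # Each "island" takes H local steps on its own data; only parameter deltas
    # are exchanged, once per outer round, instead of gradients at every step.
    rng = np.random.default_rng(0)
    island_data = [rng.normal(size=4) for _ in range(3)]   # one target per island
    H, inner_lr, outer_lr = 500, 0.01, 0.7

    theta = np.zeros(4)                                    # shared parameters
    for _ in range(10):                                    # outer rounds
        deltas = []
        for target in island_data:
            local = theta.copy()
            for _ in range(H):                             # local steps, no comms
                local -= inner_lr * 2 * (local - target)   # grad of ||local - target||^2
            deltas.append(theta - local)                   # the "outer gradient"
        theta -= outer_lr * np.mean(deltas, axis=0)        # the only sync per round
    print(theta, np.mean(island_data, axis=0))             # theta converges to the mean

With 500 local steps between syncs, the islands communicate 500x less often than lock-step data parallelism, which is where the bandwidth savings come from.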
Nous Research is raising the bar with their DisTrO framework, which uses an optimizer to reduce inter-GPU communication requirements by an astounding 10,000x while training a 1.2B parameter model.
And the momentum is only growing. In December, Nous announced the pre-training of a 15B parameter model with loss curves (how the model error decreases over time) and convergence rates (how quickly the model performance stabilizes) that match or even beat typical results from centralized training. Yes, better than centralized.
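DisTrO's internals are its own and are not simple sparsification, but to get intuition for how communication volume can shrink by orders of magnitude, here is a generic top-k gradient compression sketch (explicitly not DisTrO's algorithm): transmit only the largest entries of each gradient along with their indices.

    import numpy as np

    def topk_compress(grad: np.ndarray, k: int):
        """Keep only the k largest-magnitude gradient entries (generic technique)."""
        idx = np.argpartition(np.abs(grad), -k)[-k:]
        return idx.astype(np.int32), grad[idx].astype(np.float32)

    grad = np.random.default_rng(1).normal(size=1_200_000)       # stand-in gradient
    idx, vals = topk_compress(grad, k=120)
    full_bytes = grad.size * 4                                   # fp32 baseline
    sent_bytes = idx.nbytes + vals.nbytes
    print(f"compression ratio: ~{full_bytes / sent_bytes:.0f}x") # ~5000x here

In practice, error feedback and careful optimizer design are needed so that this kind of aggressive compression doesn't hurt convergence.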
SWARM Parallelism and DTFMHE are other different approaches to training large AI models across different types of devices, even if those devices have different speeds and connectivity levels.
Managing the wide variety of GPU hardware is another challenge, including the memory-constrained consumer GPUs typical in decentralized networks. Techniques like model parallelism (partitioning model layers across devices) can help achieve this.
The future of decentralized training
Model sizes for current decentralized training methods are still far below cutting-edge models (GPT-4 reportedly has nearly a trillion parameters, 100x larger than Prime Intellect’s 10B model). To achieve true scale, we will need breakthroughs in model architecture, better network infrastructure, and smarter distribution of tasks across devices.
We can dream big. Imagine a world where decentralized training can aggregate more GPU computing power than even the largest centralized data center can muster.
Pluralis Research (a crack team focused on decentralized training that's worth keeping an eye on) believes this is not only possible, but inevitable. Centralized data centers are limited by physical constraints such as space and power availability, while decentralized networks can tap into a truly unlimited global pool of resources.
Even NVIDIA's Jensen Huang admits that asynchronous decentralized training can unlock the true potential of AI scaling. Distributed training networks are also more fault-tolerant.
Therefore, in a possible future world, the world’s most powerful AI models will be trained in a decentralized manner.
This is an exciting prospect, but I’m not fully convinced yet. We need stronger evidence that decentralized training of the largest models is technically and economically feasible.
What I do see is great promise here: the sweet spot for decentralized training may be small, specialized, open-source models designed for targeted use cases, rather than competing with super-large, AGI-driven frontier models. Certain architectures, especially non-transformer models, have already proven well suited to decentralized settings.
There is another piece to the puzzle: tokens. Once decentralized training becomes feasible at scale, tokens can play a critical role in incentivizing and rewarding contributors, effectively bootstrapping these networks.
The road to this vision is long, but progress is encouraging. As future models scale beyond the capacity of a single data center, advances in decentralized training will benefit everyone, even large tech companies and top AI research labs.
The future is distributed. When a technology has such widespread potential, history shows it always works better and faster than anyone expects.
2.3 Decentralized Inference
Today, most AI compute is devoted to training large models, as top AI labs race to develop the best foundation models and, ultimately, achieve AGI.
But here’s my take: In the next few years, this focus on training will shift to inference. As AI becomes more integrated into the applications we use every day — from healthcare to entertainment — the amount of computing resources required to support inference will be staggering.
This isn't just speculation. Inference-time compute scaling is the latest buzzword in AI. OpenAI recently released the preview/mini versions of its latest model, o1 (codename: Strawberry). The big shift? The model takes time to think: it first asks itself what steps are needed to answer the question, then works through them one by one.
This model is designed for more complex, planning-heavy tasks, like crossword puzzles, and for questions that require deeper reasoning. You'll notice it's slower to generate a response, but the results are more thoughtful and nuanced. It's also far more expensive to run (25 times the cost of GPT-4).
The shift in emphasis is clear: the next leap in AI performance will come not just from training larger models, but from scaling up the application of compute during inference.
If you want to learn more, here are some research articles:
Scaling inference compute through repeated sampling yields large improvements across a variety of tasks (a toy illustration follows below).
There is also an exponential scaling law for inference.
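The intuition behind the repeated-sampling result above fits in a few lines: if a single attempt solves a task with probability p, and attempts are roughly independent (a simplifying assumption), then drawing N samples and keeping any correct one succeeds with probability 1 - (1 - p)^N, so coverage climbs rapidly as you spend more inference compute.

    # Toy illustration of inference-time scaling via repeated sampling.
    p = 0.05                                # hypothetical single-attempt success rate
    for n in (1, 10, 100, 1000):
        print(n, round(1 - (1 - p) ** n, 3))
    # 1 -> 0.05, 10 -> 0.401, 100 -> 0.994, 1000 -> 1.0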
Once powerful models are trained, their inference tasks — what the models do — can be offloaded to decentralized computing networks. This makes sense because:
Inference requires far fewer resources than training. Once trained, models can be compressed and optimized through techniques such as quantization, pruning, or distillation, and they can even be split apart to run on everyday consumer devices (a small illustration follows below). You don't need a high-end GPU to power inference.
This is already happening. Exo Labs has figured out how to run a 405B parameter Llama 3 model across consumer hardware like MacBooks and Mac Minis. Distributing inference across many devices can handle large-scale workloads efficiently and cost-effectively.
Better user experience. Running computations closer to the user reduces latency, which is critical for real-time applications like gaming, AR, or self-driving cars. Every millisecond counts.
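As a small, self-contained illustration of the compression point above (plain int8 quantization of one weight matrix; real deployments use more careful schemes such as group-wise or 4-bit quantization): shrinking weights from 32-bit floats to 8-bit integers cuts memory roughly 4x, which is a big part of why inference fits on consumer hardware.

    import numpy as np

    def quantize_int8(w: np.ndarray):
        """Symmetric per-tensor int8 quantization: store int8 weights plus one scale."""
        scale = np.abs(w).max() / 127.0
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        return q.astype(np.float32) * scale

    w = np.random.default_rng(0).normal(size=(4096, 4096)).astype(np.float32)
    q, scale = quantize_int8(w)
    print(f"fp32: {w.nbytes / 1e6:.0f} MB, int8: {q.nbytes / 1e6:.0f} MB")  # ~67 MB -> ~17 MB
    print(f"max abs error: {np.abs(w - dequantize(q, scale)).max():.4f}")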
Think of decentralized inference as a CDN (content delivery network) for AI: just as a CDN serves websites quickly by connecting you to nearby servers, decentralized inference taps nearby computing power to deliver AI responses in record time. By embracing decentralized inference, AI applications become more efficient, more responsive, and more reliable.
The trend is clear. Apple's new M4 Pro chip competes with Nvidia's RTX 3070 Ti, which until recently was the domain of hardcore gamers. Our hardware is increasingly capable of handling advanced AI workloads.
The Value Added by Crypto
For decentralized inference networks to succeed, there must be compelling economic incentives. Nodes in the network need to be compensated for their computational contributions. The system must ensure that rewards are distributed fairly and efficiently. Geographic diversity is necessary to reduce latency in inference tasks and improve fault tolerance.
What is the best way to build a decentralized network? Crypto.
Tokens provide a powerful mechanism to align the interests of participants, ensuring that everyone is working towards the same goal: scaling the network and increasing the value of the token.
Tokens also accelerate network growth. They help solve the classic chicken-and-egg problem that holds back most networks by rewarding early adopters and driving engagement from day one.
The success of Bitcoin and Ethereum proves this - they have gathered the largest pool of computing power on the planet.
Decentralized inference networks will be next. With geographical diversity, they reduce latency, improve fault tolerance, and bring AI closer to users. Crypto-incentivized, they will scale faster and better than traditional networks.
(To be continued, please stay tuned)