By Ed Roman, Managing Partner at Hack VC

Compiled by: 1912212.eth, Foresight News

AI + Crypto is one of the frontier areas attracting the most attention in the crypto market recently, covering topics such as decentralized AI training, GPU DePINs, and censorship-resistant AI models.

Behind these dazzling developments, we can't help but ask: is this a real technological breakthrough or just hype? This article aims to clear the fog: it analyzes the crypto x AI vision, discusses the real challenges and opportunities, and reveals which promises are empty and which are actually feasible.

Scenario #1: Decentralized AI Training

The problem with on-chain AI training is that it requires high-speed communication and coordination between GPUs, because neural network training relies on backpropagation, which must synchronize gradients across the cluster at every step. Nvidia has two innovations for this (NVLink and InfiniBand). These technologies make GPU communication extremely fast, but they are local-only: they work for GPU clusters located within a single data center (at 50+ gigabit speeds).

If you introduce a decentralized network, things suddenly become orders of magnitude slower due to higher network latency and lower bandwidth. Compared to the throughput of Nvidia's high-speed interconnects within a data center, such speeds are simply unworkable for AI training use cases.
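
To make the gap concrete, here is a rough back-of-envelope sketch (a hedged illustration with assumed figures, not measured benchmarks): the time to move one full set of gradients for a hypothetical 70B-parameter model over a datacenter-class link versus a typical wide-area connection between decentralized nodes.

```python
# Back-of-envelope sketch with assumed numbers (70B parameters, fp16 gradients):
# how long one full gradient exchange takes on a datacenter-class link versus a
# typical wide-area connection between decentralized nodes.

def sync_time_seconds(num_params: float, bytes_per_grad: int, link_gbps: float) -> float:
    """Time to move one full set of gradients over a link of the given speed."""
    total_bytes = num_params * bytes_per_grad
    bytes_per_second = link_gbps * 1e9 / 8  # convert gigabits/s to bytes/s
    return total_bytes / bytes_per_second

PARAMS = 70e9   # assumed model size
GRAD_BYTES = 2  # fp16 gradients

print(f"Datacenter-class link (400 Gb/s): {sync_time_seconds(PARAMS, GRAD_BYTES, 400):>7.1f} s per exchange")
print(f"Wide-area link (1 Gb/s):          {sync_time_seconds(PARAMS, GRAD_BYTES, 1):>7.1f} s per exchange")
```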

That said, there are some innovations that may offer hope for the future:

  • Distributed training over InfiniBand is happening at scale, as NVIDIA itself supports distributed, non-local training over InfiniBand through the NVIDIA Collective Communications Library (NCCL). However, it is still nascent, so adoption metrics remain to be seen. The physics bottleneck of distance still applies, so local training over InfiniBand remains much faster.

  • Some new research has been published on decentralized training that reduces communication synchronization time, which may make decentralized training more practical in the future.

  • Smart sharding and scheduling of model training can help improve performance. Likewise, new model architectures may be designed specifically for future distributed infrastructures (Gensyn is conducting research in these areas).

The data side of training is also challenging. Any AI training process involves processing large amounts of data. Typically, models are trained on centralized, secure data storage systems with high scalability and performance. This requires transferring and processing terabytes of data, and it is not a one-time cycle. Data is often noisy and contains errors, so it must be cleaned and converted into a usable format before training the model. This stage involves repetitive tasks of standardization, filtering, and handling missing values. All of this faces serious challenges in a decentralized environment.
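
As a rough illustration of the kind of repetitive preprocessing described above (a minimal sketch with hypothetical column names and thresholds, not any particular project's pipeline):

```python
# Minimal preprocessing sketch: handling missing values, filtering noise, and
# standardizing numeric features. Column names and thresholds are hypothetical.
import pandas as pd

def clean_training_batch(df: pd.DataFrame) -> pd.DataFrame:
    # Drop rows with missing labels; impute missing numeric features with the median.
    df = df.dropna(subset=["label"]).copy()
    numeric_cols = df.select_dtypes(include="number").columns.drop("label", errors="ignore")
    df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].median())

    # Filter out obviously noisy rows (here: text too short to be useful).
    df = df[df["text"].str.len() > 10]

    # Standardize numeric features to zero mean and unit variance.
    df[numeric_cols] = (df[numeric_cols] - df[numeric_cols].mean()) / df[numeric_cols].std()
    return df
```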

The training process is also iterative, which does not map naturally onto Web3. OpenAI went through thousands of iterations to achieve its results. On an AI team, a data scientist's most basic workflow includes defining goals, preparing data, and analyzing and structuring the data to extract important insights and make it suitable for modeling. A machine learning model is then developed to solve the defined problem, and its performance is validated against a test data set. The process is iterative: if the current model does not perform as expected, the expert returns to the data collection or model training stage to improve the results. Now imagine carrying out this process in a decentralized environment, where the most advanced existing frameworks and tools are not easily adapted to Web3.

Another problem with training AI models on-chain is that this market is far less interesting than inference. Today, training large language models consumes a great deal of GPU compute, but in the long run inference will become the main application scenario for GPUs. Consider how many large language models need to be trained to meet global demand versus how many customers use those models: the latter is far larger.

Scenario #2: Using Overly Redundant AI Inference to Reach Consensus

Another challenge with crypto and AI is verifying the accuracy of AI inference, because you cannot fully trust a single centralized party to perform the inference, and there is a risk that nodes will behave dishonestly. This challenge does not exist in Web2 AI because there is no decentralized consensus system.

The proposed solution is redundant computation: multiple nodes repeat the same AI inference operation, so that it can run in a trustless environment and avoid single points of failure.
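
A minimal sketch of this idea (illustrative only; `run_inference_on_node` is a stand-in for whatever each node actually executes): the same request is dispatched to several independent nodes and the majority answer is accepted.

```python
# Minimal sketch of redundant inference with a majority vote. The node call is a
# placeholder; in a real network it would dispatch the prompt to a remote GPU node.
import hashlib
from collections import Counter

def run_inference_on_node(node: str, prompt: str) -> str:
    # Placeholder: every honest node deterministically returns the same output here.
    return hashlib.sha256(prompt.encode()).hexdigest()

def redundant_inference(prompt: str, nodes: list, quorum: int) -> str:
    """Send the same request to every node and accept the majority answer."""
    results = [run_inference_on_node(n, prompt) for n in nodes]
    answer, votes = Counter(results).most_common(1)[0]
    if votes < quorum:
        raise RuntimeError("no quorum among nodes; request must be retried or escalated")
    return answer  # note: total compute cost scales linearly with len(nodes)

print(redundant_inference("score this DeFi pool", nodes=["node-a", "node-b", "node-c"], quorum=2))
```

The cost problem discussed next follows directly from the last comment: every extra node multiplies the compute bill.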

The problem with this approach, however, is that high-end AI chips are in extremely short supply. The wait time for high-end NVIDIA chips is measured in years, which drives up prices. If you require AI inference to be re-executed across multiple nodes, the cost multiplies with every added node, which is not feasible for many projects.

Scenario #3: Near-Term Web3-Specific AI Use Cases

Some have suggested that Web3 should have its own unique AI use cases specifically for Web3 clients. This could be (for example) a Web3 protocol that uses AI to risk-score DeFi pools, a Web3 wallet that suggests new protocols to users based on their wallet history, or a Web3 game that uses AI to control non-player characters (NPCs).

For now (in the short term), this is a nascent market where use cases are still being explored. Some challenges include:

  • Because market demand is still in its infancy, there is less potential AI transaction volume from Web3-native use cases.

  • There are fewer customers: orders of magnitude fewer Web3 customers than Web2 customers, so the addressable market is that much smaller.

  • The customers themselves are less stable: they are startups with less funding, and some of them will die off over time. A Web3 AI service provider catering to Web3 customers may need to continually re-acquire a portion of its customer base to replace those that disappear, making the business much harder to scale.

Long term, we are very bullish on Web3 native AI use cases, especially as AI agents become more common. We imagine a future where any given Web3 user will have a large number of AI agents helping them complete tasks.

Scenario #4: Consumer GPU DePIN

There are many decentralized AI computing networks that rely on consumer-grade GPUs instead of data centers. Consumer GPUs are fine for low-end AI inference tasks or for consumer use cases where latency, throughput, and reliability requirements are flexible. But for serious enterprise use cases (which make up the majority of the market that matters), customers need a network with higher reliability than home machines can offer, and often higher-end GPUs for more complex inference tasks. Data centers are better suited for these more valuable customer use cases.

Note that we believe consumer-grade GPUs are suitable for demonstrations, as well as for individuals and startups that can tolerate lower reliability. But these customers are lower value, so we believe DePINs tailored to Web2 enterprises will be more valuable in the long run. As a result, GPU DePIN projects have evolved from relying mainly on consumer-grade hardware in their early days to offering A100/H100-class hardware with cluster-level availability.

Reality — Real Use Cases of Crypto x AI

Now let’s talk about use cases that provide real benefits. These are the real wins, where crypto x AI can add clear value.

Real Benefit #1: Serving Web2 Clients

McKinsey estimates that, across the 63 use cases it analyzed, generative AI could add the equivalent of $2.6 trillion to $4.4 trillion annually; for comparison, the UK's entire GDP in 2021 was $3.1 trillion. This would increase the impact of all artificial intelligence by 15% to 40%. If we also factor in the impact of embedding generative AI into software currently used for tasks beyond those use cases, the estimated impact roughly doubles.

If you do the math on the estimates above, the total market for AI (going beyond generative AI) could reach into the tens of trillions of dollars worldwide. By comparison, the total value of all cryptocurrencies today (Bitcoin and every altcoin combined) is only around $2.7 trillion. So let's face it: the vast majority of customers who need AI in the short term will be Web2 customers; the Web3 customers who actually need AI represent only a small slice of that $2.7 trillion (especially considering that Bitcoin accounts for a large share of that figure, and Bitcoin itself does not need or use AI).

Web3 AI use cases are just getting started, and it is not yet clear how big that market will be. But one thing seems certain: it will only account for a small portion of the Web2 market for the foreseeable future. We still believe Web3 AI has a bright future; it simply means that the most powerful application of Web3 AI today is serving Web2 customers.

Hypothetical examples of Web2 clients that could benefit from Web3 AI include:

  • Building a vertical-specific AI-centric software company from the ground up (e.g. Cedar.ai or Observe.ai)

  • Large enterprises that fine-tune models for their own purposes (e.g. Netflix)

  • Fast-growing AI providers (e.g. Anthropic)

  • Software companies that incorporate AI into existing products (e.g. Canva)

This is a relatively stable customer persona because the customers are generally large and valuable. They are unlikely to go out of business quickly, and they represent a large potential customer base for AI services. Web3 AI services that serve Web2 customers will benefit from this stable customer base.

But why would a Web2 client want to use the Web3 stack? The rest of this post explains the case.

Real Benefit #2: Lower GPU Costs with GPU DePIN

GPU DePINs aggregate underutilized GPU computing power (the most reliable of which comes from data centers) and make it available for AI inference. A simple analogy is "Airbnb for GPUs."

The reason we are excited about GPU DePINs is that, as mentioned above, there is a shortage of NVIDIA chips, and there are wasted GPU cycles today that could be used for AI inference. The hardware owners have already sunk the cost and are not fully utilizing their equipment, so this spare GPU capacity can be offered at a much lower cost than the status quo, since it effectively "finds money" for the hardware owners.

Examples include:

  • AWS machines. If you were to lease an H100 from AWS today, you would have to commit to a one-year lease because supply is constrained. This is wasteful, because you probably will not use that GPU around the clock, every day of the year (see the utilization sketch after this list).

  • Filecoin mining hardware. Filecoin has a large subsidized supply but no real demand. Filecoin never found true product-market fit, so Filecoin miners are at risk of going out of business. These machines are equipped with GPUs and can be repurposed for low-end AI inference tasks.

  • ETH mining hardware. When Ethereum transitioned from PoW to PoS, this quickly freed up a lot of hardware that could be repurposed for AI inference.
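
To illustrate the waste in the AWS example above (the hourly rate is a hypothetical placeholder, not a quoted AWS price), here is the effective cost per utilized GPU-hour at different utilization levels:

```python
# Illustrative arithmetic: a committed lease is paid for every hour, but only the
# utilized hours produce value. The hourly rate below is a hypothetical placeholder.
HOURLY_RATE = 2.00          # assumed $/hour under a one-year commitment
HOURS_PER_YEAR = 24 * 365

for utilization in (1.00, 0.50, 0.25):
    paid = HOURLY_RATE * HOURS_PER_YEAR
    effective = paid / (HOURS_PER_YEAR * utilization)
    print(f"{utilization:>4.0%} utilized -> effective cost ${effective:.2f} per GPU-hour actually used")
```

Reselling the idle hours through a DePIN is what pushes the effective cost back down.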

Note that not all GPU hardware is suitable for AI inference. One obvious reason for this is that older GPUs do not have the amount of GPU memory required for LLMs, although there have been some interesting innovations that can help in this regard. For example, Exabits' technology can load active neurons into GPU memory and inactive neurons into CPU memory. They predict which neurons need to be active/inactive. This allows low-end GPUs to handle AI workloads even if the GPU memory is limited. This effectively makes low-end GPUs more useful for AI inference.
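
A generic sketch of that offloading idea (this is not Exabits' actual system; the layer granularity, the active/inactive split, and all names here are hypothetical) might look like keeping the layers predicted to be "hot" on the GPU while the rest live, and run, in CPU memory:

```python
# Toy sketch of GPU/CPU offloading (illustrative only, not Exabits' implementation):
# predicted-active layers run on the GPU, rarely-used layers stay and run in CPU memory.
import torch
import torch.nn as nn

class OffloadedMLP(nn.Module):
    def __init__(self, num_layers: int = 8, width: int = 1024):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(width, width) for _ in range(num_layers))
        self.gpu = "cuda" if torch.cuda.is_available() else "cpu"

    def forward(self, x: torch.Tensor, active_ids: set) -> torch.Tensor:
        for i, layer in enumerate(self.layers):
            if i in active_ids:
                layer.to(self.gpu)        # pull the predicted-active layer into GPU memory
                x = layer(x.to(self.gpu))
                layer.to("cpu")           # park it again so GPU memory stays small
            else:
                x = layer(x.to("cpu"))    # inactive layers stay and run in CPU memory
        return x

model = OffloadedMLP()
out = model(torch.randn(4, 1024), active_ids={0, 3, 5})  # hypothetical activity prediction
```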

Web3 AI DePINs will need to evolve their products over time and provide enterprise-grade services such as single sign-on, SOC 2 compliance, service-level agreements (SLAs), etc. This is similar to what current cloud service providers provide to Web2 customers.

Real Benefit #3: Censorship-resistant models to prevent OpenAI from self-censoring

There is a lot of discussion about AI censorship. For example, Italy temporarily banned OpenAI's ChatGPT (and later reversed course after OpenAI made compliance improvements). We think this kind of national-level censorship is fundamentally uninteresting, because countries need to adopt AI to stay competitive.

OpenAI also self-censors. For example, OpenAI will not process NSFW content, nor will it make predictions about the next presidential election. We think there are AI use cases that are not only interesting but also represent a huge market, and that OpenAI will not touch for political reasons.

Open source is a great solution here, because a GitHub repository does not answer to shareholders or a board. Venice.ai is one example; it promises to preserve privacy and operate in a censorship-resistant manner. Web3 AI can take this to the next level by powering these open-source software (OSS) models on lower-cost GPU clusters to perform inference. For these reasons, we believe OSS + Web3 is the ideal combination to pave the way for censorship-resistant AI.

Real Benefit #4: Avoid sending personally identifiable information to OpenAI

Large enterprises have privacy concerns about their internal data. It may be difficult for these customers to trust OpenAI as a third party with this data.

In Web3, it may seem even more worrisome (on the surface) for these businesses to have their internal data suddenly appear on a decentralized network. However, there are innovations in privacy-enhancing technologies for AI:

  • Trusted execution environments (TEEs), such as Super Protocol

  • Fully homomorphic encryption (FHE), such as Fhenix.io (a portfolio company of funds managed by Hack VC) or Inco Network (both powered by Zama.ai), and Bagel’s PPML

These technologies are still evolving, and performance is expected to keep improving with upcoming zero-knowledge (ZK) and FHE ASICs. But the long-term goal is to protect enterprise data while fine-tuning models. As these protocols mature, Web3 may become a more attractive venue for privacy-preserving AI computation.

Real Benefit #5: Leverage the latest innovations in open source models

Open source software has been eating into proprietary software's market share for decades. We view LLMs as a form of proprietary software that is ripe for OSS disruption. Notable examples of challengers include Llama, RWKV, and Mistral.ai. This list will undoubtedly grow over time (a more comprehensive list is available at Openrouter.ai). By leveraging Web3 AI (powered by OSS models), people can take advantage of these new innovations.
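
As a minimal sketch of what "leveraging OSS models" looks like in practice (using the Hugging Face `transformers` library; the model id below is just one example from the Mistral family mentioned above, and any open checkpoint works the same way):

```python
# Minimal sketch: local inference with an open-source checkpoint via Hugging Face
# transformers. The model id is an example; swap in any OSS model you prefer.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mistralai/Mistral-7B-Instruct-v0.2",  # example OSS checkpoint
    device_map="auto",                           # place weights on whatever GPU(s) are available
)

result = generator("Explain what a GPU DePIN is in one sentence.", max_new_tokens=60)
print(result[0]["generated_text"])
```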

We believe that, over time, a global open-source developer community combined with cryptocurrency incentives can drive rapid innovation in open-source models and in the agents and frameworks built on top of them. An example of an AI agent protocol is Theoriq, which leverages OSS models to create a composable, interconnected network of AI agents that can be assembled into higher-level AI solutions.

The reason we are confident about this is that, historically, most "developer software" innovations have slowly been surpassed by OSS over time. Microsoft used to be a proprietary software company, and it is now the #1 company contributing to GitHub; there is a reason for that. And if you look at how Databricks, PostgreSQL, MongoDB, and others have disrupted proprietary databases, you see an example of OSS upending an entire industry, so the precedent here is compelling.

However, there is a catch. One tricky issue for open-source large language models (OSS LLMs) is that OpenAI has started signing paid data-licensing agreements with organizations such as Reddit and The New York Times. If this trend continues, OSS LLMs may find it harder to compete because of the financial barrier to acquiring data. Nvidia may further increase its investment in confidential computing as an enabler of secure data sharing. Time will tell how this develops.

Real Benefit #6: Consensus via high-cost random sampling or via ZK proofs

One of the challenges of Web3 AI inference is verification. Given that validators have the opportunity to cheat on their results to earn fees, verifying inferences is an important measure. Note that this cheating has not actually happened yet because AI inference is still in its infancy, but it is inevitable unless measures are taken to curb this behavior.

The standard Web3 approach is to have multiple validators repeat the same operation and compare the results. As mentioned earlier, the outstanding challenge with this is that AI inference is very expensive due to the current shortage of high-end Nvidia chips. Given that Web3 can offer lower-cost inference through underutilized GPU DePINs, redundant computation would severely undercut Web3's value proposition.

A more promising solution is to generate ZK proofs for off-chain AI inference computations. In this case, a succinct ZK proof can be verified to determine whether a model was trained correctly or whether inference was run correctly (known as zkML). Examples include Modulus Labs and ZKonduit. Since ZK operations are computationally intensive, these solutions are still in their infancy in terms of performance. However, we expect the situation to improve as ZK hardware ASICs are released in the near future.

Even more promising is a somewhat "optimistic" sampling-based approach to AI inference. In this model, you verify only a small fraction of the results generated by validators, but set the slashing penalty high enough that a validator caught cheating faces a strong economic disincentive. This way, you avoid most of the redundant computation.
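
The incentive arithmetic behind that approach can be sketched as follows (all figures are hypothetical): cheating must have negative expected value given the audit rate and the stake at risk.

```python
# Sketch of the optimistic-sampling incentive math (hypothetical numbers): only a
# fraction of results is audited, so the slashed stake must be large enough that
# cheating has negative expected value.
def cheating_expected_value(gain_per_cheat: float, audit_rate: float, slash_amount: float) -> float:
    """Expected profit for a validator that fakes a single inference result."""
    return gain_per_cheat - audit_rate * slash_amount

# Example: skipping the real computation saves $0.50 per request, 5% of results are
# audited, and a caught validator loses a $1,000 stake.
ev = cheating_expected_value(gain_per_cheat=0.50, audit_rate=0.05, slash_amount=1_000)
print(f"expected value of cheating: ${ev:.2f}")  # negative, so cheating is irrational
```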

Another promising idea is watermarking and fingerprinting solutions, such as the one proposed by Bagel Network. This is similar to the mechanism Amazon Alexa uses to provide on-device AI model quality assurance for its millions of devices.

Real Benefit #7: Savings via OSS (OpenAI’s Profits)

The next opportunity Web3 brings to AI is cost democratization. So far, we have discussed saving GPU costs through DePINs. But Web3 also offers the opportunity to save on the profit margins of centralized Web2 AI services (such as OpenAI, which has more than $1 billion in annual revenue as of this writing). These savings come from using OSS models instead of proprietary ones, since the model creators are not trying to turn a profit on the model itself.

Many OSS models will remain completely free, which gives customers the best economics. But some OSS models may also try to monetize. Consider that only 4% of all models on Hugging Face are trained by companies with budgets to subsidize them; the remaining 96% are trained by the community. That group (96% of Hugging Face) bears real underlying costs, for both compute and data, so those models will need to be monetized in some way.

There are a number of proposals to monetize open source software models. One of the most interesting is the concept of an “initial model offering,” where the model itself is tokenized, with a portion of the tokens reserved for the team and some of the model’s future revenue flowing to token holders, although there are certainly some legal and regulatory hurdles to this.

Other OSS models will attempt to monetize based on usage. Note that if this becomes a reality, those OSS models may increasingly resemble their Web2 counterparts in how they monetize. But realistically, the market will split in two, with some models remaining completely free.

Real Benefit #8: Decentralized Data Sources

One of the biggest challenges in AI is finding the right data to train models. We mentioned earlier that decentralized AI training has its challenges. But what about using the decentralized web to get data (which can then be used for training elsewhere, even in traditional Web2 venues)?

This is exactly what startups like Grass are doing. Grass is a decentralized network of “data scrapers” who donate their machines’ idle processing power to data feeds that inform the training of AI models. Hypothetically, at scale, this data feed can outperform any one company’s in-house data feed efforts due to the power of a large network of incentivized nodes. This includes not only getting more data, but getting it more frequently so that it’s more relevant and up-to-date. It’s virtually impossible to stop a decentralized army of data scrapers, since they are inherently decentralized and don’t reside within a single IP address. They also have a network that can clean and standardize the data so that it’s useful once it’s been scraped.

Once you have the data, you also need a place to store it on-chain, as well as the LLMs generated using that data.

Note that the role of data in Web3 AI may change in the future. Today, the status quo for LLMs is to pre-train a model on data and refine it over time with more data. But because data on the Internet changes in real time, those models are always slightly out of date, so LLM responses can be slightly inaccurate.

One future direction may be a new paradigm: "real-time" data. The idea is that when an LLM is asked an inference question, data collected from the Internet in real time can be injected into the prompt, so the LLM answers using the latest information. Grass is researching this as well.
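
A generic sketch of that pattern (this is not Grass's actual API; the feed URL and the `call_llm` helper below are hypothetical placeholders) looks like fetching fresh data at query time and prepending it to the prompt:

```python
# Generic "real-time data" sketch: fetch fresh data at query time and inject it
# into the prompt. The feed URL and call_llm() are hypothetical placeholders.
import json
import urllib.request

FEED_URL = "https://example.com/latest-feed.json"  # placeholder real-time data source

def fetch_live_context(url: str, max_items: int = 5) -> str:
    with urllib.request.urlopen(url, timeout=10) as resp:
        items = json.load(resp)[:max_items]
    return "\n".join(item["text"] for item in items)

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call (e.g., an OSS model served on a GPU DePIN).
    return f"[model response to a {len(prompt)}-character prompt]"

def answer_with_live_data(question: str) -> str:
    context = fetch_live_context(FEED_URL)
    prompt = (
        "Use only the real-time context below to answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```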

Special thanks to the following people for their feedback and help on this article: Albert Castellana, Jasper Zhang, Vassilis Tziokas, Bidhan Roy, Rezo, Vincent Weisser, Shashank Yadav, Ali Husain, Nukri Basharuli, Emad Mostaque, David Minarsch, Tommy Shaughnessy, Michael Heinrich, Keccak Wong, Marc Weinstein, Phillip Bonello, Jeff Amico, Ejaaz Ahamadeen, Evan Feng, JW Wang.