Original author | @cebillhsu

Translation | Golem

The progress of AI technologies such as GPT-4, Gemini 1.5, and Microsoft's AI PCs is impressive, but AI development still faces a number of problems. Bill, a Web3 researcher at AppWorks, has studied these problems in depth and identified seven directions in which Crypto can empower AI.

Data Tokenization

Traditional AI training relies mainly on data publicly available on the Internet, or more precisely, data in the public domain. Aside from a few companies that provide open APIs, most data remains untapped. A key direction is enabling more data holders to contribute or license their data for AI training while keeping their privacy protected.

However, the biggest challenge in this space is that data is hard to standardize the way computing power is. Distributed computing power can be quantified by GPU type, but the quantity, quality, and permitted usage of private data are difficult to measure. If distributed computing power is analogous to a fungible ERC-20 token, then a tokenized dataset is closer to a non-fungible ERC-721 token, which makes building liquidity and a functioning market far more difficult.
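The fungible/non-fungible distinction can be made concrete with a toy sketch. The classes and names below are hypothetical, in-memory stand-ins, not a real on-chain implementation: compute credits behave like an ERC-20 balance where only quantities matter, while each dataset must be tracked as a distinct token, here identified by the hash of its contents.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class ComputeCredits:
    # ERC-20-style fungible balance: units are interchangeable,
    # so a single per-account number is enough.
    balances: dict = field(default_factory=dict)

    def transfer(self, sender, receiver, amount):
        assert self.balances.get(sender, 0) >= amount, "insufficient balance"
        self.balances[sender] -= amount
        self.balances[receiver] = self.balances.get(receiver, 0) + amount

@dataclass
class DatasetRegistry:
    # ERC-721-style non-fungible registry: every token is one specific
    # dataset, so ownership is tracked per token, not per quantity.
    owners: dict = field(default_factory=dict)

    def mint(self, owner, dataset_bytes: bytes) -> str:
        token_id = hashlib.sha256(dataset_bytes).hexdigest()
        assert token_id not in self.owners, "dataset already tokenized"
        self.owners[token_id] = owner
        return token_id

    def transfer(self, sender, receiver, token_id):
        assert self.owners.get(token_id) == sender, "not the owner"
        self.owners[token_id] = receiver
```

The asymmetry the article describes falls out directly: any 30 compute credits can fill an order for 30, but a buyer of a dataset token wants that exact dataset, so price discovery must happen token by token.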

Ocean Protocol’s Compute-to-Data feature allows data owners to sell private data while preserving privacy. Vana provides a way for Reddit users to aggregate their data and sell it to companies that train large AI models.

Resource Allocation

Currently, there is a large gap between supply and demand of GPU computing power, and large companies monopolize most GPU resources, which makes it very expensive for small companies to train models. Many teams are working hard to reduce costs by centralizing small-scale, low-utilization GPU resources through decentralized networks, but they still face great challenges in ensuring stable computing power and sufficient bandwidth.

Incentivized RLHF

RLHF (Reinforcement Learning from Human Feedback) is essential for improving large models, but it requires professionals to provide the feedback, and as market competition intensifies, the cost of hiring them keeps rising. One of the biggest expenses in data annotation is having supervisors check the quality of the work. To reduce costs while maintaining high-quality annotations, a staking and slashing system can be used: blockchains have long used economic incentive mechanisms (PoW, PoS) to ensure the quality of work, and a well-designed token economy could likewise cut the cost of RLHF.

For example, Sapien AI has introduced Tag2Earn and partnered with several GameFi guilds; Hivemapper has collected 2 million kilometers of road training data through token incentives; and QuillAudits plans to launch an open-source smart-contract auditing agent that all auditors can jointly train in exchange for rewards.

Verifiability

How can users verify that a computing-power provider actually ran inference with the specified model and requirements? Today, users cannot verify the authenticity and accuracy of AI models and their outputs. This lack of verifiability breeds distrust and can lead to errors, and even real harm, in fields such as finance, medicine, and law.

Using cryptographic verification systems such as ZKPs, optimistic fraud proofs, and TEEs, inference providers can prove that an output was produced by a specific model. The benefits: the model provider keeps the model confidential, the user can verify that it was executed correctly, and verifying a succinct proof in a smart contract sidesteps the blockchain's computing limitations. Running AI directly on the user's device is another possible answer to the performance problem, but no satisfactory solution has appeared so far. Projects building in this area include Ritual, ORA, and Aizel Network.
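The commit-then-verify structure these systems share can be sketched with plain hashes. This is only a structural illustration with hypothetical names: a real deployment would use a ZK proof, fraud proof, or TEE attestation, since a bare hash binding shows which model the provider *claims* to have run but does not by itself prove the computation happened. Here the provider publishes a model commitment once (e.g. on-chain), then binds each response to it so a silently swapped model is detectable.

```python
import hashlib
import json

def commit_model(model_weights: bytes) -> str:
    # Published once, e.g. stored in a smart contract.
    return hashlib.sha256(model_weights).hexdigest()

def attest(model_weights: bytes, prompt: str, output: str) -> dict:
    # Provider returns the output plus a binding over (model, input, output).
    record = {"model": commit_model(model_weights),
              "prompt": prompt, "output": output}
    record["binding"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()).hexdigest()
    return record

def verify(record: dict, expected_commitment: str) -> bool:
    # User checks the response against the published model commitment.
    body = {k: record[k] for k in ("model", "prompt", "output")}
    binding_ok = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest() == record["binding"]
    return binding_ok and record["model"] == expected_commitment
```

Replacing the hash binding with a zero-knowledge proof is what upgrades "the provider claims this model produced this output" into "this model provably produced this output" without revealing the weights.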

Deepfakes

With the advent of generative AI, deepfakes are drawing more and more attention. Deepfake technology is advancing faster than detection technology, making deepfakes ever harder to catch. Digital watermarking standards such as C2PA can help identify them, but they have limitations: once an image has been processed and modified, the public can no longer verify the signature on the original, and verifying authenticity from the processed image alone is very difficult.

Blockchain technology can address the deepfake problem in several ways. Hardware attestation: cameras with tamper-proof chips can embed a cryptographic proof in each original photo to verify its authenticity. Immutability: images and their metadata can be recorded in a timestamped block, preventing tampering and establishing the original source. In addition, wallets can attach cryptographic signatures to published posts to prove authorship, and ZK-based KYC infrastructure can bind wallets to verified identities while protecting user privacy. On the economic side, authors can be penalized for publishing false information, and users rewarded for identifying it.
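The timestamped-registration idea can be sketched as follows. The in-memory append-only log is a hypothetical stand-in for a real chain, and the signature step is elided: at capture time the device hashes the raw image and records (hash, author, timestamp); anyone can later check whether a file matches a registered original.

```python
import hashlib
import time

class ProvenanceLedger:
    """Toy append-only log standing in for an on-chain registry."""

    def __init__(self):
        self._entries = []  # append-only; a real chain makes this immutable

    def register(self, image_bytes: bytes, author: str) -> str:
        # Called by the capture device: record the original's fingerprint.
        digest = hashlib.sha256(image_bytes).hexdigest()
        self._entries.append({"hash": digest, "author": author,
                              "timestamp": time.time()})
        return digest

    def lookup(self, image_bytes: bytes):
        # Anyone can check a file against the registered originals.
        digest = hashlib.sha256(image_bytes).hexdigest()
        return next((e for e in self._entries if e["hash"] == digest), None)
```

Note that this matches only bit-identical originals, so any edit breaks the lookup. That mirrors the C2PA limitation described above, which is why the broader designs pair on-chain records with attested capture hardware and signed edit histories rather than a single hash.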

Numbers Protocol has been working in this space for years; Fox News' verification tool, built on the Polygon blockchain, lets users look up articles and retrieve the relevant data from the chain.

Privacy

When AI models in fields such as finance, healthcare, and law are fed sensitive information, protecting data privacy during use becomes critical. Fully homomorphic encryption (FHE) can process data without decrypting it, preserving privacy when using an LLM. The workflow is as follows:

  1. The user starts inference on the local device and stops after completing the initial layers; these initial layers are not included in the model shared with the server;

  2. The client encrypts the intermediate activations and forwards them to the server;

  3. The server runs its part of the attention computation on the encrypted data and sends the result back to the client;

  4. The client decrypts the result and continues inference locally. In this way, FHE keeps user data private throughout processing.
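The four steps above can be sketched in code. To keep the sketch self-contained, a toy multiplicative blinding factor stands in for real FHE (an actual scheme such as Zama's TFHE is far more involved), and the "server layer" is a simple dot product, which commutes with the blinding just as FHE operations commute with encryption. All numbers and names are illustrative.

```python
import random

def client_encrypt(activations, key):
    # Step 2: blind the intermediate activations before sending.
    return [a * key for a in activations]

def server_compute(encrypted, weights):
    # Step 3: a linear layer commutes with the blinding, so the server
    # can compute on values it cannot read.
    return sum(w * e for w, e in zip(weights, encrypted))

def client_decrypt(result, key):
    # Step 4: remove the blinding; the client continues inference locally.
    return result / key

key = random.uniform(1.0, 1000.0)     # client-side secret
activations = [0.5, -1.2, 3.0]        # output of the local initial layers (step 1)
weights = [2.0, 1.0, -0.5]            # server-side model weights

enc = client_encrypt(activations, key)
out = client_decrypt(server_compute(enc, weights), key)
expected = sum(w * a for w, a in zip(weights, activations))
```

The server learns neither the activations nor the result, yet `out` matches the plaintext computation; real FHE extends this property from one linear map to arbitrary circuits, at a substantial performance cost.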

Zama is building a fully homomorphic encryption (FHE) solution and has recently raised $73 million in funding to support development.

AI Agents

The idea of AI agents is futuristic: what if agents could own assets and trade them? We might move away from using general-purpose large models to assist decision-making and toward assigning tasks to specialized agents.

These agents will collaborate with one another, and just as sound economic relationships improve human collaboration, adding economic relationships between AI agents can improve their efficiency. Blockchain can serve as a testing ground for this concept: Colony, for example, is experimenting with the idea through games, giving AI agents wallets so they can trade with other agents or with real players to achieve specific goals.
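A hypothetical sketch of such an economic relationship: each agent holds a wallet on a shared ledger and pays another agent to perform a task it lacks the skill for, picking the cheapest qualified worker. The ledger and agent logic are toy stand-ins for an on-chain wallet and a real model.

```python
class Ledger:
    """Toy stand-in for on-chain wallets shared by all agents."""

    def __init__(self, balances):
        self.balances = dict(balances)

    def pay(self, sender, receiver, amount):
        assert self.balances[sender] >= amount, "insufficient funds"
        self.balances[sender] -= amount
        self.balances[receiver] += amount

class Agent:
    def __init__(self, name, ledger, skills, fee):
        self.name, self.ledger = name, ledger
        self.skills, self.fee = skills, fee

    def perform(self, task):
        # Placeholder for actually running a specialized model.
        return f"{self.name} completed {task}"

    def delegate(self, task, agents):
        # Hire the cheapest agent advertising the required skill,
        # paying its fee from this agent's own wallet.
        candidates = [a for a in agents if task in a.skills]
        worker = min(candidates, key=lambda a: a.fee)
        self.ledger.pay(self.name, worker.name, worker.fee)
        return worker.perform(task)
```

The fee-based selection is the point: prices let specialized agents compete for work without any central coordinator, which is exactly the efficiency gain the paragraph above attributes to adding economic relationships.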

Conclusion

Most of these problems ultimately relate to open-source AI. To ensure such an important technology is not monopolized by a few companies over the next decade, token economies can rapidly mobilize decentralized computing resources and training datasets, narrowing the resource gap between open-source and closed-source AI. Blockchains can track AI training and inference for better data governance, while cryptography can underpin trust in the AI era, addressing deepfakes and privacy protection.

Related Reading

An overview of the implementation directions and protocols of AI-enabled Crypto