Author: Jolestar
I tinkered with AI Agents last week and attended the ai16z event in Beijing the day before yesterday, to see what AI Agents can actually do now and to think about what they might do in the future.
The current state of AI Agents reminds me of the meme in which a person is hidden inside a vending machine. People already imagine AI Agents as having autonomous consciousness, but in reality there is a developer hidden inside each one. (You will have to picture the scene yourself; I tried to have AI generate the image, but it could not grasp 'hidden'.)
The basic working method of the AI Agent framework
The AI Agent framework currently acts as glue. It connects clients (Twitter, Discord, Telegram, etc.) with various plugins (different blockchains, etc.), provides a basic library (memory storage, session isolation, context generation, etc.), and, behind the scenes, interfaces with the APIs of various AI platforms.
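To make the "glue" role concrete, here is a minimal sketch, not the code of any real framework. The names AgentRuntime, Message, register_plugin and fake_model are hypothetical, and the model is a stub function standing in for a real AI platform API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Message:
    client: str  # label such as "discord" or "telegram"; no real SDK is used
    room: str    # conversation or channel identifier
    user: str
    text: str


class AgentRuntime:
    """Hypothetical glue layer: messages come in from clients, and the
    runtime routes them to plugins based on what the model decides."""

    def __init__(self, model: Callable[[str], str]):
        self.model = model
        self.plugins: Dict[str, Callable[[str], str]] = {}
        # Session isolation: one history per (client, room) pair.
        self.memory: Dict[Tuple[str, str], List[str]] = {}

    def register_plugin(self, name: str, handler: Callable[[str], str]) -> None:
        self.plugins[name] = handler

    def build_context(self, msg: Message) -> str:
        history = self.memory.setdefault((msg.client, msg.room), [])
        history.append(f"{msg.user}: {msg.text}")
        return "\n".join(history[-10:])  # keep only the most recent lines

    def handle(self, msg: Message) -> str:
        decision = self.model(self.build_context(msg))
        # Assumption: the model answers with a plugin name or a plain reply.
        if decision in self.plugins:
            return self.plugins[decision](msg.text)
        return decision


def fake_model(context: str) -> str:
    """Stub standing in for an AI platform API call."""
    last_line = context.splitlines()[-1].lower()
    return "balance" if "balance" in last_line else "Hi there"
```

A real framework would replace fake_model with a call to an AI platform and the lambda-style plugins with blockchain integrations; the routing shape stays the same.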
How the AI Agent framework integrates with applications and business scenarios
Since the AI boom last year, various platforms and tools have emerged, and the key issue is how AI can integrate with applications. Some AI platforms attempt to provide plugins, some build workflow models, and some traditional applications embed AI internally. But the key questions are: 1. Where is the interaction entry point of the application? 2. How does AI integrate with existing business logic?
The interaction entry points that AI platforms offer users are all dialogue boxes resembling chat windows. Clearly, everyone believes that interaction with AI applications should be 'human-like'. The cleverness of the AI Agent lies in plugging directly into all the open IM and social systems, which users accept far more readily than yet another new chat interface.
As for how AI integrates with existing business logic, the solution AI Agents offer is to let developers build AI decision-making into business scenarios. Programming languages require determinism: the condition of an if statement can only be true or false, so code alone cannot handle ambiguous business logic. With AI, a fuzzy judgment can be converted into a precise condition, which then fits naturally into ordinary business code.
Take replying to messages in a group chat. A traditional IM bot has to be triggered by an explicit command in the message, whereas an AI Agent can implement a method such as shouldReplyMessage that takes the conversation context and returns true or false.
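A rough sketch of such a method follows, written in Python as should_reply_message. The prompt wording and the complete parameter (any function that sends a prompt to a model and returns its text) are assumptions for illustration; stub_complete is a stand-in so the example runs without a real model.

```python
from typing import Callable, List

# Hypothetical prompt; a real agent would tune this wording.
PROMPT = """You are a member of a group chat. Given the recent messages,
answer with exactly one word, YES or NO: should the agent reply to the last message?

Recent messages:
{context}
"""


def should_reply_message(context: List[str], complete: Callable[[str], str]) -> bool:
    """Turn the model's free-text answer into a strict boolean."""
    prompt = PROMPT.format(context="\n".join(context))
    answer = complete(prompt).strip().upper()
    # Anything that is not a clear YES is treated as "do not reply".
    return answer.startswith("YES")


def stub_complete(prompt: str) -> str:
    """Stand-in for a model call: says yes only when the agent is mentioned."""
    last_line = prompt.strip().splitlines()[-1]
    return "YES" if "@agent" in last_line else "no"
```

The caller can then write an ordinary `if should_reply_message(history, complete):` branch. Defaulting to False on unexpected output is a design choice here, so that a confused model keeps the bot quiet rather than noisy.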
AI mainly plays two roles in business logic:
1. 'Intent' discovery: Describe the possible intents in the prompt, let AI discover the 'intent' in the user's text message based on context, and map that intent to specific code.
2. Decision support: Use AI to convert vague, complex conditions into definite true/false values or enumeration types, and then incorporate them into business logic.
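The two roles above can be sketched together: the model's text is narrowed to an enumeration, and the enumeration is mapped to code. The Intent values, the handler strings and keyword_model are all hypothetical; keyword_model is a stub in place of a real model call.

```python
from enum import Enum
from typing import Callable, Dict


class Intent(Enum):
    TRANSFER = "transfer"
    CHECK_BALANCE = "check_balance"
    NONE = "none"


def discover_intent(text: str, complete: Callable[[str], str]) -> Intent:
    """Ask the model for a label and coerce it into the enumeration."""
    labels = ", ".join(intent.value for intent in Intent)
    prompt = (
        f"Classify the user's intent as one of: {labels}.\n"
        f"Message: {text}\n"
        "Intent:"
    )
    raw = complete(prompt)
    try:
        return Intent(raw.strip().lower())
    except ValueError:
        # The model answered outside the allowed labels; fall back safely.
        return Intent.NONE


# Intent-to-code mapping: this part stays ordinary deterministic code.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.TRANSFER: lambda text: "starting transfer flow",
    Intent.CHECK_BALANCE: lambda text: "looking up balance",
    Intent.NONE: lambda text: "no action",
}


def dispatch(text: str, complete: Callable[[str], str]) -> str:
    return HANDLERS[discover_intent(text, complete)](text)


def keyword_model(prompt: str) -> str:
    """Stub model: inspects only the user's message line of the prompt."""
    message = prompt.split("Message: ", 1)[1].splitlines()[0].lower()
    if "send" in message:
        return " Transfer \n"
    if "balance" in message:
        return "check_balance"
    return "I am not sure"
```

Everything after discover_intent is plain if-else territory, which is where the developer is still needed.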
Seeing this, many people might be disappointed with AI Agents. Many believe that an AI Agent only needs to be trained and it will then understand everything. In reality, because of the context limitations of large models, it is impossible (at least for now) to build a universal AI that can do anything. The good news is that programmers don't have to worry about unemployment: AI still needs a lot of programmers behind the scenes, and someone has to write the if-else statements. The key difference is that the range of business that programs can handle is expanding.
Two types of AI Agents
At the event, I asked Shaw a question about the market’s two expectations for AI Agents: 1. The AI Agent plays a role, has its own ID and brand, and provides services to users. 2. Users have personal AI Agents that serve as personal assistants to help manage some business. Which of these two types of AI Agents will be more popular? He believes both directions will be good and may even combine.
Currently, the market is mainly exploring the first direction, which amounts to the Agentification of services: in the future there may be no App interface at all, because every App will become a personified AI Agent. The second direction is the Agentification of application clients: future application clients will be plugins for an assistant Agent, and local application data will become part of that Agent's memory storage. The plugin will also be responsible for communicating with cloud service Agents. This is a new application architecture that will change the entire infrastructure.
Requirements of AI Agents for infrastructure
1. The infrastructure must be permissionless to access; otherwise AI Agents will be blocked by all kinds of anti-abuse measures. Services should instead rely on economic costs (Gas) to deter attacks. Platforms that are not open enough will be hit hard, and the enthusiasm for open platforms seen in the early days of Web2 will be reignited.
2. The AI Agent needs to be able to control funds and make payments, so that it can pay those costs.
In other words, future services, whether based on blockchain or not, will need to support identity verification using Crypto's private key model and Crypto-based payments.
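As a toy illustration of using economic cost instead of access control, the sketch below charges a prepaid balance for every call and has no allowlist, CAPTCHA or API-key approval step. MeteredService and its methods are hypothetical names. The sketch simply trusts the address argument; a real service would verify a signature made with the caller's private key, which is left out here.

```python
from typing import Dict


class InsufficientFunds(Exception):
    """Raised when a caller cannot pay for a request."""


class MeteredService:
    """Hypothetical pay-per-call service: anyone may call, every call costs."""

    def __init__(self, price_per_call: int):
        self.price = price_per_call
        self.balances: Dict[str, int] = {}

    def deposit(self, address: str, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balances[address] = self.balances.get(address, 0) + amount

    def call(self, address: str, payload: str) -> str:
        # Assumption: `address` has already been authenticated by signature.
        balance = self.balances.get(address, 0)
        if balance < self.price:
            raise InsufficientFunds(address)
        self.balances[address] = balance - self.price
        return payload.upper()  # placeholder for the real work
```

Flooding such a service is possible but expensive, which is the point: the attacker pays the same price per request as an honest Agent.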
The integration of AI Agents and blockchains
Apart from the two points above, how AI Agents integrate with blockchains is a direction everyone is exploring. At the event, I talked with Mikkke about focEliza, which he is working on. Both types of AI Agents mentioned earlier need an execution or verification environment provided by a blockchain. Once an AI Agent provides services to the outside world, trust issues arise; its role is actually similar to that of a smart contract.
The name 'smart contract' was controversial back in the day: it is just a piece of code, so where is the 'intelligence'? AI can make smart contracts live up to their name. The challenge is how to call AI interfaces from within the smart contract environment. If running large models in a verifiable environment is still a distant goal, an oracle-style solution is the more practical path.
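The oracle-style path can be sketched as a request and callback pair. This is a plain Python simulation, not contract code for any real chain; EscrowContract, request_judgement, fulfill and run_oracle_once are hypothetical names, and the model is any function that maps a question to a boolean.

```python
from typing import Callable, Dict


class EscrowContract:
    """Toy stand-in for an on-chain contract that cannot call AI itself."""

    def __init__(self, oracle_address: str):
        self.oracle_address = oracle_address
        self.pending: Dict[int, str] = {}   # request id -> question
        self.results: Dict[int, bool] = {}  # request id -> AI verdict
        self._next_id = 0

    def request_judgement(self, question: str) -> int:
        """Record a question for the off-chain oracle to pick up."""
        request_id = self._next_id
        self._next_id += 1
        self.pending[request_id] = question
        return request_id

    def fulfill(self, sender: str, request_id: int, answer: bool) -> None:
        """Callback: only the registered oracle may post an answer."""
        if sender != self.oracle_address:
            raise PermissionError("only the oracle may fulfill requests")
        if request_id not in self.pending:
            raise KeyError(request_id)
        del self.pending[request_id]
        self.results[request_id] = answer


def run_oracle_once(contract: EscrowContract, oracle_address: str,
                    model: Callable[[str], bool]) -> int:
    """Off-chain worker: answer every pending request with the model."""
    fulfilled = 0
    for request_id, question in list(contract.pending.items()):
        contract.fulfill(oracle_address, request_id, model(question))
        fulfilled += 1
    return fulfilled
```

The contract only ever sees a definite boolean, so its own logic stays deterministic. The trust question moves to the oracle, which is the part a verifiable execution environment would eventually have to cover.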
There will be many demands arising around AI Agents. How to obtain public knowledge for AI Agents? How do AI Agents determine facts? How do AI Agents identify the same user across different platforms? How is 'memory' stored in smart contracts? If I have multiple devices, each with an AI Agent, how do they share memory?
You will find that on-chain data, on-chain relationships, DID, P2P networks, and the other ideas Web3 has explored take on new meanings and scenarios.
Conclusion
Reusing my conclusion from a 2021 talk about AI and blockchain: an internet more friendly to AI is also an internet more friendly to humanity. Back then, it was just a thought experiment, but now the future has arrived.