Author: jolestar
Last week I experimented with AI Agents, and the day before yesterday I attended an ai16z event in Beijing. I wanted to see what AI Agents can actually do today and to think about what they might do in the future.
The current state of AI Agents reminds me of the meme where a person is hiding inside a vending machine. We imagine that AI Agents have developed some kind of self-awareness, but in reality what hides inside an AI Agent is a developer. (You can picture the scene yourself; I tried to have an AI generate the image, but it could not understand 'hidden'.)
Basic working method of the AI Agent framework
Today an AI Agent framework acts as glue. On one side it connects to clients (Twitter, Discord, Telegram, etc.); on the other it connects to plugins (for different blockchains). The framework itself provides a basic library (memory storage, session isolation, context generation, etc.) and, underneath, calls the APIs of the various AI platforms.
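The structure described above can be sketched roughly as follows. This is a minimal illustration, not the API of any real framework: `AgentRuntime`, `Message`, `stub_model`, and the "plugin_name:argument" reply convention are all hypothetical, and the model is a stand-in function rather than a real AI platform call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: a model maps a prompt to a completion,
# a plugin maps an argument string to a result string.
Model = Callable[[str], str]
Plugin = Callable[[str], str]


@dataclass
class Message:
    client: str      # e.g. "discord" or "telegram"
    session_id: str  # one room or conversation on that client
    user: str
    text: str


@dataclass
class AgentRuntime:
    model: Model
    plugins: Dict[str, Plugin] = field(default_factory=dict)
    # Memory storage with session isolation: one history per client + session.
    memory: Dict[str, List[str]] = field(default_factory=dict)

    def build_context(self, msg: Message, window: int = 5) -> str:
        """Context generation: recent history of this session plus the new message."""
        key = f"{msg.client}:{msg.session_id}"
        history = self.memory.get(key, [])[-window:]
        return "\n".join(history + [f"{msg.user}: {msg.text}"])

    def handle(self, msg: Message) -> str:
        key = f"{msg.client}:{msg.session_id}"
        reply = self.model(self.build_context(msg))
        # Assumed convention: the model asks for a plugin as "plugin_name:argument".
        name, _, arg = reply.partition(":")
        if name in self.plugins:
            reply = self.plugins[name](arg)
        self.memory.setdefault(key, []).extend(
            [f"{msg.user}: {msg.text}", f"agent: {reply}"]
        )
        return reply


def stub_model(prompt: str) -> str:
    """Stand-in for an AI platform API call."""
    lines = prompt.splitlines()
    if "balance" in lines[-1]:
        return "get_balance:0xabc"
    return f"seen {len(lines)} line(s)"


runtime = AgentRuntime(
    model=stub_model,
    plugins={"get_balance": lambda address: f"balance of {address} is 42"},
)
```

The same runtime can serve messages from any client; only the `Message` adapter differs per platform, which is why the framework works as glue.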
How the AI Agent framework integrates with applications and business scenarios
Since the AI boom last year, all kinds of platforms and tools have emerged, and the key issue is how AI integrates with applications. Some AI platforms offer plugins, others build workflow models, and traditional applications embed AI inside themselves. The critical questions here are: 1. Where is the application's interaction entry point? 2. How does AI integrate with existing business logic?
The interaction entry point that AI platforms give users is a chat-style dialog box, so it is clear that everyone thinks interaction with AI applications should feel human. The smart move by AI Agents is to connect directly to every open IM and social system, which users evidently accept more readily than yet another new entry point.
As for how AI integrates with existing business logic, the solution AI Agents offer is to let developers bring AI decision-making into business scenarios. Programming languages require determinism: the condition in an if statement can only be true or false, which makes ambiguous business logic hard to handle. With AI, fuzzy and complex logic can be turned into precise conditions, which then slot seamlessly into business code.
Take replying to messages in a group chat. A traditional IM bot needs a specific message command to trigger a reply, whereas with AI we can implement a method shouldReplyMessage that, given the context, returns true or false.
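A minimal sketch of such a method, written in Python as `should_reply_message`. The prompt wording, the `Model` type, and `stub_model` are assumptions made for illustration; a real agent would replace `stub_model` with a call to an actual model API.

```python
from typing import Callable, List

Model = Callable[[str], str]  # hypothetical: prompt in, completion out

PROMPT = """You are a member of a group chat named {agent}.
Recent messages:
{history}

Should {agent} reply to the last message? Answer YES or NO."""


def should_reply_message(model: Model, agent: str, history: List[str]) -> bool:
    """Turn a fuzzy judgement into a strict boolean for ordinary if-statements."""
    prompt = PROMPT.format(agent=agent, history="\n".join(history))
    answer = model(prompt).strip().upper()
    # Anything other than a clear YES is treated as "do not reply".
    return answer.startswith("YES")


def stub_model(prompt: str) -> str:
    """Stand-in for a real model: says YES when the last message is a question."""
    history = prompt.split("Recent messages:\n")[1].split("\n\n")[0]
    last_message = history.splitlines()[-1]
    return "YES" if last_message.endswith("?") else "no"


chat = ["alice: lunch?", "bob: sure", "alice: can anyone explain gas fees?"]
if should_reply_message(stub_model, "helper", chat):
    print("agent will reply")
```

The business code only ever sees `True` or `False`; all the ambiguity is confined to the prompt and the parsing of the model's answer.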
AI plays two main roles in business logic:
1. 'Intent' discovery: using descriptions in the prompt, let the AI identify the 'intent' behind a user's text message from its context and map that intent to specific code.
2. Assisting decision-making: the AI converts vague, complex conditions into a definite true/false or enumeration value, which is then fed into the business logic.
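The two roles above can be sketched together: the model's free-text answer is narrowed to an enumeration, and the enumeration is mapped to ordinary code. The `Intent` values, the handlers, and `stub_model` are hypothetical placeholders, not part of any real framework.

```python
from enum import Enum
from typing import Callable, Dict

Model = Callable[[str], str]  # hypothetical: prompt in, completion out


class Intent(Enum):
    TRANSFER = "transfer"
    CHECK_BALANCE = "check_balance"
    NONE = "none"


def discover_intent(model: Model, text: str) -> Intent:
    """Ask the model to classify the message, then coerce the answer to an enum."""
    options = ", ".join(intent.value for intent in Intent)
    prompt = f"Classify the user's intent as one of: {options}.\nMessage: {text}\nIntent:"
    raw = model(prompt).strip().lower()
    try:
        return Intent(raw)
    except ValueError:
        # Unrecognised answers fall back to a safe default.
        return Intent.NONE


# Each intent maps to deterministic code written by a developer.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.TRANSFER: lambda text: "starting transfer flow",
    Intent.CHECK_BALANCE: lambda text: "looking up balance",
    Intent.NONE: lambda text: "no action",
}


def dispatch(model: Model, text: str) -> str:
    return HANDLERS[discover_intent(model, text)](text)


def stub_model(prompt: str) -> str:
    """Stand-in for a real model, keyed on a couple of phrases."""
    message = prompt.split("Message: ")[1].split("\n")[0].lower()
    if "send" in message:
        return "transfer"
    if "how much" in message:
        return "check_balance"
    return "I am not sure"
```

Falling back to `Intent.NONE` on any unexpected answer keeps the rest of the program deterministic even when the model is not.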
Having read this far, many people may be disappointed with AI Agents. Many believe that building an AI Agent is just a matter of teaching the AI, after which it can handle everything. In reality, because of the context limitations of large models, it is impossible (at least for now) to create a universal AI that can do anything. The good news is that programmers need not worry about unemployment: AI still needs a lot of hidden programmers, and someone has to maintain the if-else statements. The key difference is that the range of business processes programs can handle is expanding.
Two types of AI Agents
At the event, I asked Shaw a question. The market has two expectations for AI Agents: 1. An AI Agent acts as a character with its own ID and brand and provides services to users. 2. Each user has a personal AI Agent, the equivalent of a personal assistant, that helps them handle certain tasks. Which of the two will be more popular? He believes both directions will do well and that they may even merge.
For now the market is mainly exploring the first direction. It amounts to turning services into AI Agents: in the future there may be no app interfaces at all, because every app will be an anthropomorphized AI Agent. The second direction turns application clients into agents: a future application client will be a plugin of the assistant Agent, local application data will become part of the Agent's memory, and the plugin will also communicate with the service's Agent in the cloud. This is a new application architecture that will change the entire infrastructure.
Requirements of AI Agents for infrastructure
1. Infrastructure must be permissionless, with no entry barriers. Otherwise AI Agents will be blocked by all kinds of anti-abuse measures. Services should instead defend against attacks through economic cost (Gas). Platforms that are not open enough will be hit hard by this shift, and it will revive the enthusiasm for open platforms that we saw in the early days of Web2.
2. AI Agents need to be able to hold and spend funds, so that they can pay the costs described in the first point.
In other words, future services, whether or not they are built on a blockchain, will need to support identity verification based on crypto private keys and payments in crypto.
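To make the idea of defending with economic cost concrete, here is a toy sketch of a service that charges per call instead of gating who may call it. `MeteredService`, its fields, and the account strings are hypothetical; a real service would also verify a signature made with the caller's private key before debiting the account, which is omitted here because it needs a cryptography library.

```python
from dataclasses import dataclass, field
from typing import Dict


class InsufficientFunds(Exception):
    """Raised when a caller cannot pay for the request."""


@dataclass
class MeteredService:
    price_per_call: int
    # Balances keyed by the caller's account (assumed to be derived from a public key).
    balances: Dict[str, int] = field(default_factory=dict)
    revenue: int = 0

    def deposit(self, account: str, amount: int) -> None:
        self.balances[account] = self.balances.get(account, 0) + amount

    def call(self, account: str, request: str) -> str:
        # No allow-list, CAPTCHA, or rate limit: anyone, human or agent,
        # may call as long as they pay. Spamming costs the spammer money.
        balance = self.balances.get(account, 0)
        if balance < self.price_per_call:
            raise InsufficientFunds(f"{account} cannot pay {self.price_per_call}")
        self.balances[account] = balance - self.price_per_call
        self.revenue += self.price_per_call
        return f"processed: {request}"


service = MeteredService(price_per_call=1)
service.deposit("agent-1", 2)
```

The only thing the service needs to know about a caller is whether the request is paid for, which is what makes it equally usable by people and by AI Agents.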
The combination of AI Agent and blockchain
Beyond the two points above, how AI Agents can integrate with blockchain is a direction still being explored. At the event, I chatted with Mikkke about what he is working on, focEliza. Of the two types of AI Agents mentioned earlier, at least the first needs a runtime or verification environment provided by a blockchain, because once an AI Agent offers services to the outside world, trust issues arise, and its role becomes essentially the same as that of a smart contract.
The name 'smart contract' has long been disputed: it is just a piece of code, so where is the 'smart'? AI can make smart contracts live up to their name. The challenge is how to call AI interfaces from inside the smart contract environment. If running large models in a verifiable environment is still a distant goal, an Oracle-like solution is the more feasible path.
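An Oracle-like flow can be simulated in a few lines. This is only a sketch of the general request/fulfill pattern under simplifying assumptions: `Contract`, `run_oracle`, and the single trusted oracle address are hypothetical, and the "contract" is a plain Python object rather than on-chain code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

Model = Callable[[str], str]  # hypothetical: prompt in, completion out


@dataclass
class Contract:
    """Toy stand-in for a smart contract that cannot call an AI model itself."""
    oracle_address: str
    pending: Dict[int, str] = field(default_factory=dict)  # request id -> prompt
    results: Dict[int, str] = field(default_factory=dict)  # request id -> answer
    next_id: int = 0

    def request_inference(self, prompt: str) -> int:
        # Step 1: the contract records a request; nothing is computed here.
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = prompt
        return request_id

    def fulfill(self, sender: str, request_id: int, answer: str) -> None:
        # Step 3: only the registered oracle may write the answer back.
        if sender != self.oracle_address:
            raise PermissionError("only the registered oracle may fulfill")
        if request_id not in self.pending:
            raise KeyError(f"unknown request {request_id}")
        del self.pending[request_id]
        self.results[request_id] = answer


def run_oracle(contract: Contract, oracle_address: str, model: Model) -> int:
    """Step 2: an off-chain process answers pending requests using the model."""
    fulfilled = 0
    for request_id, prompt in list(contract.pending.items()):
        contract.fulfill(oracle_address, request_id, model(prompt))
        fulfilled += 1
    return fulfilled


contract = Contract(oracle_address="oracle-1")
request_id = contract.request_inference("Is this message spam?")
run_oracle(contract, "oracle-1", lambda prompt: "no")
```

The trust question moves from the contract to the oracle: the contract logic stays verifiable, while the answer is only as trustworthy as whoever is allowed to call `fulfill`.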
Many new needs will emerge around AI Agents. How do AI Agents acquire public knowledge? How do AI Agents determine facts? How do AI Agents recognize the same user across different platforms? How is 'memory' stored in smart contracts? If I have multiple devices, each with its own AI Agent, how do they share memory?
You will find that concepts like 'data on-chain', 'relationships on-chain', 'DID', 'P2P networks', etc., that were explored in Web3 have new meanings and scenarios.
Conclusion
To reuse the conclusion from a talk I gave in 2021 about AI and blockchain: a more AI-friendly internet is also a more human-friendly internet. At that time it was just an idea, but now the future has arrived.