Author: jolestar
Last week I experimented with AI Agents, and the day before yesterday I attended an ai16z event in Beijing, to see what AI Agents can actually do today and to think about what they might do in the future.
The current state of AI Agents reminds me of the meme where a person is hidden inside a vending machine. People already imagine AI Agents as having autonomous consciousness, but in reality there is a developer hidden inside the AI Agent. (You will have to picture it yourself: I tried to get an AI to generate this image, but it could not grasp the idea of 'hidden'.)
The basic working mechanism of the AI Agent framework
The AI Agent framework currently acts as glue: it connects clients (Twitter, Discord, Telegram, etc.) with various plugins (different blockchains, etc.). The framework itself provides a basic library (memory storage, session isolation, context generation, etc.) and, on the back end, connects to the interfaces of the various AI platforms.
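To make the glue role concrete, here is a minimal sketch. It is not the API of any real framework: `AgentRuntime`, `handle_message`, and the convention that the model returns either an action name or free text are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical: a "model" is any callable mapping a prompt to a reply.
Model = Callable[[str], str]
Action = Callable[[str], str]


@dataclass
class AgentRuntime:
    model: Model
    # Session isolation: memory is keyed by (client, room).
    memory: Dict[Tuple[str, str], List[str]] = field(default_factory=dict)
    # Plugins register named actions (for example, a chain transfer).
    actions: Dict[str, Action] = field(default_factory=dict)

    def register_action(self, name: str, handler: Action) -> None:
        self.actions[name] = handler

    def build_context(self, session: Tuple[str, str], limit: int = 5) -> str:
        # Context generation: the last few remembered lines of this session.
        return "\n".join(self.memory.get(session, [])[-limit:])

    def handle_message(self, client: str, room: str, text: str) -> str:
        session = (client, room)
        self.memory.setdefault(session, []).append(f"user: {text}")
        reply = self.model(self.build_context(session))
        # Assumed convention: if the model names a registered action, run it.
        if reply in self.actions:
            reply = self.actions[reply](text)
        self.memory[session].append(f"agent: {reply}")
        return reply
```

Clients such as a Telegram or Discord adapter would call `handle_message`, and plugins would call `register_action`; the model behind `model` can be swapped without touching either side.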
How the AI Agent framework integrates with applications and business scenarios
Since AI became popular last year, various platforms and tools have emerged, all trying to solve one problem: how AI integrates with applications. Some AI platforms provide plugins, some build workflow models, and some traditional applications embed AI within them. The key questions are: 1. Where is the interaction entry point for applications? 2. How does AI integrate with existing business logic?
On the first question, every AI platform gives users a chat-window-style entry point, which shows that everyone believes interaction with AI applications should be 'personified'. AI Agents are clever here: they connect directly to all the open IM and social systems, which users accept far more readily than something entirely new.
On the second question, the solution AI Agents offer is to let developers bring AI decisions into business scenarios. Programming languages require determinism: the condition of an if statement can only be true or false, so it cannot handle ambiguous business logic. AI can turn fuzzy, complex judgments into precise conditions that slot directly into business logic.
Take replying to messages in a group as an example. A traditional IM bot has to be triggered by explicit message commands. With AI, you can implement a method called shouldReplyMessage, give it the context, and have it return true or false.
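A minimal sketch of such a method follows. The name `should_reply_message` is the Python spelling of the shouldReplyMessage idea above; the prompt wording and the `ask_model` callable (standing in for whatever AI platform is used) are assumptions for illustration.

```python
from typing import Callable, List


def should_reply_message(
    history: List[str],
    message: str,
    ask_model: Callable[[str], str],
) -> bool:
    """Ask the model a yes/no question and reduce its answer to a bool."""
    prompt = (
        "You are a bot in a group chat. Given the recent conversation, "
        "decide whether the bot should reply to the latest message.\n"
        "Conversation:\n" + "\n".join(history) + "\n"
        f"Latest message: {message}\n"
        "Answer with exactly YES or NO."
    )
    answer = ask_model(prompt).strip().lower()
    # Anything other than a clear "yes" is treated as "do not reply".
    return answer.startswith("yes")


# Usage with a stand-in model; a real one would call an AI platform.
def fake_model(prompt: str) -> str:
    return "YES" if "@bot" in prompt else "NO"


replied = should_reply_message(["alice: hello"], "@bot what is gas?", fake_model)
ignored = should_reply_message(["alice: hello"], "bob: lunch?", fake_model)
```

The caller only ever sees a boolean, so the result can sit inside an ordinary if statement; defaulting to "no" on an unclear answer is a design choice that keeps the bot quiet when the model misbehaves.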
The role of AI in business logic scenarios is mainly:
1. 'Intent' discovery: the prompt explains the available intents, the AI uses the context to discover the 'intent' in a user's text message, and that intent is mapped to specific code.
2. Decision support: the AI converts vague, complex conditions into a definite true/false or enumerated value, which is then fed into the business logic.
After reading this, many people may feel disappointed, since a common belief is that an AI Agent can do anything once it has been taught a little. In reality, because of the limited context of large models, a universal AI that can do everything is not possible (at least for now). The good news is that programmers need not worry about unemployment: AI Agents still need plenty of programmers behind the scenes, and someone still has to write the if-else statements. The key difference is that the range of business problems programs can handle is expanding.
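The two roles can be sketched together: the model's free-text answer is forced into an enumerated type, and the enum value is mapped to ordinary code. The `Intent` values, the handlers, and `ask_model` are all hypothetical names chosen for this example.

```python
from enum import Enum
from typing import Callable, Dict


class Intent(Enum):
    TRANSFER = "transfer"
    CHECK_BALANCE = "check_balance"
    NONE = "none"


def discover_intent(text: str, ask_model: Callable[[str], str]) -> Intent:
    options = ", ".join(i.value for i in Intent)
    prompt = (
        f"Classify the user's intent as one of: {options}.\n"
        f"Message: {text}\n"
        "Answer with the intent name only."
    )
    raw = ask_model(prompt).strip().lower()
    try:
        return Intent(raw)  # definite, enumerated result
    except ValueError:
        return Intent.NONE  # unknown answers fall back to "no intent"


# Intent -> specific code. These handlers are placeholders.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.TRANSFER: lambda text: "starting transfer flow",
    Intent.CHECK_BALANCE: lambda text: "looking up balance",
}


def dispatch(text: str, ask_model: Callable[[str], str]) -> str:
    handler = HANDLERS.get(discover_intent(text, ask_model))
    return handler(text) if handler else "no action"
```

Everything after `discover_intent` is deterministic code; the AI is only consulted at the one point where the input is ambiguous.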
Two types of AI Agents
At the event, I asked Shaw a question: the market has two expectations for AI Agents: 1. AI Agents play a role, have their own ID, brand, and provide services to users. 2. Users have personal AI Agents, equivalent to personal assistants, that can help users handle some business. Which of these two types of AI Agents would be more popular? He believes both directions will do well and may even combine.
Currently, the market is mainly exploring the first direction, which amounts to turning services into AI Agents: in the future there may be no app interfaces at all, because every app will be a personified AI Agent. The second direction is turning application clients into agents: a future application client will be a plugin for the user's assistant agent, making local application data part of the agent's memory, and the same plugin will be responsible for communicating with the service agents in the cloud. This is a new application architecture that will change the entire infrastructure.
Requirements of AI Agents on infrastructure
1. The infrastructure must allow permissionless access; otherwise, AI Agents will be blocked by all kinds of anti-attack strategies. Services should defend against attacks through economic means (Gas). Platforms that are less open will face significant challenges, and the enthusiasm for open platforms seen in the early days of Web2 will be reignited.
2. AI Agents need to be able to control funds and make payments, so that they can pay the costs described in the first point.
In other words, future services, whether blockchain-based or not, need to support identity verification based on the Crypto private key model, as well as Crypto-based payments.
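As a rough illustration of defending a service economically rather than with blocklists, the sketch below charges every call against a prepaid balance. `MeteredService` and its methods are hypothetical, callers are identified by an opaque address string, and real signature verification with a private key is deliberately omitted.

```python
from typing import Dict


class InsufficientFunds(Exception):
    pass


class MeteredService:
    """Anyone may call the service, as long as the call is paid for."""

    def __init__(self, price_per_call: int) -> None:
        self.price = price_per_call
        self.balances: Dict[str, int] = {}

    def deposit(self, address: str, amount: int) -> None:
        self.balances[address] = self.balances.get(address, 0) + amount

    def call(self, address: str, payload: str) -> str:
        # Assumption: `address` has already been authenticated by a
        # signature check, which this sketch does not implement.
        balance = self.balances.get(address, 0)
        if balance < self.price:
            raise InsufficientFunds(address)
        self.balances[address] = balance - self.price
        return f"processed: {payload}"
```

There is no captcha, IP filter, or account review here: a human and an AI Agent are treated the same, and flooding the service simply costs the caller money.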
The combination of AI Agents and blockchains
Beyond the two points above, how AI integrates with blockchains is a direction everyone is exploring. At the event, I chatted with Mikkke about focEliza, which he is working on. Of the two types of AI Agents mentioned earlier, at least the first type needs an execution or verification environment provided by a blockchain, because once an AI Agent offers services to the outside world, trust becomes an issue; its role is essentially the same as that of a smart contract.
The name 'smart contract' was controversial from the start: it is just a piece of code, so where is the 'intelligence'? AI can make smart contracts live up to their name. The challenge is how to call AI interfaces from within the smart contract environment. Running large models in a verifiable environment is still a distant goal, so for now an Oracle-like solution is the more practical path.
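The Oracle-like path can be sketched as a request and callback pattern, simulated here in plain Python rather than on a real chain. `Contract`, `request_ai`, `fulfill`, and `run_oracle` are hypothetical names, and the sketch leaves out what a real design would need most: authenticating the oracle and verifying its answer.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Contract:
    pending: Dict[int, str] = field(default_factory=dict)  # request id -> prompt
    results: Dict[int, str] = field(default_factory=dict)  # request id -> answer
    next_id: int = 0

    def request_ai(self, prompt: str) -> int:
        # On-chain step: record the request instead of calling the model.
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = prompt
        return request_id

    def fulfill(self, request_id: int, answer: str) -> None:
        # Callback invoked by the oracle once the answer is available.
        if request_id not in self.pending:
            raise KeyError(f"unknown or already fulfilled request {request_id}")
        del self.pending[request_id]
        self.results[request_id] = answer


def run_oracle(contract: Contract, ask_model: Callable[[str], str]) -> None:
    # Off-chain step: read pending requests, call the AI, post results back.
    for request_id, prompt in list(contract.pending.items()):
        contract.fulfill(request_id, ask_model(prompt))
```

The contract itself stays deterministic; the non-deterministic model call happens outside and comes back as ordinary input data.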
Moreover, many demands will arise around AI Agents. How can we obtain common knowledge for AI Agents? How do AI Agents determine facts? How do AI Agents recognize the same user on different platforms? How is 'memory' stored in smart contracts? If I have multiple devices, each equipped with an AI Agent, how do they share memories?
You will find that ideas explored in Web3, such as on-chain data, on-chain relationships, DID, and P2P networks, now take on new meanings and find new scenarios.
Conclusion
To reuse the conclusion of a talk I gave in 2021 on AI and blockchain: a more AI-friendly internet is also a more human-friendly internet. Back then it was just a wild idea, but now the future has arrived.