Article source: PA Recommended Reading
Author: jolestar
Last week I tinkered with AI Agents, and the day before yesterday I attended an ai16z event in Beijing. I wanted to see what AI Agents can actually do today and to think about what they might do in the future.
The current state of AI Agents reminds me of the meme in which a person is hidden inside a vending machine. People imagine that AI Agents are starting to develop self-awareness, but in reality there is a developer hidden inside each AI Agent. (I will leave you to picture the scene; I tried to get an AI to generate the image, but it could not understand 'hidden'.)
The basic working method of the AI Agent framework
The AI Agent framework currently acts as glue. On one side it binds clients (Twitter, Discord, Telegram, etc.) to various plugins (for different chains, etc.); in the middle it provides a basic library (memory storage, session isolation, context generation, etc.); and on the back end it connects to the interfaces of various AI platforms.
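To make the "glue" role concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the API of any real framework: `Message`, `AgentFramework`, `register_plugin`, and `stub_model` are hypothetical names, and the model is a stub so the example runs without a network call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Message:
    client: str      # which client delivered it, e.g. "discord" or "telegram"
    session_id: str  # one conversation; memory is isolated per session
    text: str


class AgentFramework:
    """Hypothetical glue layer: clients in front, plugins and a model behind."""

    def __init__(self, model: Callable[[str], str]):
        self.model = model  # stands in for an AI platform interface
        self.plugins: Dict[str, Callable[[str], str]] = {}
        self.memory: Dict[str, List[str]] = {}  # session_id -> history

    def register_plugin(self, name: str, handler: Callable[[str], str]) -> None:
        self.plugins[name] = handler

    def build_context(self, msg: Message) -> str:
        # Context generation: this session's history plus the new message.
        history = self.memory.setdefault(msg.session_id, [])
        return "\n".join(history + [msg.text])

    def handle(self, msg: Message) -> str:
        decision = self.model(self.build_context(msg))
        # If the model names a registered plugin, run it; otherwise treat
        # the model output as the reply itself.
        if decision in self.plugins:
            reply = self.plugins[decision](msg.text)
        else:
            reply = decision
        self.memory[msg.session_id].extend([msg.text, reply])
        return reply


def stub_model(context: str) -> str:
    # Assumption: a real model would be called here; this stub only
    # looks for a keyword in the latest line of the context.
    lines = context.splitlines()
    last_line = lines[-1] if lines else ""
    return "balance" if "balance" in last_line.lower() else "Hello!"
```

The point of the sketch is that the client name never changes the logic: a message from any client goes through the same context, model, and plugin path.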
How the AI Agent framework integrates with applications and business scenarios
Since the explosion of AI last year, various platforms and tools have emerged, and the key problem they all need to solve is how AI integrates with applications. Some AI platforms offer plugins, some build workflow models, and some traditional applications embed AI inside themselves. The key questions are: 1. Where is the application's interaction entry point? 2. How does AI integrate with existing business logic?
The interaction entry point that AI platforms give users is a chat-style dialog box. Clearly, everyone agrees that interacting with AI applications should feel 'human-like'. What makes AI Agents smart here is that they connect directly to all the open IM and social systems, which users accept far more readily than a newly invented interface.
As for integrating with existing business logic, the solution AI Agents offer is to let developers build AI decision-making into business scenarios. Programming languages require determinism: the condition of an 'if' can only be true or false, so code alone cannot handle ambiguous business logic. AI can turn fuzzy, complex logic into precise conditions, which then slot seamlessly into business code.
Take replying to messages in a group chat as an example. A traditional IM bot needs a specific command to trigger a response, whereas with AI we can implement a method called shouldReplyMessage, give it the context, and have it return true or false.
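A minimal Python sketch of this idea follows. The method name comes from the text (written in snake_case here); the prompt wording, the `ask_model` parameter, and the `keyword_stub` stand-in for a real model are all assumptions made for illustration.

```python
from typing import Callable, List


def should_reply_message(
    history: List[str],
    message: str,
    ask_model: Callable[[str], str],
) -> bool:
    """Ask a model whether the bot should answer, then reduce it to a bool."""
    prompt = "\n".join([
        "You are a bot in a group chat.",
        "Recent messages:",
        *history,
        f"New message: {message}",
        "Should the bot reply? Answer YES or NO.",
    ])
    answer = ask_model(prompt).strip().upper()
    # Anything that is not a clear YES counts as "do not reply".
    return answer.startswith("YES")


def keyword_stub(prompt: str) -> str:
    # Assumption: stands in for a real model call so the sketch runs offline.
    new_line = next(
        line for line in prompt.splitlines() if line.startswith("New message: ")
    )
    return "YES" if "@bot" in new_line or "?" in new_line else "NO"
```

The calling code only ever sees a boolean, so it can sit inside an ordinary `if` statement; all the ambiguity is confined to the model call.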
AI mainly plays two roles in business logic:
1. 'Intent' discovery: using descriptions in the prompt, let the AI discover the 'intent' in a user's text message based on context, then map that intent to specific code.
2. Assisting decision-making: use AI to convert vague, complex conditions into a definite true/false or an enumerated value, then feed the result into business logic.
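Both roles can be sketched together in a few lines of Python. This is only an illustration: the `Intent` values, the handler strings, and the `intent_stub` model stand-in are hypothetical and chosen for the example.

```python
from enum import Enum
from typing import Callable, Dict


class Intent(Enum):
    TRANSFER = "transfer"
    CHECK_BALANCE = "check_balance"
    UNKNOWN = "unknown"


def discover_intent(text: str, ask_model: Callable[[str], str]) -> Intent:
    """Turn free text into one enumerated value the program can branch on."""
    options = ", ".join(intent.value for intent in Intent)
    prompt = (
        f"Classify the user's intent as one of: {options}.\n"
        f"Message: {text}\n"
        "Answer with the label only."
    )
    label = ask_model(prompt).strip().lower()
    try:
        return Intent(label)
    except ValueError:
        # The model answered outside the allowed labels; fall back safely.
        return Intent.UNKNOWN


# Mapping each intent to specific code (hypothetical handlers).
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.TRANSFER: lambda text: "starting transfer flow",
    Intent.CHECK_BALANCE: lambda text: "looking up balance",
    Intent.UNKNOWN: lambda text: "asking the user to rephrase",
}


def dispatch(text: str, ask_model: Callable[[str], str]) -> str:
    return HANDLERS[discover_intent(text, ask_model)](text)


def intent_stub(prompt: str) -> str:
    # Assumption: stands in for a real model call.
    message = prompt.split("Message: ", 1)[1].lower()
    if "send" in message:
        return "transfer"
    if "balance" in message:
        return "check_balance"
    return "no idea"
```

Validating the model's answer against the enum matters: the rest of the program stays deterministic even when the model returns something unexpected.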
At this point, many readers may feel disappointed with AI Agents. Many assume that an AI Agent needs only a little teaching before it can do everything. In reality, because of the context limitations of large models, it is impossible (at least for now) to build a universal AI that can do anything. The good news is that programmers do not need to worry about unemployment: AI still needs a large number of programmers behind it, and someone still has to pile up the if-else statements. The key difference is that the boundary of business processes that programs can handle is expanding.
Two types of AI Agents
At the event, I asked Shaw a question. The market has two expectations for AI Agents: 1. The AI Agent plays a role, has its own ID and brand, and provides services to users. 2. Each user has a personal AI Agent, akin to a personal assistant, that helps them handle tasks. Which of the two will be more popular? He thinks both directions are promising and that they may even merge.
At present the market is mainly exploring the first direction, which amounts to turning services into AI Agents: in the future there may be no app interface at all, because every app will be agentified and personified. The second direction is the agentification of application clients: a future client becomes a plugin for the assistant Agent, local application data becomes part of the Agent's memory bank, and the plugin also handles communication with the cloud-side service Agent. This is a new application architecture that would change the entire infrastructure.
Requirements of AI Agents for infrastructure
1. Infrastructure must offer permissionless access; otherwise AI Agents will be blocked by the anti-attack measures that services deploy. Services should instead use economic cost (Gas) to deter attacks. Less open platforms will be hit hard, and the early Web2 enthusiasm for open platforms will be rekindled.
2. AI Agents need to be able to control funds and make payments so that they can cover those costs.
In other words, future services, whether based on blockchain or not, will need to support Crypto's private key model for identity verification and Crypto-based payments.
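As a rough sketch of "economic cost instead of anti-attack rules", the Python below charges a flat fee per call and has no allowlist or CAPTCHA. The names (`MeteredService`, `InsufficientFunds`) and the flat-fee model are assumptions made for illustration, and signature verification of the caller's address is deliberately omitted.

```python
from dataclasses import dataclass, field
from typing import Dict


class InsufficientFunds(Exception):
    """Raised when the caller cannot pay the per-call fee."""


@dataclass
class MeteredService:
    fee: int  # cost of one call, in some token unit (hypothetical)
    balances: Dict[str, int] = field(default_factory=dict)

    def deposit(self, address: str, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balances[address] = self.balances.get(address, 0) + amount

    def call(self, address: str, payload: str) -> str:
        # Permissionless: any address may call, human or agent.
        # The only gate is whether it can pay, which makes spam costly.
        # Assumption: a real service would first verify a signature
        # proving that the caller controls `address`.
        balance = self.balances.get(address, 0)
        if balance < self.fee:
            raise InsufficientFunds(f"{address} needs {self.fee}, has {balance}")
        self.balances[address] = balance - self.fee
        return f"processed: {payload}"
```

With a fee of 2 and a deposit of 5, an agent gets two calls and the third is rejected; flooding the service simply drains the attacker's own balance.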
The integration of AI Agents and chains
Beyond the two points above, how AI Agents integrate with chains is a direction everyone is exploring. At the event, I talked with Mikkke about focEliza, which he is working on. Of the two types of AI Agents mentioned earlier, at least the first needs a runtime or verification environment provided by a chain, because once an AI Agent offers services to the outside world it raises trust issues, and its role is essentially the same as that of a smart contract.
Years ago there was a controversy over the name 'smart contract': it is just a piece of code, so where is the 'smart'? AI can make smart contracts live up to their name. The challenge is how to call AI interfaces from within the smart contract environment. If running large models in a verifiable environment still feels far off, an oracle-like solution is the more practical path.
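The oracle-like pattern can be simulated in plain Python. This is a sketch of the general request/fulfill shape only, not any specific chain or oracle product: `OracleContract`, `run_oracle_once`, and the addresses are hypothetical, and the "contract" is an in-memory object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class OracleContract:
    """In-memory stand-in for a contract that cannot call a model itself."""

    oracle_address: str  # the only account allowed to post answers
    next_id: int = 0
    pending: Dict[int, str] = field(default_factory=dict)
    results: Dict[int, str] = field(default_factory=dict)

    def request(self, prompt: str) -> int:
        # On-chain step: record the question and return its id.
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = prompt
        return request_id

    def fulfill(self, sender: str, request_id: int, answer: str) -> None:
        # Callback step: only the registered oracle may write a result.
        if sender != self.oracle_address:
            raise PermissionError("only the registered oracle may fulfill")
        if request_id not in self.pending:
            raise KeyError(request_id)
        del self.pending[request_id]
        self.results[request_id] = answer


def run_oracle_once(
    contract: OracleContract,
    oracle_address: str,
    ask_model: Callable[[str], str],
) -> int:
    """Off-chain step: answer every pending request, return how many."""
    handled = 0
    for request_id, prompt in list(contract.pending.items()):
        contract.fulfill(oracle_address, request_id, ask_model(prompt))
        handled += 1
    return handled
```

The trade-off is that the contract now trusts the oracle's answer rather than verifying the model's computation, which is why this is the practical path rather than the ideal one.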
Moreover, many demands will arise around AI Agents. How can public knowledge for AI Agents be obtained? How can AI Agents determine facts? How can AI Agents recognize the same user across different platforms? How is 'memory' stored within smart contracts? If I have multiple devices, each equipped with an AI Agent, how do they share memory?
You will find that much of what Web3 has built, such as on-chain data, on-chain relationships, DID, and P2P networks, takes on new meaning and finds new scenarios.
Conclusion
I will reuse the conclusion from a talk I gave in 2021 about AI and blockchain: a more internet-friendly AI is also a more human-friendly internet. Back then it was just a thought, but now that future has arrived.