Author: jolestar

Last week I experimented with AI Agents, and the day before yesterday I attended an ai16z event in Beijing to see what AI Agents can actually do today and to think about what they might do in the future.

The current state of AI Agents reminds me of the meme in which a person is hidden inside a vending machine. In our imagination, AI Agents have already begun to possess self-awareness, but in reality there is a developer hidden inside each one. (You will have to picture the scene yourself; I tried to get an AI to generate this image, but it could not grasp the idea of 'hidden'.)

How an AI Agent framework basically works

An AI Agent framework currently acts as glue: it binds clients (Twitter, Discord, Telegram, etc.) to various plugins (various blockchains, etc.) and provides a base library (memory storage, session isolation, context generation, etc.) that in turn interfaces with the APIs of various AI platforms.
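
To make the "glue" idea concrete, here is a minimal sketch in Python. Every name in it (Message, AgentRuntime, echo_model, the "name:argument" reply convention for plugins) is a hypothetical of mine and not the API of any real framework; the model is a plain callable standing in for an AI platform API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# All names below are hypothetical; this is not any real framework's API.

@dataclass
class Message:
    client: str      # e.g. "discord" or "telegram"
    session_id: str  # one conversation or room on that client
    user: str
    text: str

@dataclass
class AgentRuntime:
    model: Callable[[str], str]  # stand-in for an AI platform API
    plugins: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    memory: Dict[str, List[str]] = field(default_factory=dict)

    def handle(self, msg: Message) -> str:
        # Session isolation: memory is keyed by client and session.
        key = f"{msg.client}:{msg.session_id}"
        line = f"{msg.user}: {msg.text}"
        # Context generation: this session's history plus the new message.
        context = "\n".join(self.memory.get(key, []) + [line])
        reply = self.model(context)
        # Assumed convention: a reply of the form "name:argument" asks for a plugin.
        name, _, arg = reply.partition(":")
        if name in self.plugins:
            reply = self.plugins[name](arg)
        # Memory storage: remember both sides of the exchange.
        self.memory.setdefault(key, []).extend([line, f"agent: {reply}"])
        return reply

def echo_model(context: str) -> str:
    # Deterministic placeholder; a real agent would call a model API here.
    return f"seen {len(context.splitlines())} line(s)"
```

The client adapters themselves are left out; each one would only need to turn a platform event into a Message and send the returned string back.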

How the AI Agent framework integrates with applications and business scenarios

Since AI became popular last year, all kinds of platforms and tools have emerged, but the key problem to solve is how AI can be combined with applications. Some AI platforms try to provide plugins, some build workflow models, and some traditional applications embed AI inside themselves. The key questions, however, are: 1. Where is the application's interaction entry point? 2. How does AI integrate with existing business logic?

Every AI platform gives users an entry point that looks like a chat window. Clearly, everyone believes that interaction with AI applications should be anthropomorphic. Here the cleverness of AI Agents lies in integrating directly with existing open IM and social systems, which users evidently accept more readily than yet another new interface.

As for how AI integrates with existing business logic, the solution AI Agents offer is to let developers embed AI decision-making in business scenarios. Programming languages require determinism: the condition of an 'if' can only be true or false, so code alone cannot handle ambiguous business logic. With AI, however, vague and complex logic can be converted into precise conditions, which can then be integrated seamlessly into business scenarios.

For example, in a traditional IM bot, replying to messages in a group has to be triggered by explicit message commands, whereas with AI the bot can implement a method called shouldReplyMessage, give it the context, and get back true or false.
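
A minimal Python sketch of such a method follows. It is named should_reply_message to follow Python conventions; the prompt wording, the ask_model parameter, and fake_model are my own assumptions, with fake_model standing in for a real model call so the example runs on its own.

```python
from typing import Callable, List

def should_reply_message(
    history: List[str],
    message: str,
    ask_model: Callable[[str], str],  # hypothetical: sends a prompt, returns text
) -> bool:
    prompt = (
        "You are a bot in a group chat. Given the recent messages, decide "
        "whether you should reply to the last one. Answer only YES or NO.\n\n"
        + "\n".join(history)
        + f"\nLast message: {message}"
    )
    answer = ask_model(prompt).strip().upper()
    # Anything other than a clear YES is treated as "do not reply".
    return answer.startswith("YES")

def fake_model(prompt: str) -> str:
    # Placeholder for a model: says YES only when the last message mentions the bot.
    last = prompt.rsplit("Last message: ", 1)[-1]
    return "YES" if "@bot" in last else "NO"
```

The calling code stays an ordinary `if should_reply_message(...)`: the fuzziness lives inside the model call, and the business logic still sees a plain boolean.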

AI mainly plays two roles in business logic:

1. 'Intent' discovery: through explanatory prompts, let the AI discover the 'intent' in a user's text message based on context, and map that intent to specific code.

2. Assisting decision-making: use AI to convert vague, complex conditions into definite true/false values or enumerated types, then integrate them into business logic.
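
The two roles above can be sketched together in Python: the model's free-form answer is forced into an enumerated type, and each enum value maps to ordinary code. The Intent values, the handler strings, and the ask_model callable are all hypothetical and chosen only for illustration.

```python
from enum import Enum
from typing import Callable, Dict

class Intent(Enum):
    TRANSFER = "TRANSFER"
    CHECK_BALANCE = "CHECK_BALANCE"
    NONE = "NONE"

def discover_intent(text: str, ask_model: Callable[[str], str]) -> Intent:
    options = ", ".join(i.value for i in Intent)
    raw = ask_model(
        f"Classify the user's intent as one of: {options}.\nMessage: {text}"
    )
    try:
        return Intent(raw.strip().upper())
    except ValueError:
        # Unrecognized answers fall back to doing nothing.
        return Intent.NONE

# Each intent maps to specific, deterministic code.
HANDLERS: Dict[Intent, Callable[[str], str]] = {
    Intent.TRANSFER: lambda text: "starting transfer flow",
    Intent.CHECK_BALANCE: lambda text: "looking up balance",
    Intent.NONE: lambda text: "no action",
}

def dispatch(text: str, ask_model: Callable[[str], str]) -> str:
    return HANDLERS[discover_intent(text, ask_model)](text)
```

Falling back to Intent.NONE on an unexpected answer is a design choice on my part: the program never acts on output it cannot map to a known branch.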

Seeing this, many people might be disappointed with AI Agents. Many assume that building an AI Agent is just a matter of teaching the AI, after which it will do everything. In reality, because of the context limitations of large models, it is impossible (at least for now) to create an all-purpose AI that can do anything. The good news is that programmers need not worry about unemployment: AI still needs a large number of programmers hidden behind it, and someone has to pile up the if/else statements. The key difference is that the boundary of business processes that programs can handle is expanding.

Two types of AI Agents

At the event, I asked Shaw a question: the market has two expectations for AI Agents. 1. AI Agents play a role of their own, with their own ID and brand, and provide services to users. 2. Users have personal AI Agents, equivalent to personal assistants, which handle some tasks on their behalf. Which of these two types of AI Agents will be more popular? He believes both directions will do well and may even converge.

At present, the market is mainly exploring the first direction, which amounts to the Agentization of services. In the future there may be no App interface at all; every App will be Agentized and anthropomorphized. The second direction is the Agentization of application clients: future application clients will be plugins of assistant Agents, local application data will become part of the Agent's memory, and the plugin will also be responsible for communicating with cloud service Agents. This is a new application architecture model that will change the entire infrastructure.

Infrastructure requirements of AI Agents

1. Infrastructure must be permissionless; otherwise AI Agents will be restricted by all kinds of anti-attack strategies. Services should instead use economic cost (Gas) to defend against attacks. Platforms that are relatively closed will be hit hard, and the enthusiasm for open platforms seen in the early days of Web2 will be rekindled.

2. AI Agents need to be able to control funds and make payments in order to address the issue above.

In other words, future services, whether blockchain-based or not, will need to support identity verification based on the crypto private-key model and payments in crypto.
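
As a rough illustration of defending a service with economic cost rather than access restrictions, here is a minimal Python sketch. MeteredService, its prepaid balances, and the flat per-call price are my own assumptions; it also assumes the caller's account has already been authenticated (for example by a signature check, which is omitted here).

```python
from dataclasses import dataclass, field
from typing import Dict

class InsufficientFunds(Exception):
    pass

@dataclass
class MeteredService:
    # Hypothetical service: price is in the smallest unit of some token.
    price_per_call: int
    balances: Dict[str, int] = field(default_factory=dict)

    def deposit(self, account: str, amount: int) -> None:
        self.balances[account] = self.balances.get(account, 0) + amount

    def call(self, account: str, request: str) -> str:
        # No CAPTCHA, allow-list, or rate limit: the only gate is the fee.
        if self.balances.get(account, 0) < self.price_per_call:
            raise InsufficientFunds(account)
        self.balances[account] -= self.price_per_call
        return f"handled: {request}"
```

Any caller, human or Agent, is served as long as it pays, so flooding the service costs the attacker in proportion to the load it creates.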

AI Agent and blockchain integration

Beyond the two points above, how AI Agents integrate with blockchain is something everyone is exploring. At the event, I talked with Mikkke about focEliza, which he is working on. Of the two types of AI Agents mentioned earlier, at least the first requires a runtime or verification environment provided by a blockchain, because once an AI Agent provides services to the outside world, trust issues arise, and the role it plays is in fact similar to that of a smart contract.

The name 'smart contract' has long been controversial: it is just a piece of code, so where is the 'intelligence'? AI can make smart contracts live up to their name. The challenge is how to call AI interfaces from within the smart contract environment. While running a large model in a verifiable environment is still a long way off, an Oracle-like solution is the more feasible path.
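
The Oracle-like path can be sketched as a request/fulfill pattern. The Python below only simulates it: OracleContract stands in for on-chain state, run_oracle_once for an off-chain process, and ask_model for the model API. All of these names are hypothetical, and a real design would also need to address how the oracle's answer is trusted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

@dataclass
class OracleContract:
    # Simulated on-chain state; not real contract code.
    oracle: str  # the only identity allowed to post answers
    next_id: int = 0
    pending: Dict[int, str] = field(default_factory=dict)
    results: Dict[int, str] = field(default_factory=dict)

    def request(self, prompt: str) -> int:
        # The contract cannot call the model itself; it only records the request.
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = prompt
        return request_id

    def fulfill(self, sender: str, request_id: int, answer: str) -> None:
        if sender != self.oracle:
            raise PermissionError("only the registered oracle may fulfill")
        if request_id not in self.pending:
            raise KeyError(request_id)
        del self.pending[request_id]
        self.results[request_id] = answer

def run_oracle_once(
    contract: OracleContract,
    oracle_id: str,
    ask_model: Callable[[str], str],
) -> int:
    # Off-chain side: read pending requests, call the model, write answers back.
    handled = 0
    for request_id, prompt in list(contract.pending.items()):
        contract.fulfill(oracle_id, request_id, ask_model(prompt))
        handled += 1
    return handled
```

The contract-side logic stays deterministic: it only stores requests and accepts answers, while the non-deterministic model call happens entirely off-chain.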

Many demands will arise around AI Agents. How can AI Agents obtain public knowledge? How do AI Agents determine facts? How do AI Agents recognize the same user across different platforms? How is 'memory' stored in smart contracts? If I have multiple devices, each with an AI Agent, how do they share memories?

You will find that on-chain data, on-chain relationships, DID, P2P networks, and the other things explored in Web3 all take on new meanings and find new scenarios.

Conclusion

To reuse the conclusion of a talk I gave on AI and blockchain in 2021: an internet that is friendlier to AI is also an internet that is friendlier to humanity. Back then it was just an idea, but now the future has arrived.