Frameworks that simplify Agent construction and offer combinations of more complex functions will hold an advantage going forward, and could form a Web3 creative economy more interesting than the GPT Store.

  • Author: YBB Capital Researcher Zeke

Preface

In previous articles, we have repeatedly discussed the current state of AI Memes and the future development of AI Agents. Even so, the speed at which the AI Agent track's narrative develops and changes is hard to keep up with. In the two short months since 'Truth Terminal' kicked off Agent Summer, the narrative combining AI and Crypto has shifted almost weekly. Recently, the market's attention has turned back to 'framework-type' projects driven by technical narratives, and in just the past few weeks this niche has produced several dark horses with market capitalizations in the hundreds of millions or even above a billion. These projects have also given rise to a new asset issuance paradigm: a project issues a token based on its GitHub repository, and Agents built on the framework can issue tokens of their own. The framework sits at the bottom and Agents sit on top of it. It looks like an asset issuance platform, but it is in fact an infrastructure model unique to the AI era. How should we view this new trend? This article starts with an introduction to frameworks and then adds personal thoughts on what AI frameworks mean for Crypto.

One, What is a Framework?

By definition, an AI framework is a low-level development tool or platform that integrates a set of pre-built modules, libraries, and tools to simplify the construction of complex AI models. These frameworks typically also include functionality for data processing, model training, and prediction. In short, you can think of a framework as the operating system of the AI era, like Windows or Linux on the desktop, or iOS and Android on mobile. Each framework has its own advantages and disadvantages, and developers are free to choose based on their specific needs.

Although 'AI framework' is still an emerging concept in Crypto, AI frameworks have been developing for nearly 14 years, counting from the birth of Theano in 2010. In traditional AI circles, both academia and industry already have very mature frameworks to choose from, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate, each with its own strengths in different scenarios.

The framework projects emerging in Crypto grew out of the strong demand for Agents created by the current AI boom. They then spread to other tracks in Crypto, ultimately forming AI frameworks for different niche fields. Let's take a few mainstream frameworks in the current circle as examples.

1.1 Eliza

Take ai16z's Eliza as an example. It is a multi-Agent simulation framework designed for creating, deploying, and managing autonomous AI Agents. It is written in TypeScript, which gives it good compatibility and makes API integration easier.

According to the official documentation, Eliza primarily targets social media scenarios and supports multi-platform integration. The framework offers full Discord integration including voice channels, automated accounts on X/Twitter, Telegram integration, and direct API access. For media content, it supports reading and analyzing PDF files, extracting and summarizing link content, audio transcription, video content processing, image analysis and description, and dialogue summarization.

Eliza currently supports four main categories of use cases:

  1. AI Assistant-type Applications: Customer support agents, community managers, personal assistants.

  2. Social Media Roles: Automatic content creators, interactive robots, brand representatives.

  3. Knowledge Workers: Research assistants, content analysts, document processors.

  4. Interactive Roles: Role-playing characters, educational counselors, entertainment robots.

Models currently supported by Eliza:

  1. Open-source model local inference: For example, Llama3, Qwen1.5, BERT.

  2. Using OpenAI's API for cloud inference.

  3. Default configuration is Nous Hermes Llama 3.1B.

  4. Integration with Claude for complex queries.
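
The model options listed above imply some kind of routing layer that sends everyday prompts to a default model and escalates harder ones to a stronger model. The following is a minimal Python sketch of that idea only. Eliza itself is written in TypeScript, and `ModelRouter`, the backend names, and the word-count threshold are all hypothetical illustrations, not Eliza's actual API or routing logic.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical backend type: any callable that maps a prompt to a reply.
Backend = Callable[[str], str]


@dataclass
class ModelRouter:
    backends: Dict[str, Backend]
    default: str = "local"            # assumed name for the default model
    complex_backend: str = "claude"   # assumed name for the stronger model
    complexity_threshold: int = 40    # word count; an arbitrary illustrative cutoff

    def pick(self, prompt: str) -> str:
        """Return the name of the backend that should handle the prompt."""
        is_complex = len(prompt.split()) > self.complexity_threshold
        if is_complex and self.complex_backend in self.backends:
            return self.complex_backend
        return self.default

    def complete(self, prompt: str) -> str:
        """Send the prompt to whichever backend pick() selects."""
        return self.backends[self.pick(prompt)](prompt)


# Stub backends stand in for real local or cloud inference calls.
router = ModelRouter(backends={
    "local": lambda p: f"[local] {p[:20]}",
    "openai": lambda p: f"[openai] {p[:20]}",
    "claude": lambda p: f"[claude] {p[:20]}",
})
```

In a real deployment the complexity test would likely be something richer than a word count; the point here is only that the Agent code calls one `complete` method and never needs to know which model answered.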

1.2 G.A.M.E

G.A.M.E (Generative Autonomous Multimodal Entities Framework) is a multimodal AI framework for the automated generation and management of Agents, launched by Virtuals and designed primarily for intelligent NPCs in games. A distinctive feature is that low-code or even no-code users can take part: judging from its trial interface, users only need to adjust parameters to design an Agent.

In terms of project architecture, G.A.M.E is modular at its core, with multiple subsystems working together, as detailed in the diagram below.

1. Agent Prompting Interface: The interface through which developers interact with the AI framework. Through this interface, developers can initialize a session and specify parameters such as session ID, agent ID, user ID, etc.

2. Perception Subsystem: The perception subsystem is responsible for receiving input messages, synthesizing them, and sending them to the strategic planning engine. It also handles responses from the dialogue processing module.

3. Strategic Planning Engine: The strategic planning engine is the core part of the entire architecture, divided into a High Level Planner and a Low Level Policy. The High Level Planner is responsible for setting long-term goals and plans, while the Low Level Policy translates these plans into specific action steps.

4. World Context: The world context contains data such as environmental information, world state, and game state, which helps the Agent understand the current situation.

5. Dialogue Processing Module: The dialogue processing module is responsible for handling messages and responses, producing dialogues or reactions as output.

6. On Chain Wallet Operator: The on-chain wallet operator presumably handles blockchain-related operations, but its specific functions are unclear.

7. Learning Module: The learning module learns from feedback and updates the Agent's knowledge base.

8. Working Memory: The working memory stores the Agent's recent actions, results, and current plans, among other short-term information.

9. Long Term Memory Processor: The long-term memory processor is responsible for extracting important information about the Agent and its working memory and ranking it by factors such as importance, recency, and relevance.

10. Agent Repository: The agent repository stores attributes such as the agent's goals, reflections, experiences, and personality.

11. Action Planner: The action planner generates specific action plans based on the low-level policy.

12. Plan Executor: The plan executor is responsible for executing the action plans generated by the action planner.
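
The ranking described for the Long Term Memory Processor (importance, recency, relevance) can be sketched roughly as follows. This is an illustration under stated assumptions, not G.A.M.E's implementation: the `Memory` fields, the exponential half-life for recency, the tag-overlap measure for relevance, and the equal weighting of the three factors are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Memory:
    text: str
    importance: float  # 0..1, assumed to be assigned when the memory is stored
    age_hours: float   # time elapsed since the memory was recorded
    tags: Set[str]     # crude stand-in for semantic content


def score(m: Memory, query_tags: Set[str], half_life_hours: float = 24.0) -> float:
    """Average of importance, recency, and relevance (equal weights assumed)."""
    recency = 0.5 ** (m.age_hours / half_life_hours)  # halves every half-life
    union = m.tags | query_tags
    relevance = len(m.tags & query_tags) / len(union) if union else 0.0
    return (m.importance + recency + relevance) / 3.0


def top_memories(memories: List[Memory], query_tags: Set[str], k: int = 2) -> List[Memory]:
    """Return the k highest-scoring memories for the current situation."""
    return sorted(memories, key=lambda m: score(m, query_tags), reverse=True)[:k]


memories = [
    Memory("Player gave the NPC a sword", 0.9, 2.0, {"player", "gift"}),
    Memory("It rained yesterday", 0.1, 24.0, {"weather"}),
    Memory("Player insulted the NPC", 0.7, 48.0, {"player", "conflict"}),
]
ranked = top_memories(memories, {"player"}, k=2)
```

With these sample values, the two player-related memories outrank the weather memory, and the recent, important gift ranks first.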

Workflow: Developers activate the Agent through the Agent prompting interface, and the perception subsystem receives the input and passes it to the strategic planning engine. The strategic planning engine draws on the memory system, the world context, and the information in the agent repository to formulate and execute action plans. The learning module continuously monitors the results of the Agent's actions and adjusts its behavior accordingly.

Application Scenarios: Judging from the overall technical architecture, this framework focuses mainly on the decision-making, feedback, perception, and personality of Agents in virtual environments. Beyond gaming, it is also applicable to the Metaverse. A large number of projects have already adopted it, as can be seen in the list of projects under Virtuals below.
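
The workflow just described can be sketched as a single decision step. The Python below is a toy illustration only: every function name, the rule-based planner, and the hard-coded action table are hypothetical stand-ins for the subsystems named above, not G.A.M.E's actual interfaces.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentState:
    goal: str = ""
    working_memory: List[str] = field(default_factory=list)  # recent actions
    knowledge: Dict[str, int] = field(default_factory=dict)  # action -> times used


def perceive(raw_message: str) -> str:
    """Perception subsystem: normalise the incoming message."""
    return raw_message.strip().lower()


def high_level_plan(observation: str, world: Dict[str, str]) -> str:
    """High Level Planner: pick a goal from the input and the world context."""
    if "attack" in observation or world.get("threat") == "high":
        return "defend village"
    return "greet visitor"


def low_level_policy(goal: str) -> List[str]:
    """Low Level Policy: translate the goal into concrete action steps."""
    steps = {
        "defend village": ["raise alarm", "draw weapon"],
        "greet visitor": ["wave", "say hello"],
    }
    return steps[goal]


def execute(state: AgentState, actions: List[str]) -> List[str]:
    """Plan executor: run each action and record it in working memory."""
    results = []
    for action in actions:
        results.append(f"done: {action}")
        state.working_memory.append(action)
    return results


def learn(state: AgentState, actions: List[str]) -> None:
    """Learning module: update the knowledge base from what was executed."""
    for action in actions:
        state.knowledge[action] = state.knowledge.get(action, 0) + 1


def step(state: AgentState, raw_message: str, world: Dict[str, str]) -> List[str]:
    """One pass through perception, planning, execution, and learning."""
    observation = perceive(raw_message)
    state.goal = high_level_plan(observation, world)
    actions = low_level_policy(state.goal)
    results = execute(state, actions)
    learn(state, actions)
    return results
```

Calling `step(AgentState(), "Bandits attack the gate", {"threat": "low"})` walks the message through each stage in order. A production system would replace the rule tables with model calls, but the division of labor between the planner and the policy stays the same.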

1.3 Rig

Rig is an open-source tool written in Rust, designed to simplify the development of applications built on large language models (LLMs). It provides a unified interface that lets developers interact easily with multiple LLM providers (such as OpenAI and Anthropic) and with various vector databases (such as MongoDB and Neo4j).

Core Features:

  • Unified Interface: Regardless of which LLM provider or type of vector storage, Rig can provide a consistent access method, greatly reducing the complexity of integration work.

  • Modular Architecture: The framework's internals follow a modular design built around key components such as the 'Provider Abstraction Layer', 'Vector Storage Interface', and 'Intelligent Agent System', which keeps the system flexible and scalable.

  • Type Safety: Uses Rust to implement type-safe embedding operations, ensuring code quality and runtime safety.

  • Efficient Performance: Supports asynchronous programming modes, optimizing concurrent processing capabilities. Built-in logging and monitoring functions help with maintenance and troubleshooting.

Workflow: When a user request enters the Rig system, it first passes through the 'Provider Abstraction Layer', which smooths over the differences between providers and keeps error handling consistent. Next, in the core layer, the intelligent agent can call various tools or query vector storage to obtain the information it needs. Finally, through mechanisms such as retrieval-augmented generation (RAG), the system combines document retrieval with contextual understanding to produce accurate and meaningful responses, which are returned to the user.

Application Scenarios: Rig is suited to building question-answering systems that need fast, accurate responses, efficient document search tools, and context-aware chatbots or virtual assistants. It can also support content creation by automatically generating text or other content based on existing data patterns.
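
To make the three stages of this workflow concrete, here is a small Python sketch of the pattern: a provider wrapper that normalizes errors, a toy vector store, and a RAG step that joins them. Rig itself is a Rust library and this is not its API. `Provider`, `ProviderError`, `VectorStore`, and `rag_answer` are hypothetical names, and the bag-of-words embedding is a deliberately crude substitute for a real embedding model.

```python
import math
from typing import Callable, Dict, List, Tuple


class ProviderError(Exception):
    """Single error type that hides provider-specific failures."""


class Provider:
    """Hypothetical provider abstraction: wraps any prompt -> text callable."""

    def __init__(self, name: str, call: Callable[[str], str]):
        self.name = name
        self._call = call

    def complete(self, prompt: str) -> str:
        try:
            return self._call(prompt)
        except Exception as exc:  # normalise every failure into one type
            raise ProviderError(f"{self.name}: {exc}") from exc


def embed(text: str) -> Dict[str, float]:
    """Toy bag-of-words embedding; a real system would call an embedding model."""
    vec: Dict[str, float] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0.0) + 1.0
    return vec


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._docs: List[Tuple[str, Dict[str, float]]] = []

    def add(self, text: str) -> None:
        self._docs.append((text, embed(text)))

    def top(self, query: str, k: int = 1) -> List[str]:
        q = embed(query)
        ranked = sorted(self._docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def rag_answer(provider: Provider, store: VectorStore, question: str, k: int = 1) -> str:
    """Retrieve the k most similar documents and pass them to the model as context."""
    context = "\n".join(store.top(question, k))
    return provider.complete(f"Context:\n{context}\n\nQuestion: {question}")


store = VectorStore()
store.add("Rig is written in Rust")
store.add("ZerePy is written in Python")
echo = Provider("echo", lambda prompt: prompt)  # stub model that returns its prompt
answer = rag_answer(echo, store, "What language is Rig written in")
```

Because the stub provider simply echoes its prompt, `answer` shows exactly which document the retrieval step selected. Swapping in a different provider or store would not change `rag_answer`, which is the benefit the unified interface is meant to deliver.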

1.4 ZerePy

ZerePy is an open-source framework based on Python designed to simplify the deployment and management of AI Agents on the X (formerly Twitter) platform. It evolved from the Zerebro project, inheriting its core functionalities but designed in a more modular and easily expandable way. Its goal is to enable developers to easily create personalized AI Agents and implement various automated tasks and content creation on X.

ZerePy provides a command-line interface (CLI), making it convenient for users to manage and control their deployed AI Agents. Its core architecture is based on a modular design, allowing developers to flexibly integrate different functional modules, such as:

  • LLM Integration: ZerePy supports large language models (LLMs) from OpenAI and Anthropic, so developers can choose the model that best fits their application scenario and have Agents generate high-quality text content.

  • X platform integration: The framework directly integrates X platform's API, allowing Agents to perform operations such as posting, replying, liking, and retweeting.

  • Modular Connection System: This system allows developers to easily add support for other social platforms or services, extending the framework's functionality.

  • Memory System (Future Planning): Although the current version may not have fully implemented this, ZerePy's design goal includes integrating a memory system that allows Agents to remember previous interactions and contextual information, thus generating more coherent and personalized content.
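
A modular connection system of the kind described above is often built as a small plug-in interface plus a registry. The sketch below shows that general pattern in Python. It is not ZerePy's actual code: `Connection`, `FakeXConnection`, and `Agent` are hypothetical names, and the fake connection only records actions in a list instead of calling the X API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class Connection(ABC):
    """Hypothetical plug-in interface: one subclass per platform or service."""

    name: str

    @abstractmethod
    def actions(self) -> List[str]:
        """List the action names this connection supports."""

    @abstractmethod
    def perform(self, action: str, **params: str) -> str:
        """Carry out one action and return a short result string."""


class FakeXConnection(Connection):
    """Stand-in for an X integration; it records actions locally."""

    name = "x"

    def __init__(self) -> None:
        self.timeline: List[str] = []

    def actions(self) -> List[str]:
        return ["post", "reply", "like", "retweet"]

    def perform(self, action: str, **params: str) -> str:
        if action not in self.actions():
            raise ValueError(f"unsupported action: {action}")
        entry = f"{action}: {params.get('text', params.get('post_id', ''))}"
        self.timeline.append(entry)
        return entry


class Agent:
    """Holds a registry of connections and dispatches actions to them."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.connections: Dict[str, Connection] = {}

    def register(self, connection: Connection) -> None:
        self.connections[connection.name] = connection

    def run(self, connection: str, action: str, **params: str) -> str:
        return self.connections[connection].perform(action, **params)


agent = Agent("demo")
agent.register(FakeXConnection())
result = agent.run("x", "post", text="gm")
```

Adding another platform under this design means writing one more `Connection` subclass and registering it, without touching the `Agent` class.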

Although both ZerePy and ai16z's Eliza project are committed to building and managing AI Agents, the two differ slightly in architecture and goals. Eliza focuses more on multi-agent simulations and broader AI research, while ZerePy emphasizes simplifying the deployment of AI Agents on a specific social platform (X), leaning more towards practical applications.

Two, A Replica of the BTC Ecosystem

In terms of development path, AI Agents have a lot in common with the BTC ecosystem of late 2023 and early 2024. The BTC ecosystem's path can be summarized simply as: BRC20, then multi-protocol competition among Atomicals, Runes, and others, then BTC L2, then BTCFi centered on Babylon. AI Agents are developing faster because they build on a mature traditional AI technology stack, but the overall path is indeed similar. I summarize it as: GOAT/ACT, then social-type and analytical AI Agents, then framework competition. Judging from the trend, infrastructure projects focused on decentralizing and securing Agents will probably pick up the momentum of this framework wave and become the main theme of the next stage.

So, will this track end up homogenized and bubble-ridden like the BTC ecosystem? I think not. First, the narrative of AI Agents is not about replaying the history of smart contract chains. Second, the existing AI framework projects, whether they have real technical strength, are stuck at the slide-deck (PPT) stage, or are simply copy-and-paste jobs, at least offer a new way of thinking about infrastructure development. Many articles compare AI frameworks to asset issuance platforms and Agents to assets. Compared with Memecoin launchpads and inscription protocols, however, I personally feel that AI frameworks are more like the public chains of the future, and Agents more like the Dapps of the future.

Currently, Crypto has thousands of public chains and tens of thousands of Dapps. Among general-purpose chains, we have BTC, Ethereum, and various heterogeneous chains, while application chains come in more diverse forms, such as gaming chains, storage chains, and Dex chains. Public chains map onto AI frameworks, and the two are in fact very similar, while Dapps map well onto Agents.

Crypto in the AI era is highly likely to move toward this form. Future debates will shift from EVM versus heterogeneous chains to disputes between frameworks. The more pressing question today is how to decentralize Agents, or put them on-chain. I think subsequent AI infrastructure projects will build on this foundation. The other question is: what is the point of doing this on a blockchain?

Three, The Significance of On-Chain?

Whatever blockchain is combined with, it ultimately faces one question: is it meaningful? In an article last year, I criticized GameFi for getting its priorities backwards and Infra for developing far ahead of demand. In earlier articles about AI, I also expressed skepticism about combining AI and Crypto in practical fields at this stage. After all, narrative has been losing its power to drive traditional projects, and the few traditional projects whose tokens performed well last year basically had fundamentals that matched or exceeded their token prices. What can AI do for Crypto? What I previously had in mind were ideas such as Agents acting on a user's behalf to fulfill intents, the Metaverse, and Agents as employees, which are fairly mundane but in real demand. However, these needs do not require a fully on-chain solution, and in terms of business logic they cannot form a closed loop. The Agent browser mentioned in the previous issue can fulfill intents but still lacks tight integration, and computing power remains dominated by centralized providers.

Looking back at how DeFi succeeded: DeFi was able to take a share from traditional finance because it offers higher accessibility, better efficiency, lower costs, and security that does not depend on trusting a centralized party. Following this line of thought, I believe there are several reasons that could justify putting Agents on-chain.

1. Lowering Costs: Can putting Agents on-chain lower usage costs and thereby bring greater accessibility and choice? Ultimately, this would let ordinary users share in the AI 'rental rights' that are currently exclusive to Web2 giants.

2. Security: By the simplest definition, an AI that can be called an Agent should be able to interact with the virtual or real world. If an Agent can act in the real world or operate my virtual wallet, then a blockchain-based security solution becomes a necessity.

3. Blockchain-specific financial play: Can Agents give rise to financial mechanisms that are unique to blockchain? One example is something like LPs in AMMs, which let ordinary people take part in automated market making: Agents need computing power, data labeling, and so on, and users who are optimistic about an Agent could invest in the protocol with stablecoins (U). Agents in different application scenarios could also form new financial mechanisms.

4. DeFi Interoperability: DeFi currently lacks perfect interoperability. If on-chain Agents can achieve transparent and traceable reasoning, they might be more attractive than the traditional internet giants' Agent browsers mentioned in the previous article.
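
For readers unfamiliar with the AMM comparison in point 3, the arithmetic behind LP participation can be sketched in a few lines. The article does not specify an AMM design, so the constant-product rule (x * y = k), the 0.3% fee, and the function names below are assumptions chosen for illustration.

```python
def swap_out(reserve_in: float, reserve_out: float, amount_in: float, fee: float = 0.003) -> float:
    """Output amount for a swap against a constant-product pool (x * y = k)."""
    amount_in_after_fee = amount_in * (1 - fee)
    return reserve_out * amount_in_after_fee / (reserve_in + amount_in_after_fee)


def lp_share(deposit: float, reserve_before: float) -> float:
    """Fraction of the pool owned after a deposit (one-sided view, for intuition only)."""
    return deposit / (reserve_before + deposit)
```

An Agent-funding pool would differ in what the two sides of the pool represent, but the core appeal is the same: anyone can deposit, own a pro-rata share, and earn from the fees that usage generates.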

Four, Creativity?

Framework-type projects will in future also offer an entrepreneurial opportunity similar to the GPT Store. Launching an Agent through a framework is still quite complex for ordinary users, but I believe that frameworks which simplify Agent construction and offer combinations of more complex functions will hold an advantage going forward, forming a Web3 creative economy more interesting than the GPT Store.

The current GPT Store still leans toward practical uses in traditional fields, and most popular apps are built by traditional Web2 companies, with most of the revenue monopolized by the creators. According to OpenAI's official explanation, this strategy only provides funding support, in the form of a certain amount of subsidies, to outstanding developers in parts of the United States.

From the perspective of demand, Web3 still has many gaps to fill, and its economic system can make the unfair policies of Web2 giants fairer. In addition, we can naturally introduce community economics to improve Agents. The creative economy of Agents will be an opportunity open to ordinary people, and future AI Memes will be far smarter and more interesting than the Agents issued on GOAT and Clanker.

This article is reproduced with permission from Deep Tide TechFlow.