Article source: YBB Capital

Author: YBB Capital Researcher Zeke

Preface

In previous articles, we have repeatedly discussed the current state of AI Memes and the future development of AI Agents. Even so, the rapid narrative evolution of the AI Agent track can still leave one feeling overwhelmed. In the two months since 'Truth Terminal' kicked off Agent Summer, the narrative around the combination of AI and Crypto has changed almost weekly. Recently, the market's attention has turned to 'framework-type' projects driven by technical narratives, and in just the past few weeks this category has already produced several projects with market caps in the hundreds of millions or even billions of dollars. Such projects have also given rise to a new asset issuance paradigm: projects issue tokens against their GitHub code repositories, and Agents built on top of the framework can in turn issue their own tokens. With a framework at the base and Agents on top, this looks like an asset issuance platform, but what is really emerging is an infrastructure model unique to the AI era. How should we view this new trend? This article will begin with an introduction to frameworks and then combine them with my own thoughts to interpret what AI frameworks mean for Crypto.

1. What is a framework?

By definition, an AI framework is a type of underlying development tool or platform that integrates a set of pre-built modules, libraries, and tools, simplifying the process of building complex AI models. These frameworks typically also include functionalities for data processing, model training, and making predictions. In short, you can think of the framework as the operating system in the AI era, similar to Windows and Linux in desktop operating systems, or iOS and Android in mobile devices. Each framework has its own advantages and disadvantages, allowing developers to choose freely based on specific needs.

Although the term 'AI framework' is still a nascent concept in the Crypto field, the technology itself has a history of nearly 14 years, dating back to the birth of Theano in 2010. In the traditional AI world, both academia and industry already have very mature frameworks to choose from, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate, each with its own advantages in different scenarios.

The framework projects now emerging in Crypto were built on the back of the current wave of AI enthusiasm: driven by the demand for large numbers of Agents, they have since branched out into other Crypto tracks, ultimately forming AI frameworks for different subfields. Let's look at a few mainstream frameworks in the space to unpack this.

1.1 Eliza

Take ai16z's Eliza as an example: it is a multi-agent simulation framework built specifically for creating, deploying, and managing autonomous AI Agents. It is written in TypeScript, which gives it good compatibility and makes API integration straightforward.

According to the official documentation, Eliza mainly targets social media scenarios, with multi-platform integration support: comprehensive Discord integration including voice channels, automated accounts on X/Twitter, Telegram integration, and direct API access. In terms of media content processing, it supports reading and analyzing PDF documents, extracting and summarizing link content, audio transcription, video content processing, image analysis and description, and conversation summarization.

Eliza currently supports four main types of use cases:

  1. AI Assistant Applications: Customer support agents, community administrators, personal assistants;

  2. Social Media Personas: Automated content creators, engagement bots, brand representatives;

  3. Knowledge Workers: Research assistants, content analysts, document processors;

  4. Interactive Characters: Role-playing characters, educational tutors, entertainment bots.

Eliza currently supports the following models:

  1. Open source model local inference: For example, Llama3, Qwen1.5, BERT;

  2. Using OpenAI's API for cloud inference;

  3. Default configuration is Nous Hermes Llama 3.1B;

  4. Integrated with Claude for complex queries.
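
To make this more concrete, below is a minimal sketch of what defining and launching an Eliza-style agent might look like. It is loosely modeled on Eliza's idea of a configurable character plus platform clients; the field names and runtime API here are illustrative assumptions, not Eliza's exact schema.

```typescript
// Illustrative sketch only: a character/agent definition loosely modeled on
// Eliza's concept of configurable social-media agents. Field names and the
// runtime API below are assumptions, not Eliza's exact schema.
interface AgentCharacter {
  name: string;                // display name of the agent
  bio: string[];               // short persona description fed into prompts
  clients: ("discord" | "twitter" | "telegram")[];        // platforms to connect
  modelProvider: "openai" | "anthropic" | "llama_local";  // inference backend
}

const communityManager: AgentCharacter = {
  name: "HelperBot",
  bio: ["Answers community questions", "Summarizes long discussion threads"],
  clients: ["discord", "telegram"],
  modelProvider: "openai",
};

// Hypothetical runtime entry point: load the character, connect the chosen
// platform clients, and start the agent's event loop.
async function startAgent(character: AgentCharacter): Promise<void> {
  console.log(`Starting ${character.name} on ${character.clients.join(", ")}`);
  // ...platform clients, memory, and model calls would be wired up here
}

startAgent(communityManager);
```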

1.2 G.A.M.E

G.A.M.E (Generative Autonomous Multimodal Entities Framework) is a multimodal AI framework for automatic generation and management launched by Virtuals Protocol, mainly targeting the design of intelligent NPCs in games. A distinctive feature of this framework is that it is accessible to low-code or even no-code users: according to its trial interface, users only need to adjust parameters to take part in Agent design.

In terms of project architecture, G.A.M.E is built around a modular design in which multiple subsystems work together, as detailed below.

  1. Agent Prompting Interface: The interface for developers to interact with the AI framework. Through this interface, developers can initialize a session and specify parameters such as session ID, agent ID, user ID, etc.;

  2. Perception Subsystem: The perception subsystem is responsible for receiving input information, synthesizing it, and sending it to the Strategic Planning Engine. It also processes responses from the Dialogue Processing Module;

  3. Strategic Planning Engine: The Strategic Planning Engine is the core part of the entire framework, divided into High Level Planner and Low Level Policy. The High Level Planner is responsible for setting long-term goals and plans, while the Low Level Policy translates these plans into specific action steps;

  4. World Context: The World Context includes environmental information, world status, and game status data, which help the agent understand the current situation;

  5. Dialogue Processing Module: The Dialogue Processing Module is responsible for handling messages and responses, and it can generate dialogues or reactions as output;

  6. On Chain Wallet Operator: The on-chain wallet operator may involve applications of blockchain technology, and the specific functions are unclear;

  7. Learning Module: The Learning Module learns from feedback and updates the agent's knowledge base;

  8. Working Memory: Working Memory stores the agent's recent actions, results, and current plans, among other short-term information;

  9. Long Term Memory Processor: The Long Term Memory Processor is responsible for extracting important information about the agent and its working memory, and sorting it based on importance scores, recency, and relevance;

  10. Agent Repository: The Agent Repository stores the agent's goals, reflections, experiences, and personality traits;

  11. Action Planner: The Action Planner generates specific action plans based on the low-level policies;

  12. Plan Executor: The Plan Executor is responsible for executing the action plans generated by the Action Planner.

Workflow: Developers initiate the Agent through the Agent prompting interface, and the perception subsystem receives input and passes it to the Strategic Planning Engine. The Strategic Planning Engine utilizes information from the memory system, world context, and Agent repository to formulate and execute action plans. The Learning Module continuously monitors the Agent's action results and adjusts the Agent's behavior based on those results.
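
To illustrate this workflow, here is a minimal conceptual sketch of how these subsystems could hand data to one another. The class and function names are assumptions derived from the module descriptions above, not G.A.M.E's actual SDK.

```typescript
// Conceptual sketch of the G.A.M.E-style loop described above. All types and
// names are illustrative assumptions, not the actual G.A.M.E SDK.
interface WorldContext { environment: string; gameState: Record<string, unknown>; }
interface Plan { goal: string; steps: string[]; }

class StrategicPlanningEngine {
  // High Level Planner: set a long-term goal from perception + world context
  planHighLevel(input: string, world: WorldContext): Plan {
    return { goal: `respond to "${input}" in ${world.environment}`, steps: [] };
  }
  // Low Level Policy: translate the plan into concrete action steps
  planLowLevel(plan: Plan): Plan {
    return { ...plan, steps: ["observe", "speak", "act"] };
  }
}

class LearningModule {
  feedback: string[] = [];
  record(result: string) { this.feedback.push(result); } // used to adjust later behavior
}

function runAgentTick(input: string, world: WorldContext): void {
  const engine = new StrategicPlanningEngine();
  const learner = new LearningModule();

  const plan = engine.planLowLevel(engine.planHighLevel(input, world)); // plan
  for (const step of plan.steps) {                                      // execute
    learner.record(`executed ${step}`);                                 // learn
  }
}

runAgentTick("player greets the NPC", { environment: "tavern", gameState: {} });
```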

Application Scenario: Judging from the overall technical architecture, this framework focuses on the decision-making, feedback, perception, and personality of agents in virtual environments, making it applicable not only to gaming but also to the Metaverse. A large number of projects in the Virtuals ecosystem have already been built on this framework, as shown in the project list under Virtuals.

1.3 Rig

Rig is an open-source tool written in Rust, designed to simplify the development of large language model (LLM) applications. It provides a unified operating interface, allowing developers to easily interact with multiple LLM service providers (such as OpenAI and Anthropic) and various vector databases (like MongoDB and Neo4j).

Core Features:

  • Unified Interface: Regardless of which LLM provider or vector storage, Rig can provide a consistent access method, greatly reducing the complexity of integration work;

  • Modular Architecture: The framework internally adopts a modular design, including key parts such as 'Provider Abstraction Layer', 'Vector Storage Interface', and 'Intelligent Agent System', ensuring the system's flexibility and scalability;

  • Type Safety: Utilizing Rust's features to achieve type-safe embedding operations ensures code quality and runtime safety;

  • Efficient Performance: Supports asynchronous programming patterns, optimizing concurrent processing capabilities; built-in logging and monitoring features help with maintenance and troubleshooting.

Workflow: When a user request enters the Rig system, it first passes through the 'Provider Abstraction Layer', which standardizes differences between providers and ensures consistent error handling. Next, in the core layer, intelligent agents can call various tools or query vector stores to obtain the required information. Finally, through mechanisms such as Retrieval-Augmented Generation (RAG), the system combines document retrieval with contextual understanding to generate precise, meaningful responses that are returned to the user.
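
Since this article keeps all code examples in TypeScript for consistency, the sketch below only mirrors the flow just described (provider abstraction, vector retrieval, then RAG-style generation); it is not Rig's actual Rust API, and every name in it is an illustrative assumption.

```typescript
// Conceptual sketch of the flow above: unified provider -> vector store ->
// RAG-style answer. NOT Rig's Rust API; names are illustrative assumptions.
interface CompletionProvider {
  complete(prompt: string): Promise<string>; // one interface over OpenAI, Anthropic, etc.
}

interface VectorStore {
  topK(query: string, k: number): Promise<string[]>; // document retrieval
}

async function answerWithRag(
  provider: CompletionProvider,
  store: VectorStore,
  question: string,
): Promise<string> {
  const docs = await store.topK(question, 3);                          // retrieve context
  const prompt = `Context:\n${docs.join("\n")}\n\nQuestion: ${question}`;
  return provider.complete(prompt);                                    // grounded answer
}
```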

Application Scenario: Rig is suitable not only for building question-answering systems that need fast, accurate responses, but also for creating efficient document search tools, context-aware chatbots or virtual assistants, and even for supporting content creation by automatically generating text or other content based on existing data patterns.

1.4 ZerePy

ZerePy is an open-source framework based on Python, aimed at simplifying the deployment and management of AI Agents on the X (formerly Twitter) platform. It evolved from the Zerebro project, inheriting its core functionalities but designed in a more modular and extensible manner. Its goal is to enable developers to easily create personalized AI Agents and achieve various automated tasks and content creation on X.

ZerePy provides a command-line interface (CLI) that allows users to manage and control their deployed AI Agents. Its core architecture is based on a modular design, allowing developers to flexibly integrate different functional modules, for example:

  • LLM Integration: ZerePy supports large language models (LLMs) from OpenAI and Anthropic, allowing developers to choose the model that best fits their application scenario. This enables Agents to generate high-quality text content;

  • X Platform Integration: The framework directly integrates the API of the X platform, allowing Agents to perform operations such as posting, replying, liking, and retweeting;

  • Modular Connection System: This system allows developers to easily add support for other social platforms or services, expanding the framework's functionality;

  • Memory System (Planned): Although not yet fully implemented in the current version, ZerePy's design goals include a memory system that lets Agents remember previous interactions and contextual information, generating more coherent and personalized content.
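
As a rough illustration of the modular design described above, here is a small sketch of what configuring such a social Agent could look like. ZerePy itself is a Python framework; the TypeScript below (kept for consistency with the other examples in this article) and all of its names are illustrative assumptions, not ZerePy's actual interface.

```typescript
// Illustrative sketch of a ZerePy-style setup: an agent with an LLM backend and
// a set of platform "connections". All names here are assumptions for
// illustration; ZerePy's real interface is Python and may differ.
interface Connection {
  platform: "x" | "discord" | "telegram"; // modular connection system
  post(text: string): Promise<void>;
}

interface SocialAgentConfig {
  name: string;
  llmProvider: "openai" | "anthropic";    // LLM integration
  connections: Connection[];
  postIntervalMinutes: number;            // how often the agent creates content
}

async function runOnce(config: SocialAgentConfig): Promise<void> {
  const draft = `(${config.name}) hello from an automated post`; // an LLM call would go here
  for (const conn of config.connections) {
    await conn.post(draft);               // publish to each configured platform
  }
}
```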

Although ZerePy and ai16z's Eliza are both dedicated to building and managing AI Agents, they differ somewhat in architecture and goals. Eliza focuses more on multi-agent simulation and broader AI research scenarios, while ZerePy aims to simplify the deployment of AI Agents on a specific social platform (X), leaning more towards practical, streamlined applications.

2. Replica of the BTC Ecosystem

In fact, in terms of development path, AI Agents share quite a few similarities with the BTC ecosystem of late 2023 and early 2024. The BTC ecosystem's path can be roughly summarized as: BRC-20, then multi-protocol competition among Atomicals, Runes, and others, then BTC L2, and finally BTCFi centered on Babylon. By comparison, AI Agents, building on a mature traditional AI technology stack, have developed even faster, but their overall path is indeed similar, which I would summarize as: GOAT/ACT, then social and analytical Agents, and now competition among AI Agent frameworks. From a trend perspective, infrastructure projects focused on the decentralization and security of Agents are likely to inherit this wave of framework enthusiasm and become the main theme of the next stage.

So will this track drift towards homogenization and a bubble like the BTC ecosystem did? I personally think not. First, the narrative of AI Agents is not about replaying the history of smart contract chains. Second, the existing AI framework projects, whether they have real substance or have stalled at the slideware stage or as copy-paste clones, at least offer a new way of thinking about infrastructure. Many articles compare AI frameworks to asset issuance platforms and Agents to the assets issued on them; but compared with Memecoin launchpads and inscription protocols, I personally feel AI frameworks are closer to future public chains, and Agents closer to future Dapps.

In today's Crypto, we have thousands of public chains and tens of thousands of Dapps. Among the general-purpose chains are BTC, Ethereum, and various heterogeneous chains, while application chains take more specialized forms, such as gaming chains, storage chains, and DEX chains. Public chains map fairly well onto AI frameworks, and Dapps map well onto Agents.

Crypto in the AI era is very likely to evolve towards this form, and future debates will shift from EVM versus heterogeneous chains to battles between frameworks. The more pressing question now is how to decentralize, or 'chainize', these frameworks. I believe future AI infrastructure projects will be built on this foundation; the other question is what significance doing this on a blockchain actually holds.

3. What is the significance of going on-chain?

Whatever blockchain is combined with, it ultimately faces one question: is the combination meaningful? In an article last year, I criticized the inverted priorities of GameFi, where infrastructure ran far ahead of demand, and in previous articles on AI I also expressed skepticism about the practical combination of AI and Crypto at this stage. After all, the narrative pull of traditional projects has grown increasingly weak; the few traditional projects that performed well last year basically had fundamentals that matched or exceeded their token prices. What use, then, can AI be to Crypto? Ideas I raised before, such as Agents acting on users' behalf, the Metaverse, and Agents as employees, are fairly obvious but genuinely in demand. Yet these needs do not strictly require going on-chain, and from a business-logic perspective they do not form a closed loop. The Agent browser mentioned last time can indeed generate demand for data labeling, inference power, and so on, but the combination of the two is still not tight enough, and compute still tends to be dominated by centralized providers.

Rethinking DeFi's path to success: the reason DeFi could take a share from traditional finance is its higher accessibility, better efficiency, lower costs, and trustless, decentralized security. Thinking along these lines, there are, I believe, several possible reasons to support putting Agents on-chain:

  1. Cost and accessibility: Can putting Agents on-chain lower usage costs and offer greater accessibility and choice, ultimately making the 'rental rights' to AI, currently exclusive to Web2 giants, available to ordinary users?

  2. Security: By the simplest definition, an AI that deserves to be called an Agent should be able to interact with the virtual or real world. If an Agent can intervene in reality or in my wallet, then a blockchain-based security solution is indeed a necessity;

  3. Financial mechanics: Can Agents implement financial primitives unique to blockchain? Just as AMM LPs allow ordinary people to participate in automated market making, Agents that need computing power or data labeling could let users who see potential invest in the protocol in stablecoins (U); different Agent application scenarios could likewise give rise to new financial mechanisms;

  4. Interoperability and transparency: DeFi today still lacks true interoperability. If Agents combined with blockchain can achieve transparent and traceable reasoning, they may be more attractive than the Agent browsers offered by traditional internet giants, as discussed in the previous article.

4. Creativity?

Framework-type projects will also create an entrepreneurial opportunity in the future similar to the GPT Store. Although launching an Agent through a framework is still quite complex for ordinary users today, I believe frameworks that simplify the Agent-building process and offer ready-made combinations of complex functionality will come to dominate, forming a Web3 creator economy more interesting than the GPT Store.

Currently, the GPT Store is still oriented mainly towards practical uses in traditional domains; most of its popular apps are built by traditional Web2 companies, and the revenue is monopolized by those creators. According to OpenAI's official explanation, its strategy only provides funding support, in the form of a certain amount of subsidies, to a small number of outstanding developers in the US.

From a demand perspective, Web3 still has many gaps that need to be filled, and in terms of economic systems it can make the inequitable policies of the Web2 giants fairer. Beyond that, we can naturally introduce community economies to make Agents more complete. The creator economy around Agents will give ordinary people an opportunity to participate, and the AI Memes of the future will be far more intelligent and interesting than the Agents issued via GOAT or Clanker.

Reference Articles:

1. Historical Evolution and Trend Exploration of AI Frameworks

2. Bybit: AI Rig Complex (ARC): AI Agent Framework

3. Deep Value Memetics: Horizontal Comparison of Four Major Crypto×AI Frameworks: Adoption Status, Advantages and Disadvantages, Growth Potential

4. Eliza Official Documentation

5. Virtuals Official Documentation