Author: YBB Capital Researcher Zeke
Preface
In previous articles we have discussed, more than once, our views on the current state of AI Memes and the future development of AI Agents. Even so, the pace at which the AI Agent narrative has evolved has been hard to keep up with. Since 'Truth Terminal' kicked off Agent Summer, the AI x Crypto narrative has shifted almost every week. Recently, market attention has turned to 'framework-type' projects driven by technical narratives, and in just the past few weeks this niche has produced several dark horses with market capitalizations exceeding hundreds of millions or even billions. These projects have also spawned a new asset issuance paradigm: a project issues a token tied to its GitHub repository, and Agents built on the framework can then issue tokens of their own. The framework sits at the base, with Agents on top. The result looks like an asset issuance platform, yet what is actually emerging is an infrastructure model unique to the AI era. How should we view this new trend? This article starts with an introduction to frameworks and then adds some personal thoughts on what AI frameworks mean for Crypto.
One, What is a Framework?
By definition, an AI framework is a type of underlying development tool or platform that integrates a set of pre-built modules, libraries, and tools, simplifying the process of building complex AI models. These frameworks often also include functionalities for processing data, training models, and making predictions. In short, you can think of a framework as an operating system in the AI era, similar to Windows or Linux in desktop operating systems, or iOS and Android in mobile. Each framework has its own advantages and disadvantages, allowing developers to choose freely based on specific needs.
Although 'AI framework' is still an emerging concept in the Crypto field, AI frameworks themselves have a history of roughly 14 years, counting from the birth of Theano in 2010. In the traditional AI circle, both academia and industry already have very mature frameworks to choose from, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate, each with its own advantages for different scenarios.
The framework projects emerging in Crypto are built on the massive demand for Agents created by this wave of AI enthusiasm. From there they extend into other Crypto tracks, ultimately forming AI frameworks for different niche fields. Let's look at a few of the mainstream frameworks currently in the space to illustrate this point.
1.1 Eliza
Take ai16z's Eliza as an example. It is a multi-Agent simulation framework designed specifically for creating, deploying, and managing autonomous AI Agents. It is written in TypeScript, which gives it good compatibility and makes API integration easier.
According to the official documentation, Eliza mainly targets scenarios like social media, with multi-platform integration support. The framework provides a fully functional Discord integration and supports automated accounts on X/Twitter, Telegram integration, and direct API access. In terms of media content processing, it supports reading and analyzing PDF documents, extracting and summarizing link content, audio transcription, video content processing, image analysis and description, and dialogue summarization.
The use cases currently supported by Eliza mainly fall into four categories:
AI Assistant Applications: Customer support agents, community administrators, personal assistants;
Social Media Roles: Automated content creators, interactive robots, brand representatives;
Knowledge Workers: Research assistants, content analysts, document processors;
Interactive Roles: Role-playing characters, educational tutors, entertainment robots.
Models currently supported by Eliza:
Local inference with open-source models: for example, Llama3, Qwen1.5, BERT;
Cloud inference through OpenAI's API;
Default configuration: Nous Hermes Llama 3.1B;
Claude integration for complex queries.
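The model list above implies some form of routing between a default local model and a stronger cloud model for complex queries. The following is a minimal sketch of that idea only. It is not Eliza's actual API (Eliza itself is written in TypeScript), and every name in it (ModelBackend, make_stub, is_complex, route) as well as the complexity heuristic is a hypothetical assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ModelBackend:
    """Hypothetical wrapper around one model; not part of Eliza."""
    name: str
    generate: Callable[[str], str]


def make_stub(name: str) -> ModelBackend:
    # Stub generator: a real backend would run a local model or call a remote API.
    return ModelBackend(name, lambda prompt: f"[{name}] {prompt}")


# Assumed setup: a local default model plus a cloud model for complex queries.
BACKENDS: Dict[str, ModelBackend] = {
    "default": make_stub("local-llama"),
    "complex": make_stub("claude"),
}


def is_complex(prompt: str, word_limit: int = 50) -> bool:
    # Toy heuristic (assumption): long or multi-question prompts count as complex.
    return len(prompt.split()) > word_limit or prompt.count("?") > 1


def route(prompt: str) -> str:
    """Send the prompt to the backend chosen by the heuristic."""
    key = "complex" if is_complex(prompt) else "default"
    return BACKENDS[key].generate(prompt)
```

In a real deployment the heuristic would likely be replaced by configuration or by an explicit per-task model setting; the point here is only the separation between choosing a backend and calling it.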
1.2 G.A.M.E
G.A.M.E (Generative Autonomous Multimodal Entities Framework) is a multimodal AI framework for automated Agent generation and management, launched by Virtual and designed primarily for intelligent NPCs in games. A distinctive feature is that even users with low-code or no-code backgrounds can use it: according to its trial interface, users only need to modify parameters to take part in Agent design.
Architecturally, the core of G.A.M.E is a modular design in which multiple subsystems work in coordination, as detailed in the following diagram.
Agent Prompting Interface: An interface for developers to interact with the AI framework. Through this interface, developers can initialize a session and specify parameters such as session ID, agent ID, user ID;
Perception Subsystem: The perception subsystem is responsible for receiving input information, synthesizing it, and sending it to the strategic planning engine. It also processes responses from the dialogue processing module;
Strategic Planning Engine: The strategic planning engine is the core part of the entire framework, divided into High-Level Planner and Low-Level Policy. The High-Level Planner is responsible for formulating long-term goals and plans, while the Low-Level Policy translates these plans into specific action steps;
World Context: The world context contains environmental information, world status, and game status data, which help agents understand the current situation they are in;
Dialogue Processing Module: The dialogue processing module is responsible for handling messages and responses, generating dialogue or reactions as output;
On Chain Wallet Operator: The on-chain wallet operator likely handles blockchain-related operations, but its specific functions are not clear from the available material;
Learning Module: The learning module learns from feedback and updates the agent's knowledge base;
Working Memory: Working memory stores the agent's recent actions, results, and current plans, among other short-term information;
Long Term Memory Processor: The long-term memory processor is responsible for extracting important information about the agent and its working memory, and ranking it based on factors such as importance score, recency, and relevance;
Agent Repository: The agent repository stores attributes such as the agent's goals, reflections, experiences, and personality;
Action Planner: The action planner generates specific action plans based on low-level strategies;
Plan Executor: The plan executor is responsible for executing the action plans generated by the action planner.
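The long-term memory processor described above ranks memories by importance score, recency, and relevance. A minimal sketch of such a ranking follows. The source does not say how G.A.M.E combines these factors, so the equal-weight sum, the exponential recency decay with a 24-hour half-life, and the cosine-similarity relevance are all assumptions, and the names (Memory, score, rank) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Memory:
    text: str
    importance: float            # assumed to be normalized to 0..1
    age_hours: float             # time since the memory was recorded
    embedding: Sequence[float]   # vector representation of the text


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def score(memory: Memory, query: Sequence[float], half_life_hours: float = 24.0) -> float:
    # Recency halves every `half_life_hours` (assumed decay schedule).
    recency = 0.5 ** (memory.age_hours / half_life_hours)
    relevance = cosine(memory.embedding, query)
    # Equal weighting of the three factors is an assumption.
    return memory.importance + recency + relevance


def rank(memories: List[Memory], query: Sequence[float], top_k: int = 3) -> List[Memory]:
    """Return the top_k memories with the highest combined score."""
    ordered = sorted(memories, key=lambda m: score(m, query), reverse=True)
    return ordered[:top_k]
```

Calling rank(memories, query_embedding) would then give the memories most worth passing to the planner; in practice the weights would be tuned rather than fixed at one each.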
Workflow: Developers start the Agent through the Agent prompting interface. The perception subsystem receives input and passes it to the strategic planning engine, which draws on the memory system, the world context, and information from the Agent repository to formulate and execute action plans. The learning module continuously monitors the results of the Agent's actions and adjusts the Agent's behavior based on those results.
Application Scenarios: From an overall technical architecture perspective, this framework mainly focuses on the decision-making, feedback, perception, and personality of Agents in virtual environments. Besides gaming, it is also applicable to the Metaverse. In the list below from Virtual, you can see that many projects have already been built with this framework.
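The workflow above can be sketched as a small perceive, plan, act, learn loop. This is an illustration of the described data flow only and not G.A.M.E's real interface: all function names, the task table, and the stubbed executor are hypothetical, and the "learning" step is reduced to counting failures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    goal: str                                                # long-term goal from the agent repository
    working_memory: List[str] = field(default_factory=list)  # recent actions and results
    failures: int = 0                                        # stand-in for what the learning module tracks


def perceive(raw_input: str) -> str:
    """Perception subsystem: normalize raw input into an observation."""
    return raw_input.strip().lower()


def high_level_plan(agent: Agent, observation: str) -> str:
    """High-Level Planner: choose a task from the goal and the observation."""
    return "defend" if "enemy" in observation else agent.goal


def low_level_policy(task: str) -> List[str]:
    """Low-Level Policy: translate a task into concrete action steps."""
    steps = {
        "defend": ["raise_shield", "call_guard"],
        "trade": ["greet", "offer_item"],
    }
    return steps.get(task, ["idle"])


def execute(action: str) -> bool:
    """Plan executor stub: every action succeeds except 'call_guard'."""
    return action != "call_guard"


def step(agent: Agent, raw_input: str) -> List[str]:
    """Run one full cycle and record results in working memory."""
    observation = perceive(raw_input)
    task = high_level_plan(agent, observation)
    actions = low_level_policy(task)
    for action in actions:
        ok = execute(action)
        agent.working_memory.append(f"{action}:{'ok' if ok else 'failed'}")
        if not ok:
            agent.failures += 1  # feedback a learning module could act on
    return actions
```

For example, an Agent with the goal "trade" that receives the input "An enemy appears" would switch to the "defend" task for that cycle and log both actions, including the failed one, in its working memory.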
1.3 Rig
Rig is an open-source tool written in Rust, designed to simplify the development of large language model (LLM) applications. It provides a unified operating interface that allows developers to easily interact with multiple LLM service providers (such as OpenAI and Anthropic) and various vector databases (like MongoDB and Neo4j).
Core Features:
Unified Interface: Regardless of which LLM provider or vector store is used, Rig provides a consistent access method, greatly reducing the complexity of integration work;
Modular Architecture: The framework adopts a modular design internally, including key parts such as 'Provider Abstraction Layer', 'Vector Storage Interface', and 'Intelligent Agent System', ensuring the flexibility and scalability of the system;
Type Safety: Type-safe embedding operations are implemented using Rust's language features to ensure code quality and runtime safety;
Efficient Performance: Supports asynchronous programming models, optimizing concurrency capabilities; built-in logging and monitoring functions help with maintenance and troubleshooting.
Workflow: When a user request enters the Rig system, it first passes through the 'Provider Abstraction Layer', which standardizes the differences between providers and ensures consistent error handling. Next, in the core layer, intelligent agents can call various tools or query vector storage to obtain the required information. Finally, through advanced mechanisms like retrieval-augmented generation (RAG), the system combines document retrieval with contextual understanding to generate precise and meaningful responses, which are returned to the user.
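The provider abstraction and retrieval-augmented flow described above can be illustrated with a short sketch. Rig itself is a Rust library and this is not its API: the example is written in Python purely to show the pattern, the names (CompletionProvider, EchoProvider, KeywordStore, rag_answer) are hypothetical, and the toy store ranks documents by shared words instead of real vector embeddings.

```python
from typing import List, Protocol


class CompletionProvider(Protocol):
    """Common interface every LLM provider adapter is assumed to satisfy."""
    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in for a real provider adapter; it only echoes the prompt."""
    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"{self.name} answer based on: {prompt}"


class KeywordStore:
    """Toy replacement for a vector store: ranks documents by word overlap."""
    def __init__(self) -> None:
        self.docs: List[str] = []

    def add(self, doc: str) -> None:
        self.docs.append(doc)

    def top_n(self, query: str, n: int = 1) -> List[str]:
        query_words = set(query.lower().split())
        ranked = sorted(
            self.docs,
            key=lambda d: len(query_words & set(d.lower().split())),
            reverse=True,
        )
        return ranked[:n]


def rag_answer(provider: CompletionProvider, store: KeywordStore, question: str, n: int = 1) -> str:
    """Retrieve context first, then ask the provider with that context attached."""
    context = "\n".join(store.top_n(question, n))
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return provider.complete(prompt)
```

Because rag_answer depends only on the CompletionProvider interface, swapping one provider for another would not change the calling code, which is the benefit the unified interface is meant to deliver.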
Application Scenarios: Rig is not only suitable for building question-answering systems that require rapid and accurate responses but can also be used to create efficient document search tools, context-aware chatbots, or virtual assistants, and even support content creation, automatically generating text or other forms of content based on existing data patterns.
1.4 ZerePy
ZerePy is an open-source framework based on Python, aimed at simplifying the deployment and management of AI Agents on the X (formerly Twitter) platform. It is derived from the Zerebro project, inheriting its core functionalities but designed in a more modular and extensible manner. Its goal is to enable developers to easily create personalized AI Agents and implement various automated tasks and content creation on X.
ZerePy provides a command-line interface (CLI) that lets users manage and control their deployed AI Agents. Its core architecture is modular, allowing developers to flexibly integrate different functional modules, such as:
LLM Integration: ZerePy supports large language models (LLM) from OpenAI and Anthropic, allowing developers to choose the model that best fits their application scenario. This enables Agents to generate high-quality text content;
X Platform Integration: The framework directly integrates with the API of the X platform, allowing Agents to perform posting, replying, liking, and retweeting operations;
Modular Connection System: This system allows developers to easily add support for other social platforms or services, expanding the framework's functionality;
Memory System (Future Plans): Although the current version may not have fully implemented this yet, ZerePy's design goals include integrating a memory system, enabling Agents to remember previous interactions and context information to generate more coherent and personalized content.
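The modular connection idea in the list above can be sketched as a small plug-in registry. This is not ZerePy's actual code or API: the class names (Connection, FakeXConnection, AgentRuntime) are hypothetical, and the X connection is a fake that records posts in memory instead of calling the real X API.

```python
from typing import Callable, Dict, List


class Connection:
    """Hypothetical base class for a pluggable platform connection."""
    name = "base"

    def __init__(self) -> None:
        self.actions: Dict[str, Callable[..., str]] = {}

    def register(self, action: str, handler: Callable[..., str]) -> None:
        self.actions[action] = handler

    def perform(self, action: str, **kwargs) -> str:
        if action not in self.actions:
            raise ValueError(f"{self.name} does not support action '{action}'")
        return self.actions[action](**kwargs)


class FakeXConnection(Connection):
    """In-memory stand-in for an X integration; makes no network calls."""
    name = "x"

    def __init__(self) -> None:
        super().__init__()
        self.timeline: List[str] = []
        self.register("post", self._post)
        self.register("like", self._like)

    def _post(self, text: str) -> str:
        self.timeline.append(text)
        return f"posted: {text}"

    def _like(self, post_id: int) -> str:
        return f"liked: {post_id}"


class AgentRuntime:
    """Keeps a registry of connections and dispatches actions by name."""
    def __init__(self) -> None:
        self.connections: Dict[str, Connection] = {}

    def add_connection(self, connection: Connection) -> None:
        self.connections[connection.name] = connection

    def run(self, connection: str, action: str, **kwargs) -> str:
        return self.connections[connection].perform(action, **kwargs)
```

Supporting another platform would then mean writing one more Connection subclass and registering it, without touching the runtime, which is the kind of extensibility the modular design is aiming for.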
Although ZerePy and ai16z's Eliza project are both committed to building and managing AI Agents, the two differ slightly in architecture and goals. Eliza focuses more on multi-agent simulation and broader AI research, while ZerePy focuses on simplifying the deployment of AI Agents on one specific social platform (X) and leans more towards practical application.
Two, A replica of the BTC ecosystem
In terms of development path, AI Agents share quite a few similarities with the BTC ecosystem of late 2023 and early 2024. The BTC ecosystem's path can be summarized simply as: BRC20 → Atomical/Rune and other multi-protocol competition → BTC L2 → BTCFi centered on Babylon. AI Agents, building on a mature traditional AI tech stack, are developing faster, but their overall path does resemble that of the BTC ecosystem. I summarize it as: GOAT/ACT → social-type Agents and analytical AI Agents → framework competition. From a trend perspective, infrastructure projects focused on decentralization and security for Agents are likely to take over from this wave of framework enthusiasm and become the main theme of the next stage.
So will this track end in homogenization and a bubble, as the BTC ecosystem did? I believe not necessarily. First, the AI Agent narrative is not an attempt to replay the history of smart contract chains. Second, existing AI framework projects, whether they have real technical strength, are stuck at the slide-deck stage, or are simply copy-pasted (ctrl+c, ctrl+v), at least offer a new idea for infrastructure development. Many articles compare AI frameworks to asset issuance platforms, with Agents as the assets. Rather than Memecoin launchpads or inscription protocols, I personally feel that AI frameworks resemble future public chains, and Agents resemble future Dapps.
Currently, there are thousands of public chains and tens of thousands of Dapps in the Crypto space. Among general chains, we have BTC, Ethereum, and various heterogeneous chains, while application chains are more diverse, such as gaming chains, storage chains, and Dex chains. Public chains correspond to AI frameworks, and Dapps can correspond well to Agents.
Crypto in the AI era will very likely evolve into this form. Future debates will shift from EVM versus heterogeneous chains to disputes between frameworks. The more pressing question now is how to decentralize these frameworks, or bring them on-chain. I believe subsequent AI infrastructure projects will build on this foundation. A further question is what significance doing this on the blockchain actually holds.
Three, What is the significance of being on-chain?
Whatever blockchain is combined with, it ultimately faces one question: is it meaningful? In last year's article, I criticized GameFi for its misplaced priorities and Infra development for running too far ahead of demand. In previous articles about AI, I also expressed skepticism about the practical combinations of AI x Crypto at the current stage. After all, narrative alone drives traditional projects less and less, and the few traditional projects whose token prices performed well last year also needed real strength that matched or exceeded those prices. What can AI do for Crypto? Previously, I thought of ideas such as Agents acting on behalf of users to realize intents, the Metaverse, and Agents as employees, which are relatively mundane but in demand. However, none of these needs strictly requires on-chain implementation, and none closes the loop in terms of business logic. The Agent browser mentioned in the previous issue can realize intents and can give rise to demand for data labeling, inference computing power, and so on, but the combination of the two is still not tight enough, and the computing power side is still dominated by centralized providers in every respect.
Consider again how DeFi succeeded. DeFi was able to carve out a share from traditional finance because it offers higher accessibility, better efficiency, lower costs, and security that does not require trusting a centralized party. Thinking along these lines, I believe there may be several reasons that support bringing Agents on-chain.
1. Can bringing Agents on-chain lower usage costs, and thereby offer greater accessibility and choice, ultimately allowing ordinary users to share in the AI 'rental rights' now held by Web2 giants?
2. Security: According to the simplest definition of an Agent, an AI that can be called an Agent should be able to interact with the virtual or real world. If an Agent can intervene in reality or my virtual wallet, then a blockchain-based security solution is a necessity;
3. Can Agents create financial mechanisms unique to blockchain? For example, LP positions in an AMM allow ordinary people to participate in automated market-making. Similarly, Agents that require computing power, data labeling, and so on could let users who are optimistic about them invest in the protocol in the form of U. Or could Agents in different application scenarios give rise to new financial mechanisms;
4. DeFi currently lacks perfect interoperability. If Agents combined with blockchain can achieve transparent and traceable reasoning, they may be more attractive than the traditional internet giants' agent browsers mentioned in the previous article.
Four, Creativity?
In the future, framework-type projects will also offer an entrepreneurial opportunity similar to the GPT Store. Although releasing an Agent through a framework is currently still quite complex for ordinary users, I believe frameworks that simplify the Agent construction process and offer ready-made combinations of complex functionality will hold the advantage, forming a Web3 creative economy more interesting than the GPT Store.
The current GPT Store still leans towards practicality in traditional fields, and most popular applications are created by traditional Web2 companies, with income dominated by creators. According to OpenAI's official explanation, this strategy provides funding support, in the form of limited subsidies, only to a small number of outstanding developers in the US.
From a demand perspective, Web3 still has many gaps to fill, and its economic system can also make the unfair policies of Web2 giants more equitable. Beyond that, we can naturally introduce a community economy to make Agents more complete. The creative economy of Agents will be an opportunity for ordinary people to participate, and future AI Memes will be much smarter and more interesting than the Agents issued on GOAT or Clanker.
Reference Articles:
1. The historical evolution and trend exploration of AI frameworks
2. Bybit: AI Rig Complex (ARC): AI agent framework
3. Deep Value Memetics: A horizontal comparison of the four major Crypto × AI frameworks: adopting conditions, strengths and weaknesses, growth potential
4. Eliza Official Documentation
5. Virtual official documentation