Original Author: YBB Capital Researcher Zeke
Preface
In previous articles, we have repeatedly discussed the current state of AI Memes and the future development of AI Agents. However, the AI Agent sector's narrative is developing and evolving so quickly that it can be hard to keep up. Since 'Truth Terminal' kicked off Agent Summer, the narrative around AI combined with Crypto has changed almost every week for the past two months. Recently, the market's attention has turned once again to 'framework-type' projects dominated by technical narratives. In just the past few weeks, this niche has produced several dark horses with market capitalizations exceeding hundreds of millions or even billions. These projects have also given rise to a new asset issuance paradigm: projects issue tokens based on their GitHub repositories, and Agents built on the frameworks can in turn issue tokens of their own. With frameworks at the base and Agents on top, the structure resembles an asset issuance platform, yet it is also an infrastructure model unique to the AI era that is beginning to emerge. How should we view this new trend? This article begins with an introduction to frameworks and then, combined with my own thoughts, interprets what AI frameworks mean for Crypto.
1. What is a framework?
By definition, an AI framework is an underlying development tool or platform that integrates a set of pre-built modules, libraries, and tools to simplify the process of building complex AI models. These frameworks typically also include functionality for processing data, training models, and making predictions. In short, frameworks can be understood as the operating systems of the AI era, analogous to desktop operating systems such as Windows and Linux or mobile systems such as iOS and Android. Each framework has its own strengths and weaknesses, and developers can choose freely based on their specific needs.
Although the term 'AI framework' is still an emerging concept in the Crypto field, AI frameworks have a development history of nearly 14 years, counting from the birth of Theano in 2010. In traditional AI circles, both academia and industry already have very mature frameworks to choose from, such as Google's TensorFlow, Meta's PyTorch, Baidu's PaddlePaddle, and ByteDance's MagicAnimate, each with its own advantages in different scenarios.
The framework projects emerging in Crypto were built in response to the surge in demand for Agents driven by this wave of AI enthusiasm. They have since branched out into other Crypto sectors, ultimately forming AI frameworks for various subdivided fields. The following sections expand on this using a few of the mainstream frameworks in the space as examples.
1.1 Eliza
ai16z's Eliza is a multi-Agent simulation framework designed specifically for creating, deploying, and managing autonomous AI Agents. It is developed in TypeScript, which gives it broad compatibility and makes API integration easier.
According to the official documentation, Eliza primarily targets social media scenarios and offers multi-platform integration. The framework provides full-featured Discord integration with voice channel support, automated accounts on X/Twitter, Telegram integration, and direct API access. For media content, it supports reading and analyzing PDF documents, extracting and summarizing link content, audio transcription, video content processing, image analysis and description, and dialogue summarization.
The use cases currently supported by Eliza mainly fall into four categories:
AI assistant applications: customer support agents, community managers, personal assistants;
Social media roles: automatic content creators, interactive bots, brand representatives;
Knowledge workers: research assistants, content analysts, document processors;
Interactive roles: role-playing characters, educational tutors, entertainment bots.
Models currently supported by Eliza:
Local inference with open-source models, for example Llama 3, Qwen 1.5, and BERT;
Cloud inference through OpenAI's API;
Nous Hermes Llama 3.1B as the default configuration;
Integration with Claude for complex queries.
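The model options above suggest a simple routing pattern: a default local model handles routine messages, while harder queries go to a hosted model. The Python sketch below illustrates that idea only. It is not Eliza's API (Eliza itself is written in TypeScript); the ModelRoute type, the pick_model function, the model labels, and the word-count heuristic are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRoute:
    """Hypothetical description of where an inference request is sent."""
    provider: str   # e.g. "local", "openai", "anthropic"
    model: str      # illustrative label, not a real configuration key


# Illustrative routes mirroring the list above; the strings are labels only.
DEFAULT_ROUTE = ModelRoute("local", "nous-hermes-llama-3.1")
COMPLEX_ROUTE = ModelRoute("anthropic", "claude")


def pick_model(message: str, complex_word_threshold: int = 50) -> ModelRoute:
    """Pick a route with a deliberately naive heuristic (an assumption):
    long messages count as 'complex' and go to the hosted model."""
    if len(message.split()) >= complex_word_threshold:
        return COMPLEX_ROUTE
    return DEFAULT_ROUTE
```

A real deployment would likely replace the word-count rule with something more meaningful, such as task type or a classifier, but the shape of the decision stays the same.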
1.2 G.A.M.E
G.A.M.E (Generative Autonomous Multimodal Entities Framework) is a multimodal AI framework for automatic generation and management, launched by Virtual and aimed mainly at intelligent NPC design in games. A notable feature of this framework is that even users with little or no coding background can take part in Agent design, since they only need to modify parameters through its trial interface.
In terms of project architecture, the core of G.A.M.E is a modular design in which multiple subsystems work together. The detailed architecture is shown in the diagram below.
Agent Prompting Interface: The interface through which developers interact with the AI framework. Through this interface, developers can initialize a session and specify parameters such as session ID, Agent ID, and user ID;
Perception Subsystem: The perception subsystem is responsible for receiving input information, synthesizing it, and sending it to the strategic planning engine. It also handles responses from the dialogue processing module;
Strategic Planning Engine: The strategic planning engine is the core part of the entire framework, divided into high-level planners and low-level policies. The high-level planner is responsible for setting long-term goals and plans, while the low-level policies translate these plans into specific action steps;
World Context: The world context includes environmental information, world states, and game states, which help Agents understand the current situation;
Dialogue Processing Module: The dialogue processing module is responsible for handling messages and responses, which can generate dialogues or reactions as output;
On Chain Wallet Operator: The on-chain wallet operator may involve applications of blockchain technology, with specific functions unspecified;
Learning Module: The learning module learns from feedback and updates the Agent's knowledge base;
Working Memory: Working memory stores recent actions, results, and current plans of the Agent as short-term information;
Long Term Memory Processor: The long-term memory processor is responsible for extracting important information about the Agent and its working memory, and ranking it by factors such as importance score, recency, and relevance;
Agent Repository: The agent repository stores the Agent's goals, reflections, experiences, and personality attributes;
Action Planner: The action planner generates specific action plans based on low-level policies;
Plan Executor: The plan executor is responsible for executing the action plans generated by the action planner.
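The long-term memory processor described above ranks memories by importance score, recency, and relevance. One way such a ranking could be combined is a weighted sum, sketched below in Python. The weights, the exponential recency decay, and the word-overlap relevance measure are assumptions made for illustration; the source does not specify the formula G.A.M.E uses.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    importance: float   # 0.0 to 1.0, assumed to be assigned when stored
    age_hours: float    # time since the memory was created or last used


def recency(memory: Memory, half_life_hours: float = 24.0) -> float:
    """Exponential decay: 1.0 for a fresh memory, 0.5 after one half-life."""
    return 0.5 ** (memory.age_hours / half_life_hours)


def relevance(memory: Memory, query: str) -> float:
    """Jaccard overlap of lowercase word sets between memory and query."""
    a, b = set(memory.text.lower().split()), set(query.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def rank_memories(memories, query, weights=(1.0, 1.0, 1.0)):
    """Return memories sorted by weighted importance + recency + relevance."""
    w_imp, w_rec, w_rel = weights

    def score(m: Memory) -> float:
        return (w_imp * m.importance
                + w_rec * recency(m)
                + w_rel * relevance(m, query))

    return sorted(memories, key=score, reverse=True)
```

With equal weights, an old but important and relevant memory can still outrank a fresh but trivial one; changing the weights shifts that balance.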
Workflow: Developers start the Agent through the Agent Prompting Interface, after which the perception subsystem receives input and passes it to the strategic planning engine. The strategic planning engine draws on the memory systems, the world context, and information from the Agent repository to formulate and execute action plans. The learning module continuously monitors the results of the Agent's actions and adjusts the Agent's behavior accordingly.
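The split between a high-level planner and low-level policies can be sketched as a small control loop. The Python example below is a toy model of the workflow just described, not Virtual's implementation: the function names, the hard-coded task table, and the 'leave the room' scenario are all hypothetical, and a real planner would presumably call a language model instead of using fixed rules.

```python
from dataclasses import dataclass, field


@dataclass
class WorkingMemory:
    """Short-term store of recent actions and their results."""
    recent: list = field(default_factory=list)


def high_level_plan(goal: str, world_state: dict) -> list:
    """High-level planner: break a long-term goal into coarse tasks.
    The fixed rule here is a placeholder for a real planner."""
    if world_state.get("has_key"):
        return ["open door"]
    return ["find key", "open door"]


def low_level_policy(task: str) -> list:
    """Low-level policy: translate one task into concrete action steps."""
    steps = {
        "find key": ["search room", "pick up key"],
        "open door": ["walk to door", "use key"],
    }
    return steps.get(task, [task])


def run_agent(goal: str, world_state: dict, memory: WorkingMemory) -> list:
    """Plan, act, and record results, loosely mirroring the workflow above."""
    executed = []
    for task in high_level_plan(goal, world_state):
        for action in low_level_policy(task):
            executed.append(action)                 # plan executor
            memory.recent.append((action, "ok"))    # result kept for learning
            if action == "pick up key":
                world_state["has_key"] = True       # world context update
    return executed
```

Running the agent twice on the same world state shows the effect of context: once the key is held, the planner skips the 'find key' task entirely.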
Application scenarios: Judging from the overall technical architecture, this framework focuses mainly on the decision-making, feedback, perception, and personality of Agents in virtual environments. Beyond gaming, it is also applicable to the Metaverse, and a large number of projects have already been built on this framework, as seen in the list under Virtual.
1.3 Rig
Rig is an open-source tool written in Rust, designed to simplify the development of applications that use large language models (LLMs). It provides a unified operating interface, allowing developers to easily interact with multiple LLM service providers (such as OpenAI and Anthropic) and various vector databases (such as MongoDB and Neo4j).
Core Features:
Unified Interface: Regardless of which LLM provider or vector storage is used, Rig provides a consistent access method, greatly reducing the complexity of integration work;
Modular Architecture: The framework uses a modular design internally, containing key parts such as 'Provider Abstraction Layer,' 'Vector Storage Interface,' and 'Intelligent Agent System,' ensuring system flexibility and scalability;
Type Safety: Type-safe embedding operations are achieved using Rust's features, ensuring code quality and runtime safety;
Efficient Performance: Supports asynchronous programming patterns, optimizing concurrent processing capabilities; built-in logging and monitoring functions help with maintenance and troubleshooting.
Workflow: When a user request enters the Rig system, it first passes through the 'Provider Abstraction Layer,' which is responsible for smoothing over differences between providers and ensuring consistent error handling. Next, in the core layer, intelligent agents can call various tools or query vector storage to obtain the required information. Finally, through mechanisms such as Retrieval-Augmented Generation (RAG), the system combines document retrieval with context understanding to generate precise and meaningful responses, which are then returned to the user.
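The flow above, a provider abstraction followed by retrieval and then generation, can be sketched in a few lines. The Python example below is an illustration of the pattern only and does not reflect Rig's actual Rust API: CompletionProvider, EchoProvider, KeywordStore, and answer_with_rag are hypothetical names, the provider merely echoes its prompt, and the 'vector store' scores documents by shared words rather than by embeddings.

```python
from typing import Protocol


class CompletionProvider(Protocol):
    """Common interface every provider adapter must satisfy."""
    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in for a real LLM client; it only echoes the prompt back."""
    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"


class KeywordStore:
    """Toy 'vector store' that scores documents by shared words.
    A real store would compare embedding vectors instead."""
    def __init__(self, documents: list[str]) -> None:
        self.documents = documents

    def top_k(self, query: str, k: int = 1) -> list[str]:
        words = set(query.lower().split())
        scored = sorted(
            self.documents,
            key=lambda d: len(words & set(d.lower().split())),
            reverse=True,
        )
        return scored[:k]


def answer_with_rag(provider: CompletionProvider, store: KeywordStore,
                    question: str) -> str:
    """Retrieve context first, then hand context plus question to the model."""
    context = "\n".join(store.top_k(question, k=1))
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return provider.complete(prompt)
```

Because answer_with_rag depends only on the CompletionProvider interface, swapping one provider for another would not require changes to the retrieval or prompting code, which is the benefit the abstraction layer is meant to deliver.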
Application scenarios: Rig is applicable not only for building systems that require quick and accurate answers but also for creating efficient document search tools, context-aware chatbots or virtual assistants, and even supporting content creation by automatically generating text or other forms of content based on existing data patterns.
1.4 ZerePy
ZerePy is an open-source framework based on Python, designed to simplify the process of deploying and managing AI Agents on the X (formerly Twitter) platform. It is derived from the Zerebro project, inheriting its core functionalities but designed in a more modular and extensible way. Its goal is to enable developers to easily create personalized AI Agents and implement various automation tasks and content creation on X.
ZerePy provides a command-line interface (CLI) for users to manage and control their deployed AI Agents. Its core architecture is based on modular design, allowing developers to flexibly integrate different functional modules, such as:
LLM Integration: ZerePy supports large language models (LLMs) from OpenAI and Anthropic, allowing developers to choose the model that best fits their application scenario so that Agents can generate high-quality text content;
X Platform Integration: The framework integrates directly with the X platform's API, allowing Agents to perform actions such as posting, replying, liking, and retweeting;
Modular Connection System: This system allows developers to easily add support for other social platforms or services, expanding the framework's functionality;
Memory System (Future Plan): Although it may not be fully implemented in the current version, ZerePy's design goals include a memory system that lets Agents remember previous interactions and context, and thereby generate more coherent and personalized content.
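The modular connection idea in the list above can be illustrated with a small plug-in registry. The Python sketch below is not ZerePy's actual code or API: Connection, FakeXConnection, and ConnectionRegistry are hypothetical names, and the fake connection stores posts in memory instead of calling any real platform.

```python
from abc import ABC, abstractmethod


class Connection(ABC):
    """Hypothetical base class for one external service an Agent can use."""
    name: str

    @abstractmethod
    def perform(self, action: str, **params) -> str:
        """Run a named action and return a short status string."""


class FakeXConnection(Connection):
    """Offline stand-in for a social platform; it records posts in memory."""
    name = "x"

    def __init__(self) -> None:
        self.posts = []

    def perform(self, action: str, **params) -> str:
        if action == "post":
            self.posts.append(params["text"])
            return "posted"
        raise ValueError(f"unsupported action: {action}")


class ConnectionRegistry:
    """Keeps connections by name so new platforms can be added as plug-ins."""
    def __init__(self) -> None:
        self._connections = {}

    def register(self, connection: Connection) -> None:
        self._connections[connection.name] = connection

    def perform(self, name: str, action: str, **params) -> str:
        if name not in self._connections:
            raise KeyError(f"no connection registered as {name!r}")
        return self._connections[name].perform(action, **params)
```

Supporting another platform under this design would mean writing one more Connection subclass and registering it; the Agent logic that calls the registry would stay unchanged.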
Although ZerePy and ai16z's Eliza project both aim to build and manage AI Agents, they differ somewhat in architecture and goals. Eliza focuses more on multi-agent simulation and broader AI research, while ZerePy focuses on simplifying the deployment of AI Agents on one specific social platform (X) and leans toward streamlined, practical use.
2. The BTC ecosystem's replica
In terms of development paths, AI Agents have quite a few similarities with the BTC ecosystem of late 2023 and early 2024. The development path of the BTC ecosystem can be summarized simply as: BRC-20 → Atomicals/Runes and other multi-protocol competition → BTC L2 → BTCFi centered around Babylon. AI Agents, by contrast, are developing more rapidly because they build on a mature traditional AI tech stack, but their overall path does share many similarities with the BTC ecosystem. I summarize it as follows: GOAT/ACT → social Agents/analytical AI Agents → framework competition. From a trend perspective, infrastructure projects focused on decentralization and security around Agents are likely to ride this wave of framework enthusiasm and become the main theme of the next stage.
Will this sector head toward homogenization and a bubble, as the BTC ecosystem did? I believe not. First, the narrative of AI Agents is not an attempt to replay the history of smart contract chains. Second, existing AI framework projects, whether they are genuinely capable, stuck at the PPT stage, or simply Ctrl+C and Ctrl+V copies, at least offer a new approach to infrastructure development. Many articles compare AI frameworks to asset issuance platforms and Agents to assets. Personally, rather than Memecoin launchpads and inscription protocols, I feel AI frameworks are more like the public chains of the future, and Agents more like the Dapps of the future.
In today's Crypto landscape, we have thousands of public chains and tens of thousands of Dapps. Among general-purpose chains there are BTC, Ethereum, and various heterogeneous chains, while application chains take more diverse forms, such as gaming chains, storage chains, and DEX chains. Public chains map closely onto AI frameworks, and Dapps map well onto Agents.
Crypto in the AI era is highly likely to move toward this form. Future debates will shift from EVM versus heterogeneous chains to framework wars. The more pressing question now is how to decentralize frameworks or bring them on-chain, and I think subsequent AI infrastructure projects will build on this foundation. The other question is: what is the significance of doing this on the blockchain?
3. What is the significance of going on-chain?
Whatever blockchain is combined with, it ultimately faces one question: is the combination meaningful? In last year's article, I criticized GameFi for getting its priorities backwards and for building Infra prematurely. In previous issues about AI, I also expressed skepticism about combining AI and Crypto in practical fields at this stage. After all, narratives are losing their power to drive traditional projects, and the few traditional projects whose tokens performed well last year had the fundamental strength to match or exceed their token prices. What utility can AI provide for Crypto? Previously, I thought of ideas such as Agents acting on user intents, the Metaverse, or Agents as employees, which are relatively mundane but do have demand. However, these demands do not strictly require going on-chain, and they cannot form a closed loop in terms of business logic. The Agent browser mentioned in the previous issue can fulfill intents and can also generate demand for data labeling, inference compute, and so on. However, the combination of the two is still not tight enough, and computing power as a whole remains dominated by centralized providers.
Consider again the path to DeFi's success: DeFi was able to take a slice of traditional finance because it offers higher accessibility, better efficiency, and lower costs, with security that does not depend on trusting a centralized party. Following this line of thought, I think there may be several reasons that support bringing Agents on-chain.
1. Cost and accessibility: Can bringing Agents on-chain lower usage costs and thereby offer greater accessibility and choice, ultimately allowing ordinary users to share in the AI 'rental rights' that currently belong to the Web2 giants?
2. Security: By the simplest definition, an AI that can be called an Agent should be able to interact with the virtual or real world. If an Agent can intervene in reality or in my virtual wallet, then a blockchain-based security solution is indeed a necessity.
3. Financial plays: Can Agents enable financial plays unique to blockchain? For example, just as LPs in an AMM allow ordinary people to participate in automated market-making, an Agent may need computing power, data labeling, and so on, and users who are optimistic about it could invest in the protocol in the form of U (stablecoins such as USDT). Or could Agents built for different application scenarios form new financial plays?
4. Interoperability and transparency: DeFi currently lacks perfect interoperability. If blockchain-based Agents can achieve transparent and traceable inference, they may be more attractive than the traditional internet giants' agent browsers mentioned in the previous article.
4. Creativity?
In the future, framework projects will also provide entrepreneurial opportunities similar to the GPT Store. Although deploying an Agent through a framework is still quite complex for ordinary users today, I believe frameworks that simplify Agent construction and offer combinations of more complex functions will prevail, forming a Web3 creative economy more interesting than the GPT Store.
Currently, the GPT Store still leans toward traditional practical applications, most popular Apps are created by traditional Web2 companies, and income is monopolized by creators. According to OpenAI's official explanation, this strategy only provides funding support, in the form of a certain amount of subsidies, to certain outstanding developers in the U.S. region.
From a demand perspective, Web3 still has many gaps that need to be filled, and in terms of the economic system, it can also make the unfair policies of the Web2 giants more equitable. In addition, community economics can be introduced naturally to make Agents more complete. The creative economy of Agents will be an opportunity for ordinary people to participate, and future AI Memes will be far more intelligent and interesting than the Agents released on GOAT and Clanker.
References:
1. Evolution of AI framework history and trend exploration
2. Bybit: AI Rig Complex (ARC): AI Agent Framework
3. Deep Value Memetics: A horizontal comparison of the four major Crypto×AI frameworks: using conditions, strengths and weaknesses, growth potential
4. Eliza Official Documentation
5. Virtual Official Documentation