1. What is Vana?
Vana is a distributed network for private user data, built to make user-owned AI possible. Users own and govern the data they contribute and the AI models trained on it, and they share in the profits. Developers gain access to cross-platform data for building personalized applications and training cutting-edge AI models.
Vana originated as a research project at MIT in 2018, dedicated to letting users own their data and the AI models they create. It is fully open source and runs as a permissionless, decentralized network. The Open Data Foundation works to promote large-scale adoption of the Vana protocol, while other contributing organizations such as Corsali focus on research and development.
In the Web2 era, platform companies built their businesses on user data they collected without compensation. For example, ByteDance's Toutiao earned hundreds of billions of RMB in advertising revenue in a single year. Although content creators received a share of platform revenue, the core benefits remained firmly in the company's hands.
With the arrival of the Web3 era, this pattern has only intensified. In 2024, Reddit disclosed data licensing agreements with AI companies worth $203 million, yet the users who contributed that content received almost nothing. This imbalance is what led to the creation of Vana.
2. Technical Architecture
Just as Bitcoin achieved trustless value transfer and Ethereum achieved programmable state transitions, Vana achieves programmable data ownership, with the core principle being personal data sovereignty.
2.1 The Double Spend Problem of Data
The core challenge in turning data into an asset is that, unlike other digital assets, the economic value of data depends on controlled access: once data becomes public, it can be copied freely and its market value is lost. Traditional blockchains emphasize public verification, which makes them unsuitable for handling private data. Vana addresses this by combining private data hosting with public ownership records.
The blockchain maintains the following global state:
• Data ownership records: Cryptographic proof of data holding
• Access rights: Who can access which data, and under what conditions
• Validation proof: Evidence of data quality and authenticity
• On-chain data collective contracts and token balances: Economic rights and governance
The data itself stays off-chain, stored on encrypted personal servers or in trusted secure areas. The blockchain provides programmable control over the conditions under which data can be accessed and over how the resulting benefits flow back to the data creators.
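The split between private storage and public records can be sketched in a few lines of Python. This is only an illustration, not Vana's actual on-chain schema: the class names, field names, and the use of a SHA-256 digest as the "proof of data holding" are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OwnershipRecord:
    owner: str            # hypothetical account identifier
    data_commitment: str  # digest of the encrypted payload, never the data itself


@dataclass(frozen=True)
class AccessGrant:
    grantee: str
    data_commitment: str
    condition: str  # free-form in this sketch, e.g. "train-only"


@dataclass
class GlobalState:
    """The four kinds of public state listed above (illustrative layout)."""
    ownership: dict[str, OwnershipRecord] = field(default_factory=dict)
    access: list[AccessGrant] = field(default_factory=list)
    validation_proofs: dict[str, str] = field(default_factory=dict)
    token_balances: dict[str, int] = field(default_factory=dict)


def commit(encrypted_payload: bytes) -> str:
    # Assumption: a SHA-256 digest stands in for the ownership proof.
    return hashlib.sha256(encrypted_payload).hexdigest()


def register(state: GlobalState, owner: str, encrypted_payload: bytes) -> str:
    """Record ownership publicly while the payload stays on the personal server."""
    digest = commit(encrypted_payload)
    state.ownership[digest] = OwnershipRecord(owner, digest)
    return digest


state = GlobalState()
digest = register(state, "alice", b"<ciphertext kept on alice's personal server>")
state.access.append(AccessGrant("dlp-1", digest, "train-only"))
```

Only the digest and the grant are public here. Anyone can verify that a record exists and who may use it, but the payload itself never enters the shared state.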
2.2 Core Components
Personal servers provide the secure foundation for data sovereignty. Users can run them on local devices, with trusted service providers, or through lightweight clients.
Data Liquidity Pools (DLPs) serve as a coordination layer for collective data assets, managing data verification rules, access rights, and token distribution.
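To make the three responsibilities of a DLP concrete, here is a toy model in Python. It is a sketch under stated assumptions rather than the real DLP contract interface: the class `DataLiquidityPool`, its methods, the integer scoring rule, and the proportional payout formula are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DataLiquidityPool:
    name: str
    validate: Callable[[dict], int]  # verification rule: score > 0 means accepted
    allowed_consumers: set[str] = field(default_factory=set)
    scores: dict[str, int] = field(default_factory=dict)

    def contribute(self, contributor: str, metadata: dict) -> bool:
        """Apply the pool's verification rule and record the contributor's score."""
        score = self.validate(metadata)
        if score <= 0:
            return False
        self.scores[contributor] = self.scores.get(contributor, 0) + score
        return True

    def can_access(self, consumer: str) -> bool:
        """Access rights: only consumers approved by the pool may use the data."""
        return consumer in self.allowed_consumers

    def distribute(self, reward: int) -> dict[str, int]:
        """Token distribution: split a reward in proportion to recorded scores."""
        total = sum(self.scores.values())
        if total == 0:
            return {}
        return {who: reward * s // total for who, s in self.scores.items()}


def min_records_rule(metadata: dict) -> int:
    # Hypothetical rule: one point per record, capped at 100 points.
    return min(metadata.get("records", 0), 100)


pool = DataLiquidityPool("example-dlp", validate=min_records_rule,
                         allowed_consumers={"lab-a"})
pool.contribute("alice", {"records": 100})
pool.contribute("bob", {"records": 50})
accepted = pool.contribute("carol", {"records": 0})  # rejected by the rule
payouts = pool.distribute(300)
```

In this example alice holds twice bob's score, so she receives twice his share of the 300-token reward, and carol's empty submission earns nothing.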
Trusted secure areas are trusted execution environments (TEEs) for private computation. They carry out complex operations on the data without exposing it.
2.3 State Transitions in the Data Economy
Vana resolves the 'double spend' problem of data by combining privacy protection with programmable access rights. Each data transaction is treated as a state transition: it updates the global state and settles both the data access and its economic consequences in a single step.
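The idea of a data transaction as a state transition can be illustrated with a small pure function. This is a simplified sketch, not the protocol's actual transition logic: the state layout, the `AccessRequest` type, and the flat per-access fee are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    consumer: str
    data_id: str
    fee: int


def apply_access(state: dict, tx: AccessRequest) -> dict:
    """Return the next global state, or raise if the transaction is invalid."""
    # Programmable access rights: the consumer must hold a grant for this data.
    if (tx.consumer, tx.data_id) not in state["grants"]:
        raise PermissionError("consumer has no grant for this data")
    balances = state["balances"]
    if balances.get(tx.consumer, 0) < tx.fee:
        raise ValueError("insufficient balance")

    # Economic side: the fee moves from the consumer to the data owner.
    owner = state["owners"][tx.data_id]
    new_balances = dict(balances)
    new_balances[tx.consumer] = new_balances.get(tx.consumer, 0) - tx.fee
    new_balances[owner] = new_balances.get(owner, 0) + tx.fee

    # Data side: the access is logged; the data itself never enters the state.
    return {
        "owners": state["owners"],
        "grants": state["grants"],
        "balances": new_balances,
        "access_log": state["access_log"] + [(tx.consumer, tx.data_id)],
    }


genesis = {
    "owners": {"d1": "alice"},
    "grants": {("lab", "d1")},
    "balances": {"lab": 10},
    "access_log": [],
}
next_state = apply_access(genesis, AccessRequest("lab", "d1", 3))
```

Because the function returns a new state instead of mutating the old one, a request that fails the access check leaves the previous state untouched, so the data access and the payment either both happen or neither does.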
These technological foundations support the subsequent creation, governance, and large-scale monetization of collective data assets.
3. Vana's Vision
Vana's goal is to overturn the existing data economy: users regain ownership and control of their own data and profit directly when that data is used to train AI models.