Author: Teng Yan, Chain of Thought; Translation: Jinse Finance Xiaozou
1. Data Gold Rush
During the California gold rush of the mid-19th century, thousands pursued the hope of untold wealth in the new frontier.
Suddenly, the poor found themselves wealthy, and stories of self-made success became commonplace, driving the emergence of more industries and cities. Infrastructure developed at an astonishing rate, reshaping the face of America.
The similarities between Crypto AI and the gold rush are hard to ignore.
Today, most Crypto AI products are still in development or running on testnets, indicating that we are in the infrastructure-building phase.
Investors and builders are preparing for a potential surge in growth. The tools, networks, and protocols being created now could lay the foundation for a massive decentralized AI ecosystem.
We are witnessing the early stage of a digital gold rush - one that may be just as transformative as its 19th-century counterpart.
So you can imagine how surprised I was when I stumbled upon a Crypto AI project claiming over 700,000 daily active users. Not monthly active users, but daily active users. In such an emerging field, such user metrics are practically unheard of. So I had to dig deep to figure out what was happening behind the scenes.
What is this project? DIN, short for 'Data Intelligence Network.'
2. Crypto Data Networks
I have been closely following data networks in the Crypto AI space, and it is clear that they are addressing a critical pain point in the AI field: access to valuable datasets.
Today, many of the most valuable data sources are tightly controlled by centralized entities that charge high access fees.
For example:
Reddit signed a $60 million annual licensing agreement with OpenAI to provide access to its user-generated content.
X (formerly Twitter) no longer offers free API access to developers, with the cost of accessing Twitter data now ranging from $100 to $42,000 per month (not a joke).
The message conveyed here is clear: businesses recognize that data is the new battlefield, and they are locking in control to maximize profits.
Crypto provides a potential solution - a way to break free from centralized control over valuable datasets.
Crypto data networks take a completely different approach, aiming to build high-quality decentralized datasets without the bottlenecks imposed by traditional models. By using tokens, these networks can incentivize large-scale data labeling, encourage individuals to contribute to massive data collection efforts, and even coordinate web scraping for training data.
Blockchain provides transparency and creates a framework for tracking data ownership and provenance. This ensures that contributors are fairly compensated whenever their data is used, establishing a new paradigm where the value of data is shared rather than monopolized.
3. DIN Vision
DIN is a team tackling these data problems head-on.
At the core of DIN is a data layer that collects and validates on-chain and off-chain data, using blockchain as a settlement layer.
What is the main idea? It is to return ownership of data to users, rewarding them for their contributions to the system.
How DIN works:
This chart might look complex at first glance, so let's break it down.
The DIN network has three main participants:
Data Collectors
Data Validators
Computing Nodes
To better understand how data collectors and validators work, let's dive into xData, which is DIN's current flagship product.
(1) xData: Data Collection
xData is DIN's flagship platform, primarily used to collect, organize, and store data from social media platforms like X without relying on APIs. It runs on a decentralized network, ensuring user ownership and privacy. It was launched in April 2024 on opBNB (an L2 on the BNB chain).
xData gamifies data collection for users, making it fun and profitable. Let's quickly take a look at how it works:
Users install a browser plugin, log in with their wallet, and link their X accounts.
Users can tag interesting tweets by replying to tweets and tagging accounts.
Users can earn 'wafers' points by tagging tweets, which can be converted to tokens at the TGE.
There are several gamification mechanisms here. Each user has a limited number of tweets they can tag (store), but they can increase their storage space by consuming wafers points. Users must also consume wafers every 24 hours to maintain their account's 'unlocked' status to earn more wafers.
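The mechanics above can be sketched as a small state model. This is a minimal illustration, not DIN's implementation: the class name, the starting capacity, and every cost and reward constant are hypothetical. Only the 24-hour renewal window, the storage limit, and the idea of spending wafers come from the description above.

```python
from dataclasses import dataclass, field

UNLOCK_WINDOW_HOURS = 24   # from the article: accounts must be renewed every 24 hours
UNLOCK_COST = 1            # hypothetical wafers price of one renewal
SLOT_COST = 5              # hypothetical wafers price of one extra storage slot
REWARD_PER_TAG = 2         # hypothetical wafers earned per tagged tweet


@dataclass
class XDataAccount:
    wafers: int = 0
    storage_slots: int = 10                      # hypothetical starting capacity
    tagged: list = field(default_factory=list)   # IDs of stored (tagged) tweets
    unlocked_until: float = 0.0                  # hour at which the unlock expires

    def renew(self, now_hours: float) -> bool:
        """Spend wafers to keep the account unlocked for another 24 hours."""
        if self.wafers < UNLOCK_COST:
            return False
        self.wafers -= UNLOCK_COST
        self.unlocked_until = now_hours + UNLOCK_WINDOW_HOURS
        return True

    def buy_slot(self) -> bool:
        """Spend wafers to store one more tagged tweet."""
        if self.wafers < SLOT_COST:
            return False
        self.wafers -= SLOT_COST
        self.storage_slots += 1
        return True

    def tag(self, tweet_id: str, now_hours: float) -> bool:
        """Tag a tweet; fails if the account is locked or storage is full."""
        if now_hours >= self.unlocked_until:
            return False
        if len(self.tagged) >= self.storage_slots:
            return False
        self.tagged.append(tweet_id)
        self.wafers += REWARD_PER_TAG
        return True
```

The point of the sketch is the loop it creates: earning requires an unlocked account and free storage, and both are paid for with the points being earned.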
DIN publishes tasks around specific keywords or tags, and community members search tweets in real-time and tag them based on specific tags.
The permissionless nature of xData means that any user worldwide can participate in data collection and annotation to earn rewards/income, regardless of nationality. Currently, data collection is done off-chain, with tagged tweets stored on BNB Greenfield, a decentralized data layer on the BNB chain.
(2) Chipper Nodes: Data Validation
The next question naturally is: how to ensure the quality and completeness of the data submitted by users? After all, someone could run a bot to maximize their benefits by randomly tagging tweets that don't match the specified tags.
Data labeling is not always straightforward. Tweets often contain nicknames, slang, and cultural factors - for example, Bitcoin is often referred to as '大饼' in Chinese tweets.
This is where data validation comes into play.
Chipper nodes are DIN's AI-driven data validation and processing nodes responsible for validating and vectorizing data, while also allowing users to earn tokens (xDIN and DIN).
Behind the scenes, each user's operating node is actually running a small AI model locally to verify whether the content of tweets matches the attached tags, and then stores it in the decentralized data layer. Users can operate these nodes on standard PCs without needing expensive hardware setups.
As the volume of verified data grows, the AI models used by validators continuously improve, making the network smarter and more accurate over time.
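To make the validation task concrete, here is a minimal sketch of the check a node has to perform: does the tweet actually match the tag it was submitted under? DIN's Chipper nodes run a small AI model for this. The sketch swaps in a plain alias lookup, so the alias table and both function names are hypothetical stand-ins rather than DIN's method.

```python
import re

# Hypothetical alias table: maps a task tag to surface forms that count as a match.
TAG_ALIASES = {
    "bitcoin": {"bitcoin", "btc", "大饼"},
    "ethereum": {"ethereum", "eth"},
}


def matches_tag(tweet_text: str, tag: str) -> bool:
    """Return True if the tweet mentions the tag or one of its known aliases."""
    aliases = TAG_ALIASES.get(tag.lower(), {tag.lower()})
    text = tweet_text.lower()
    for alias in aliases:
        if alias.isascii():
            # Whole-word match so that "eth" does not fire on "ethical".
            if re.search(rf"\b{re.escape(alias)}\b", text):
                return True
        elif alias in text:
            # Non-ASCII aliases (e.g. Chinese slang) use a substring match.
            return True
    return False


def validate(submissions):
    """Split (tweet_text, tag) pairs into accepted and rejected lists."""
    accepted, rejected = [], []
    for text, tag in submissions:
        (accepted if matches_tag(text, tag) else rejected).append((text, tag))
    return accepted, rejected
```

A lookup table like this breaks down quickly on new slang and context-dependent meaning, which is the argument for running a learned model in its place.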
Currently, DIN handles all data validation internally, but the goal is to decentralize the validation process. Active testing of nodes is underway. Users can run node software on their local devices to test the network, and DIN is preparing to launch its mainnet and tokens in the coming weeks, with a bug bounty program already in place.
(3) Computing Nodes
Although the computing nodes have not yet been put into use, they are part of DIN's roadmap for storing data securely and privately. Here's how computing nodes work:
Vector Conversion: Computing nodes convert validated data into vectors.
Privacy Processing: Vectors are processed through ZK (Zero Knowledge) processors to ensure privacy.
Final Data Certainty: Finalized datasets and vectors are stored in IPFS for third-party access.
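The first and last of these steps can be illustrated with a rough sketch. It uses feature hashing for vectorization and a SHA-256 digest as a stand-in for a content address. Both are assumptions chosen for illustration, since DIN has not published its vectorization method or storage format. The ZK step is deliberately left out because it cannot be meaningfully reduced to a few lines. All names and the vector size are hypothetical.

```python
import hashlib
import json

DIM = 16  # hypothetical vector size


def vectorize(text: str, dim: int = DIM) -> list:
    """Feature hashing: count tokens into a fixed number of buckets."""
    vec = [0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1
    return vec


def content_id(record: dict) -> str:
    """Deterministic SHA-256 digest of a record, standing in for a content address."""
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def process(text: str, tag: str) -> dict:
    """Vectorize a validated tweet and attach an identifier derived from its content."""
    record = {"tag": tag, "vector": vectorize(text)}
    return {"id": content_id(record), **record}
```

Because the identifier is derived from the content itself, identical records always resolve to the same ID, which is the property that lets third parties verify what they fetched.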
A new L2 on the BNB chain?
The official announcement has not yet been released, but in our research, we found a DIN token on the BNB chain testnet. This hints at future developments in the blockchain - possibly a sidechain or L2 solution on the BNB chain.
Introduction to DIN:
DIN feels like a new player, but the project's origins date back to the end of 2021. It was originally launched under the name 'Web3Go' as an on-chain data analytics platform in the Polkadot ecosystem, funded by the Web3 Foundation, and collaborated with clients like Moonbeam and Oak Network.
In 2022, the team expanded its business scope to the BNB chain ecosystem, joining Binance Labs' MVB incubator and securing the funding needed to develop a 'multi-chain open-source data analytics platform.'
By July 2023, they saw the signs: generative AI was booming, and the demand for robust data infrastructure was more urgent than ever. The team then turned to building a comprehensive 'AI Data Intelligence Layer' to align their mission with the data needs of AI innovation. This evolution culminated in May 2024, when Web3Go officially rebranded as DIN, marking a bold focus on data and signaling that the data layer will be key to the next wave of AI advancements.
4. DIN's Traction - Good momentum so far
The daily user count on opBNB is about 700,000.
The daily transaction volume of DIN on opBNB is about 1.2 million.
According to DappBay, DIN performed steadily in October, with an average daily user count exceeding 700,000 and daily transaction volume exceeding 1.2 million. Most transactions are due to xData users needing to perform an on-chain transaction every 24 hours to activate their xData app and earn points.
DIN has consistently ranked among the top ten dApps on the BNB chain and often ranks as the number one application by user count on the network. While I haven't tracked the BNB chain ecosystem as closely as I have Solana and Base, this is no small achievement, especially considering how long the BNB chain has been live and the strong support it receives from Binance.
To better understand, I analyzed some of the other top-ranking applications on the BNB chain to see what has shaped user stickiness:
Vooi (DeFi) is a perp DEX aggregator.
Particle Network (Infrastructure) is a full-chain protocol currently on testnet.
Revox (Infrastructure) is a modular on-chain network with a popular content app - ReadON.
SERAPH (Game) is a Souls-like RPG.
MyShell is a no-code AI app store ecosystem.
According to the team, DIN has collected and labeled over 100 million tweets so far, with a user base of over 30 million on opBNB and Mantle.
It is worth noting that DIN can leverage its large user base to quickly generate real-time datasets of relevant tweets. This process is entirely independent of the X API.
While xData currently focuses on Twitter, the team plans to expand the data collection and labeling platform to other data sources like Reddit, Facebook, Instagram, and any user data platform with high-value information. To me, this is where the real gold lies.
Reiki:
Reiki is another product from DIN, closely linked to the ongoing AI agents meta. In fact, given the potential consumer interest in AI agents we've seen in recent weeks on Truth Terminal and GOAT, DIN may already be ahead of the curve.
In January 2024, DIN launched the Reiki platform, allowing users to create AI agents (mainly chatbots) without any coding experience. Users can also integrate their own knowledge bases to build engaging, personalized chatbots, reminiscent of MyShell.
The platform quickly gained attention upon release, becoming the number one product on Product Hunt.
Reiki also provides creators with various ways to monetize their bots, participate in reward programs, and even turn their bots into NFTs - adding an interesting layer of ownership to the experience. Notably, the Discord knowledge support bot for the BNB Chain is powered by Reiki.
While the platform is currently mostly deprecated, the DIN team does not rule out the possibility of bringing it back after they release their tokens. If reactivated, Reiki could provide additional utility for the tokens and offer AI agent creators a way to leverage the data collected by xData.
5. Token Design: xDIN, DIN, and Node Sales
From August to September 2024, DIN held a Chipper node sale and raised $2.5 million. These Chipper nodes will allow users to run validation software on their local devices, using models to ensure data is accurately labeled. The sale was very successful, with all 25,112 secondary nodes (each priced at $99) sold out.
Supply Side:
Before the TGE, xData users can exchange their wafers points for xDIN - the pre-airdrop token. However, there will be a 5-30% exchange fee, which will be allocated to Chipper node owners. This exchange mechanism has not yet gone live, but it is expected to be activated immediately after the node 'pre-mining' launches later this month.
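The exchange described above can be sketched as a simple fee split. The 5-30% range comes from the article. The conversion rate between wafers and xDIN, and how a given user's fee rate within that range is determined, have not been published, so both are hypothetical parameters here.

```python
from decimal import Decimal

MIN_FEE = Decimal("0.05")  # lower bound of the exchange fee, from the article
MAX_FEE = Decimal("0.30")  # upper bound of the exchange fee, from the article


def convert_wafers(wafers: Decimal, fee_rate: Decimal, rate: Decimal = Decimal("1")):
    """Return (xdin_to_user, xdin_to_node_owners).

    `rate` is a hypothetical number of xDIN per wafers point.
    """
    if not MIN_FEE <= fee_rate <= MAX_FEE:
        raise ValueError("fee rate must be between 5% and 30%")
    gross = wafers * rate
    fee = gross * fee_rate
    return gross - fee, fee
```

For example, at a one-to-one rate, exchanging 1,000 wafers at the lowest fee would leave the user with 950 xDIN and route 50 xDIN to Chipper node owners; at the highest fee the split would be 700 and 300.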
During the TGE, users will receive DIN (tradeable tokens) airdrops based on their proportion of held xDIN, fully unlocked, with no complex lock-up mechanisms.
After the TGE, 25% of the total supply of DIN tokens will be reserved for Chipper node rewards. Half of this allocation will be released in the first year, and the amount released will halve each year thereafter.
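A short calculation shows what this release curve looks like. The 25% share and the yearly halving come from the article. The total supply used in the example is a placeholder, since DIN's tokenomics had not been published at the time of writing.

```python
from fractions import Fraction

NODE_SHARE = Fraction(1, 4)  # from the article: 25% of total supply goes to node rewards


def node_reward_schedule(total_supply: int, years: int) -> list:
    """Tokens released to node rewards in each year under a yearly halving."""
    quota = total_supply * NODE_SHARE
    # Year 1 releases 1/2 of the quota, year 2 releases 1/4, year 3 releases 1/8, ...
    return [quota * Fraction(1, 2 ** year) for year in range(1, years + 1)]


# Hypothetical total supply of 100 million tokens, used only for illustration.
schedule = node_reward_schedule(100_000_000, 4)
released = sum(schedule) / (100_000_000 * NODE_SHARE)
```

Under this reading, the cumulative share of the node allocation released after n years is 1 - 1/2^n: 50% after one year, 87.5% after three, and 93.75% after four.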
It is worth noting that compared to other projects conducting node sales, this sale has a relatively fast unlocking speed, as other projects distribute node rewards gradually over 3-4 years.
Demand Side:
Validator nodes may need to stake DIN tokens to participate in the network. In return, they will earn rewards for validating data, but they will face penalties if their output is inaccurate.
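A stake-and-penalty design of this kind can be sketched as follows. DIN has not confirmed the mechanism, so everything here is an assumption: the reward size, the penalty size, and above all how a "correct" answer is established (for instance by agreement among several validators) are hypothetical.

```python
from dataclasses import dataclass

REWARD_PER_CORRECT = 1   # hypothetical reward, in tokens, per correct validation
SLASH_BPS = 1_000        # hypothetical penalty: 10% of stake, in basis points


@dataclass
class Validator:
    stake: int        # staked tokens
    rewards: int = 0  # rewards accrued so far

    def settle(self, verdict: bool, reference: bool) -> None:
        """Reward a verdict that matches the reference answer, slash one that does not.

        `reference` is assumed to come from some external source of truth,
        such as the majority verdict of other validators.
        """
        if verdict == reference:
            self.rewards += REWARD_PER_CORRECT
        else:
            self.stake -= self.stake * SLASH_BPS // 10_000
```

The design choice that matters is the asymmetry: a wrong answer costs a fraction of the stake, so careless or bot-driven validation becomes expensive faster than it becomes profitable.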
On the other end, data users must use DIN tokens to access network data. Since most Web2 businesses are still hesitant to use cryptocurrencies, the company will need to facilitate these transactions to bridge traditional businesses and decentralized networks.
We are still waiting for the detailed DIN tokenomics to be released, which should come closer to the TGE.
Team and Funding:
DIN's core team is composed of talent from Columbia University, University College London, and the University of Stuttgart, who have up to ten years of expertise in AI and blockchain.
DIN's founder, Hao Ding, holds a master's degree in information technology from the University of Stuttgart. Before delving into cryptocurrencies, he served as the R&D director at the Suzhou Artificial Intelligence Research Institute in China, then became the vice president of the identity oracle network Litentry, and later founded Web3Go.
I was excited to meet Hao in person, and we had a great conversation about the future of AI. Ask me what his core belief is, and I would say: data will be at the core of everything. The DIN team currently has 16 members, most of whom are engineers.
DIN participated in Binance Labs' MVB 5 accelerator program and raised $4 million in seed funding in July 2023, led by Binance Labs, HashKey, NGC, and Shima Capital. In August 2024, DIN secured another $4 million in funding from Manta Network, Moonbeam Network, Ankr, and Maxx Capital, bringing its total funding to $8 million.
6. Our Thoughts
Idea 1: Creating a decentralized Scale AI is an interesting endeavor.
Data collection and labeling is a big business.
Scale AI is the best-known player in this field, with annual recurring revenue of about $1 billion. This is driven by significant demand from foundation model companies like OpenAI, Anthropic, and Cohere, which are Scale's main clients. As of May 2024, the company was valued at about $14 billion.
Let's take a closer look at Scale AI's business model.
Scale's data labeling relies on a vast distributed workforce that manually tags videos, categorizes photos, and transcribes audio.
The company employs about 240,000 workers across multiple countries and actively recruits in areas with high unemployment and low living costs. For instance, Kenya has become an important recruitment hub in Africa, with a 'boot camp' for in-person training in Nairobi, and targeted paid advertisements to attract workers.
The labeling process typically has two layers: the first layer is the annotators, who label the data from scratch; the second layer is the quality controllers, who check the work, add missing labels, and correct errors. This is a labor-intensive job, but it is very effective because labor costs are low, and clients are willing to pay handsomely.
Now, imagine scaling this model through a decentralized network. A token-incentivized, permissionless system would let anyone in the world participate as a worker, while a distributed validation network could ensure the accuracy and quality of the data. Decentralization could open up new possibilities for scaling data labeling, making it a truly global and democratized process.
Idea 2: Large user base = good thing
DIN's main advantage today lies in its large, sticky community, built through over two years of focused community-building efforts. With such a network, DIN can quickly mobilize data collection based on specific criteria. However, the challenge lies in identifying where the actual data demand is, guiding users to collect and label the right datasets, and establishing sustainable revenue streams to support long-term growth.
Idea 3: Incentives are a double-edged sword
Currently, most user stickiness is driven by expectations of token rewards once the token is issued. However, if the team cannot generate sufficient demand for the token after issuance, usage may decline as initial interest wanes. Creating this demand requires both speculative interest and a market of data consumers eager to purchase these datasets.
Idea 4: Data labeling is a competitive field.
DIN is not the only crypto team competing for this market - projects like Sapiens, Grass, and Masa are also in the race. But the pie is huge. For example, Grass currently has a market cap of $2.5 billion, highlighting the tremendous opportunity in the industry.
One way DIN may differentiate itself from competitors is by training and deploying proprietary AI models for data validation, reducing reliance on human labor. This automation-first approach can streamline operations, enhance scalability, and give DIN an edge over competitors that still heavily rely on manual operations.
7. Conclusion
Data networks are one of the most exciting battlegrounds at the intersection of AI and crypto. Unlike traditional centralized models, Crypto-driven data networks leverage decentralized participation and incentive mechanisms to build high-quality datasets at scale.
DIN positions itself as a pioneer in the field, and witnessing the project's development will be fascinating. This is an opportunity that DIN needs to seize. I often tell people: data networks are one of the smartest areas to build in right now.
Crypto is reshaping the way data is collected, validated, and monetized, laying the foundation for a new decentralized data economy.