Rachel, Golden Finance

On November 27, Zhao Changpeng (CZ) stated on X that tasks such as AI data labeling are well suited to blockchain: they can draw on low-cost labor worldwide and pay workers instantly in cryptocurrency, removing geographic limitations.

Data labeling refers to the manual or automated annotation of raw data (such as text, images, or audio) to give it specific structured information. Labeled data is used to train machine learning or artificial intelligence models. For instance, tagging a piece of text with a sentiment category (positive, negative, or neutral) is a form of data labeling. Using blockchain for AI data labeling is particularly suitable for scenarios that require high transparency, credibility, and distributed collaboration. It can not only enhance the efficiency and quality of data labeling but also create new possibilities for global collaboration and data trading.

Which high-quality projects are currently active in this track, and what are its development prospects?

The role of blockchain in AI data labeling

Blockchain is a decentralized distributed ledger technology characterized by transparency, immutability, and traceability. These characteristics can address the following issues in traditional data labeling methods:

  • Data authenticity and tamper-proofing: Each labeling record is written to the blockchain and cannot be arbitrarily changed, ensuring the credibility of the labeling.

  • Task allocation transparency: Blockchain can record the distribution, execution, and audit processes of tasks, preventing unfair task allocation or result tampering.

  • Incentive mechanism: Using blockchain's smart contract technology, data labelers can automatically receive cryptocurrencies or other rewards for completing tasks.

  • Data traceability: The source of each label, as well as information about labelers and auditors, can be tracked.
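The tamper-evidence and traceability properties listed above can be illustrated with a small hash-chained log. This is only a minimal sketch in plain Python, not the implementation of any specific blockchain or project: the record fields (`item`, `label`, `labeler`, `auditor`) and the function names are hypothetical, and a real chain would add consensus, signatures, and distributed storage.

```python
import copy
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def record_hash(record, prev_hash):
    """Hash a labeling record together with the hash of the previous entry."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(chain, record):
    """Append a labeling record, linking it to the entry before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({
        "record": record,
        "prev_hash": prev_hash,
        "hash": record_hash(record, prev_hash),
    })


def verify_chain(chain):
    """Return True only if no record or link has been altered."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != record_hash(entry["record"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical labeling records: who labeled what, and who audited it.
chain = []
append_record(chain, {"item": "img-001", "label": "cat", "labeler": "alice", "auditor": "carol"})
append_record(chain, {"item": "img-002", "label": "dog", "labeler": "bob", "auditor": "carol"})
print(verify_chain(chain))  # True

# Changing an already recorded label breaks verification.
tampered = copy.deepcopy(chain)
tampered[0]["record"]["label"] = "dog"
print(verify_chain(tampered))  # False
```

Because each entry's hash covers both its own record and the previous entry's hash, editing any past label invalidates that entry and every check after it, which is what makes the history traceable.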

Application scenarios

  • Distributed labeling: Using blockchain to assign data labeling tasks to labelers worldwide, enhancing data processing efficiency.

  • Quality audit: The results of multiple labelers are compared and audited using blockchain technology to ensure labeling accuracy.

  • Labeled data trading: Well-labeled data can be traded on the blockchain, so that neither buyers nor sellers need to worry about the integrity or authenticity of the data.

  • Privacy protection: Using blockchain for encrypted storage of labeled data to ensure the security of private data.
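The quality audit scenario above, in which the results of several labelers are compared, can be sketched as a simple majority vote. This is an illustrative assumption and not the Proof of Honesty algorithm or any project's actual method; the function name `audit_labels` and the 60% agreement threshold are hypothetical.

```python
from collections import Counter


def audit_labels(votes, min_agreement=0.6):
    """Compare labels from several labelers for one data item.

    votes: dict mapping labeler name -> submitted label.
    Returns (consensus_label, agreement_ratio, labelers_to_review).
    consensus_label is None when agreement is below min_agreement,
    in which case every labeler's submission is flagged for review.
    """
    if not votes:
        return None, 0.0, []
    counts = Counter(votes.values())
    top_label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(votes)
    if agreement < min_agreement:
        return None, agreement, sorted(votes)
    dissenters = sorted(name for name, label in votes.items() if label != top_label)
    return top_label, agreement, dissenters


# Hypothetical sentiment labels from three labelers for the same text.
label, agreement, to_review = audit_labels(
    {"alice": "positive", "bob": "positive", "carol": "negative"}
)
print(label, to_review)  # positive ['carol']
```

In an on-chain setting, the per-labeler submissions and the audit outcome would be the records written to the ledger, so a later dispute could be checked against the original votes.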

Related projects

  • OORT DataHub: Provides blockchain-based decentralized data labeling services, using the Proof of Honesty algorithm for quality control. Its platform distributes tasks, audits data quality, and pays rewards through smart contracts, attracting global labelers to join and ensuring the transparency of labeled data and privacy protection.

The economic model of the project token is as follows:

Community rewards: Users who participate in data labeling and analysis can receive $OORT token rewards. They may also receive unique NFTs linked to their contributions, which provide extra benefits such as a boosted annual percentage yield (APY), device discounts, and DAO voting rights.

Task collateral: Participants must stake at least 210 $OORT tokens as collateral to demonstrate their commitment to the task. Once the task is completed, the staked tokens are returned and rewards are distributed.

Sales revenue sharing: Some NFT holders can also receive dividends from future data sales revenue, further enhancing long-term returns.
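The task collateral rule described above (stake at least 210 $OORT, then get the stake back together with rewards on completion) can be sketched as a small escrow. This is a plain-Python illustration under stated assumptions, not OORT's actual smart contract: the class `TaskEscrow`, its method names, and the reward amount used in the example are all hypothetical, and the article does not say what happens to the stake when a task is not completed, so that case is left out.

```python
MIN_STAKE = 210  # minimum collateral in $OORT, as stated in the article


class TaskEscrow:
    """Holds participants' collateral until their task is completed."""

    def __init__(self):
        self.stakes = {}  # participant -> locked token amount

    def join(self, participant, amount):
        """Lock collateral for a participant; reject stakes below the minimum."""
        if amount < MIN_STAKE:
            raise ValueError(f"stake must be at least {MIN_STAKE} $OORT")
        if participant in self.stakes:
            raise ValueError(f"{participant} has already staked")
        self.stakes[participant] = amount

    def complete(self, participant, reward):
        """Release the stake plus the reward once the task is done."""
        stake = self.stakes.pop(participant)
        return stake + reward


escrow = TaskEscrow()
escrow.join("alice", 210)
payout = escrow.complete("alice", reward=15)  # reward amount is a made-up example
print(payout)  # 225
```

On a real chain the same logic would live in a smart contract, so the release of collateral and rewards would not depend on a trusted intermediary.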

  • PublicAI: An AI ecosystem project on the Solana chain that aims to connect data demanders with global labelers. It rewards participants through a cryptocurrency incentive mechanism and uses blockchain technology to record the details of the labeling process, ensuring data security and privacy.

The economic model of the project token is as follows:

Community rewards: 10% of Public tokens will be used for airdrop rewards for user interactions in the early stages. There are three ways to obtain airdrops: become an AI Builder (collect high-quality internet content), become an AI Validator (validate collected content), or become an AI Developer (use verified datasets to train AI agents).

Token distribution: The specific details of PublicAI token distribution have not yet been clarified. On the financing side, the project completed a $2 million seed round in January 2024, with investors including IOBC Capital, Foresight Ventures, Solana Foundation, Everstate Capital, and several renowned professors and academicians in the field of artificial intelligence.

Challenges faced

Currently, several factors are constraining the development of this track: first, AI data labeling requires substantial computational and storage resources; second, project performance is limited by blockchain scalability; third, technical standardization and regulation are still inadequate.

Among them, the second point may be the biggest challenge at present, because AI data labeling and model training typically require significant computational resources, while the computing power of nodes in blockchain networks is limited. How to effectively integrate and use distributed computing resources to meet the computational needs of AI data labeling projects, while preserving the decentralized nature of blockchain, is an urgent issue to be resolved. It is reported that Binance's Greenfield is providing storage support for this track, and we hope to see more storage and computational resources put into practice in this field.