Author: Paul Veradittakit, Partner at Pantera Capital

Compiled by: xiaozou, Golden Finance

Sahara AI's mission is to create a more open, fair, and collaborative AI economy, making it as easy as possible for people to participate. By leveraging blockchain, Sahara ensures that all contributors (data contributors, labelers, model developers, etc.) receive fair compensation, data and models retain sovereignty, AI assets are secure, and permissions can be created, shared, and traded.

1. Current State of the AI Stack

The current AI stack can be divided into the following layers:

Data Collection and Labeling

Data is collected from various sources (e.g., web scraping, public datasets, user-generated data) and must adhere to licensing requirements to avoid legal issues. Data is labeled according to the task at hand (e.g., classification, object recognition).

Model Training and Services

Data is fed into the model, which adjusts its internal parameters (weights) to minimize error. This requires expensive, time-consuming computation.
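As a rough illustration of what "adjusting weights to minimize error" means, here is a minimal sketch of gradient descent on a one-variable linear model. The function name `train`, the learning rate, and the toy data are assumptions made for this example. Real model training works on the same principle but with billions of parameters, which is where the cost comes from.

```python
# Minimal sketch: fit y = w * x + b by gradient descent on mean squared error.
# All names and values here are illustrative, not taken from any real system.
def train(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every sample under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Move each parameter a small step against its gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data generated by y = 2x + 1
w, b = train(xs, ys)
print(round(w, 3), round(b, 3))
```

Each pass over the data nudges the two weights toward values that reduce the error; on this toy data they settle near the slope and intercept used to generate it.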

Creation and Deployment of AI Entities

Creating AI entities typically involves using tools like TensorFlow, requiring technical expertise.

Computational Resources

Model training requires expensive processing.

Every layer is competitive and diverse, yet for each one a single approach has largely proven most effective. For example, data collection works best when large public datasets (such as books) are combined with fine-tuning on specialized data (such as research papers). Model training is best performed on specialized hardware. AI entities should be easy to build from plug-and-play resources so that a developer community can form, and computational resources should be distributed so that resource providers are rewarded accurately. Combined, these elements lead to better AI models and a stronger community.

Web2 companies are working in this direction, but they face severe limitations due to their centralized designs. From both the business and technology perspectives, these companies aim to restrict access and isolate the different parts of the stack, which leads to divergent security standards, database designs, backend integrations, and monetization strategies. In practice, such designs work poorly and cannot adapt as the economic model of AI shifts.

For example, OpenAI has built a very powerful foundation model and has begun attracting community builders through its permissionless GPT wrapper market, but it only allows superficial prompt customization and does not let builders modify the underlying model. All of the company's computational resources were purchased with investor money, and it is expected to lose $5 billion by the end of this year.

2. AI Collaborative Economy

The Sahara platform provides a one-stop service for all AI development needs throughout the entire AI lifecycle: from data collection and labeling, to model training and services, creation and deployment of AI entities, multi-agent communication, AI asset trading, and crowdsourcing of AI resources. By democratizing the AI development process and lowering the entry barriers of existing systems, Sahara AI provides equal access for individuals, businesses, and communities to collaboratively build the future of AI.

[Figure: user journey within the Sahara AI ecosystem]

The above diagram summarizes the user journey, illustrating how AI assets progress from creation to usage and how this builds user stickiness within the Sahara AI ecosystem. Notably, all transactions on the platform are immutable and traceable, ownership is protected, and asset provenance is recorded. This supports a transparent and fair revenue-sharing model, ensuring that developers and data providers are appropriately compensated for the revenue they generate.
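To make the idea of revenue sharing concrete, here is a minimal sketch of a pro-rata split. It is not Sahara's actual mechanism, which the article does not specify. The function `split_revenue`, the contributor names, and the integer contribution weights are all hypothetical. The sketch works in whole token units and uses largest-remainder rounding so that the payouts always add up to the amount received.

```python
# Hypothetical illustration only: the article does not describe how Sahara
# computes payouts. Weights are assumed to be non-negative integers.
def split_revenue(total_units, weights):
    """Split an integer amount pro rata into integer shares."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one contributor must have a positive weight")
    # Floor of each contributor's exact pro-rata share.
    shares = {name: total_units * w // total_weight for name, w in weights.items()}
    # Units lost to rounding down go to the largest fractional remainders first.
    leftover = total_units - sum(shares.values())
    by_remainder = sorted(
        weights,
        key=lambda name: (total_units * weights[name]) % total_weight,
        reverse=True,
    )
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares

payout = split_revenue(1000, {"data_provider": 3, "labeler": 1, "model_developer": 2})
print(payout)
```

Working in integer units avoids floating-point drift, which matters when every payout has to reconcile exactly against a recorded transaction.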

Sahara aims to make it easier for people to engage in the AI economy. Developers and users can use Sahara as follows:

Experienced AI Developers:

Developers can use the Sahara SDK and API to interact with any layer of the Sahara blockchain and its AI stack. For example, they can customize computing power, data storage, and incentive structures to build their own Sahara AI entities, which can then be licensed to others and monetized.

Novice AI Developers:

Through no-code/low-code environments, developers can create and deploy AI assets using intuitive interfaces and pre-built templates.

AI Trainers:

To participate in AI model training, users simply need to visit a website where they can complete AI training tasks and receive compensation in tradeable tokens, with tasks ranging from solving basic math problems to describing short videos.

AI Users:

Users can easily interact with AI entities through an intuitive UI. They can flexibly purchase access and further development permissions and even trade AI asset shares.

Users will be able to create their own personalized data 'knowledge base' and use their own data to create specialized AI. As with other AIs on the platform, others can be given access to it, while the training data remains completely private and secure.


Companies:

Companies can also create AI entities (or 'business agents') and train them with their proprietary data. Because the system operates on the Sahara blockchain, the costs are significantly lower due to decentralized AI entity generation and services.

Businesses can also pay for Sahara Data, which combines AI auto-labeling with human labeling to create high-quality, privacy-protected multimodal datasets.

Aside from the enterprise-facing products already used by some well-known clients, none of the other features have been released yet, though release plans are in place.

3. Technical Overview


The Sahara team has designed the system to be as simple and user-friendly as possible, abstracting away the complexity needed to ensure compatibility, profitability, and security across the various parts of the AI stack. Behind the scenes, the team has developed numerous innovations to achieve this goal. Here are a few examples:

  • The Sahara blockchain minimizes gas fees and is fully compatible with the EVM. The Sahara Cross-Chain Communication (SCC) protocol enables secure, permissionless data transfer across blockchains, facilitating trustless interoperability.

  • Sahara AI-Native Precompiles (SAPs) are pre-compiled smart contracts used to optimize the performance of AI tasks and reduce computational overhead, including training execution SAPs and inference execution SAPs.

  • Sahara Blockchain Protocols (SBPs) manage AI assets and keep them accountable. Examples include AI Attribution, which tracks contributions and distributes rewards; the AI Asset Registry, which manages the registration and provenance of AI assets; AI Licensing; and AI Ownership.

  • Data management is performed both on-chain and off-chain, with AI asset metadata, commitments, and proofs on-chain, while important datasets, AI models, and supplementary information are handled off-chain to optimize data retrieval, security, and availability.

  • Collaborative Execution Protocols support the joint development and deployment of AI models across training, aggregation, and serving. Other modules include PEFT (parameter-efficient fine-tuning) for fine-tuning models, Privacy Preserving Compute, which supports differential privacy, homomorphic encryption, and secret sharing, and a Fraud Proofs module that, as the name suggests, provides fraud-proof capabilities.
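The on-chain/off-chain split described in the data management bullet above can be sketched with a simple hash commitment: only a small digest is stored "on-chain", while the full asset lives elsewhere and can be checked against that digest later. This is a generic illustration, not Sahara's actual scheme. The dictionaries `registry` and `off_chain_store` and the functions `commit`, `verify`, and `register_asset` are hypothetical stand-ins, and SHA-256 is an assumed choice of hash.

```python
import hashlib

# Hypothetical stand-ins: a real system would use a smart contract and a
# storage network instead of two in-memory dictionaries.
registry = {}         # plays the role of on-chain state (metadata + commitment)
off_chain_store = {}  # plays the role of off-chain storage (the full asset)

def commit(data: bytes) -> str:
    """Return a SHA-256 digest that binds to the exact bytes of the asset."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, commitment: str) -> bool:
    """Check that off-chain data still matches the recorded commitment."""
    return commit(data) == commitment

def register_asset(asset_id: str, data: bytes, owner: str) -> None:
    off_chain_store[asset_id] = data
    registry[asset_id] = {"owner": owner, "commitment": commit(data)}

register_asset("dataset-001", b"label,text\n1,hello\n0,world\n", "alice")

recorded = registry["dataset-001"]["commitment"]
intact = verify(off_chain_store["dataset-001"], recorded)
tampered = verify(off_chain_store["dataset-001"] + b"1,extra\n", recorded)
print(intact, tampered)
```

A bare hash like this proves integrity but does not hide small or guessable inputs, so a production design would likely add salting or a dedicated commitment scheme.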

4. Fully Integrated AI Stack

The team is led by Sean Ren, a tenured professor at the University of Southern California, and Tyler Z, an alumnus of the University of California, Berkeley. The former was named one of the 35 Innovators Under 35 by MIT Technology Review and was awarded the 2023 Samsung Researcher of the Year, while the latter previously served as the Director of Investments at Binance Labs. Other team members bring experience from institutions and companies such as Stanford University, the University of California, Berkeley, AI2, Toloka, Stability AI, Microsoft, Binance, Google, Chainlink, LinkedIn, and Avalanche.

Sahara is also advised by leading AI-native researchers and enterprise clients:

  • Laksh Vaaman Sehgal (Vice Chairman of Motherson Group)

  • Rohan Taori (Research Scientist at Anthropic)

  • Teknium (Co-founder of Nous Research)

  • Vipul Prakash (CEO of Together AI)

  • Elvis Zhang (Founding Member of Midjourney)

Sahara AI is currently used by over 35 leading tech innovation projects and research institutions, including Microsoft, Amazon, MIT, Motherson Group, and Snap, for various AI services such as Sahara Data for data collection and labeling and Sahara Agents for personalized domain intelligence.

Generative AI is still in its infancy in terms of technology and market scale; due to the difficulty of integrating the entire AI stack into a single product, today's centralized chat and video tools have limited reach. Sahara AI is the only company addressing this bottleneck through modular design, using blockchain as the pillar for permissionless access, token distribution, and security. To allow everyone to participate, the future of AI must be accessible and equitable, and Sahara AI is the only company moving toward this vision.