Learn the basics of Substreams, a powerful blockchain data indexing solution.

TL;DR: For those new to Substreams, learn what it is and how it makes blockchain data more accessible to developers.

Are you looking for the best blockchain data indexing solution? Want to know how to efficiently extract and manage blockchain data?

Get started with Substreams!

This is the first in a series of articles that will take you from novice to mastery in Substreams.

The problem of accessing blockchain data

Developers often find it challenging to build data-centric applications, especially on top of blockchain data. Because a blockchain is a linear, distributed ledger, extracting its data is complex, and extracting it quickly and reliably is harder still.

Substreams is the solution

There aren’t many solutions to this problem right now, but StreamingFast, a team that specializes in blockchain data processing tools, is working on one. Its technology, Substreams, makes it easier to process and index blockchain data quickly and reliably.

Let’s take a look at what Substreams is and how it makes blockchain data more accessible.

What is Substreams?

Substreams is a powerful blockchain data indexing technology developed by StreamingFast for The Graph Network. It enables developers to extract data from the blockchain, apply custom transformations to meet the needs of their applications, and direct the processed data to a variety of destinations such as PostgreSQL, ClickHouse, MongoDB, and more.

How does Substreams work?

Substreams involves two main components: the Substreams provider and the Substreams package. Let's take a closer look at each:

  • Substreams providers: Substreams providers store and deliver blockchain data. These providers, such as Pinax, use Firehose, a blockchain-agnostic, high-performance data ingestion engine developed by StreamingFast, to efficiently ingest blockchain data.

  • Substreams package: The Substreams package is a small Rust program compiled into WebAssembly that defines the transformations the developer wants to apply to the data. The developer sends the Substreams package to a Substreams provider in a gRPC request; the provider executes it and streams back the transformed data. The developer can then send that data on to other destinations as needed.

Currently, Substreams can only be built using Rust, but the StreamingFast team plans to enable developers to build Substreams in Golang and TypeScript in the near future.
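To make the idea of a package's transformation more concrete, here is a minimal sketch in plain Rust. It does not use the real Substreams crate or Protobuf definitions: the Block, Transaction, and Transfer types, the map_large_transfers function, and the sample data are all hypothetical stand-ins. It only illustrates the general shape of the logic such a package might contain, which is a function that receives one block and returns the records the application cares about.

```rust
// Hypothetical stand-ins for the data types a real package would define.
#[derive(Debug, Clone, PartialEq)]
struct Transaction {
    from: String,
    to: String,
    value: u64,
}

#[derive(Debug, Clone, PartialEq)]
struct Block {
    number: u64,
    transactions: Vec<Transaction>,
}

#[derive(Debug, Clone, PartialEq)]
struct Transfer {
    block_number: u64,
    from: String,
    to: String,
    value: u64,
}

/// Illustrative transformation: one block in, the transfers of interest out.
/// Keeps only transactions whose value is at least `min_value`.
fn map_large_transfers(block: &Block, min_value: u64) -> Vec<Transfer> {
    block
        .transactions
        .iter()
        .filter(|tx| tx.value >= min_value)
        .map(|tx| Transfer {
            block_number: block.number,
            from: tx.from.clone(),
            to: tx.to.clone(),
            value: tx.value,
        })
        .collect()
}

/// Made-up sample data; a provider would supply real blocks instead.
fn sample_block() -> Block {
    Block {
        number: 42,
        transactions: vec![
            Transaction { from: "alice".into(), to: "bob".into(), value: 250 },
            Transaction { from: "bob".into(), to: "carol".into(), value: 5 },
        ],
    }
}

fn main() {
    let block = sample_block();
    for transfer in map_large_transfers(&block, 100) {
        println!("{:?}", transfer);
    }
}
```

Keeping the transformation a pure function of its input block is a design choice of this sketch: it makes the logic easy to test on sample data before it is packaged and sent to a provider.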

Three ways to use Substreams

Developers have several options when working with Substreams: they can use pre-built packages, build their own, or extend existing ones:

  1. Using Substreams: The easiest way to leverage Substreams is to use pre-built Substreams packages available on the Substreams Registry, a one-stop destination for discovering and sharing Substreams packages. You can choose the package that meets your needs and stream data seamlessly to your preferred destination.

  2. Building Substreams: If you can't find a suitable Substreams package in the Substreams Registry, you can create your own. Once it is ready, you can publish it to the registry to make it available to others.

  3. Extending Substreams: You can also leverage existing Substreams modules in the registry and build new Substreams modules on top of them, generating entirely new datasets. This approach allows customizing and extending Substreams functionality to meet specific requirements.

This collaborative approach fosters a vibrant ecosystem where developers can contribute their solutions and benefit from the collective knowledge and innovation within the community.
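The "extending" option can be pictured as one module consuming the output of another. The sketch below is plain Rust with made-up names and data (upstream_transfers, accumulate_received, and run_pipeline are hypothetical and not part of any Substreams API): an existing upstream module yields transfers per block, and a new downstream module builds a different dataset, the total received per address, on top of it.

```rust
use std::collections::BTreeMap;

// Hypothetical output type of an existing, upstream module.
#[derive(Debug, Clone, PartialEq)]
struct Transfer {
    to: String,
    value: u64,
}

/// Stand-in for a pre-built module: returns made-up transfers for a block.
fn upstream_transfers(block_number: u64) -> Vec<Transfer> {
    let mut transfers = vec![Transfer { to: "bob".into(), value: block_number }];
    if block_number % 2 == 0 {
        transfers.push(Transfer { to: "carol".into(), value: 10 });
    }
    transfers
}

/// New downstream module: consumes the upstream output and keeps a running
/// total of the value received by each address.
fn accumulate_received(totals: &mut BTreeMap<String, u64>, transfers: &[Transfer]) {
    for transfer in transfers {
        *totals.entry(transfer.to.clone()).or_insert(0) += transfer.value;
    }
}

/// Chains the two modules over an inclusive range of block numbers.
fn run_pipeline(first_block: u64, last_block: u64) -> BTreeMap<String, u64> {
    let mut totals = BTreeMap::new();
    for block_number in first_block..=last_block {
        let transfers = upstream_transfers(block_number);
        accumulate_received(&mut totals, &transfers);
    }
    totals
}

fn main() {
    for (address, total) in run_pipeline(1, 4) {
        println!("{address}: {total}");
    }
}
```

The downstream function never looks at raw blocks; it depends only on the upstream output type, which is what lets independently written modules be stacked in this sketch.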

Benefits of using Substreams

Substreams gives developers several advantages when indexing and querying blockchain data. Here are some of them:

  • Speed: Substreams prioritizes speed through a parallel architecture and stream-first design, ensuring efficient blockchain data indexing.

  • Composability: Substreams modules are composable, so developers can easily build on each other's code or modules to create complex indexing pipelines.

  • Reusability: Substreams emphasizes reuse, so you can rely on pre-built packages from the Substreams Registry to perform your indexing tasks.

  • Custom sinks: Substreams supports custom sinks, allowing for seamless integration with your preferred data storage or analytics solution.

  • Offloading blockchain data indexing to providers: Substreams allows you to offload the heavy lifting of blockchain indexing to service providers like Pinax. Providers can scale based on requests and sink data into various databases, alleviating the need to run expensive indexing nodes yourself.

  • Strong community support: Despite being a new technology, Substreams has already attracted a lot of attention from developers, and that interest is steadily growing. Alongside the StreamingFast Discord community, we at Pinax run our own Discord community, where you can get support and help with Substreams.
