All Blockchain Explorers Work With The Same Data. Or Do They?

“Oh, I simply download the data from the explorer.” That’s a sentence I often hear when I ask others in the Web3 space where they get their data from. But a block explorer only shows one view of the actual blockchain, and there is no guarantee that this view is complete or correct. Here is an example:

This morning, I was playing around with our KYVE Data Warehouse and was curious about how many KYVE addresses there are. Since an address only appears on the chain once it has been part of a transaction, I first looked at all the transfers. To make a transaction, an address needs some $KYVE tokens for gas, so one way or another it must have received tokens first.
At KYVE, we have a transformed table in our Data Warehouse that lists all transfer events with a sender and a recipient. I wrote a simple query that returns all unique addresses that were ever involved in a transfer, as sender or recipient: 12730 addresses. So far, so good. That made sense to me. Curious, I checked my personal favorite explorer for KYVE, Viewblock, but the number in its addresses section confused me: 419 pages with 25 addresses each and one with 22. That’s 419 × 25 + 22 = 10497. Oh no! That’s far off what I was expecting. Was my query wrong?
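For readers who want to try this themselves, here is a minimal sketch of such a query. It is not the query I ran against the warehouse: the table name transfers, the columns sender and recipient, and the sample addresses are all made up for illustration, and it uses an in-memory SQLite database instead of the real Data Warehouse. The idea is simply to take the union of both columns and count what is left.

```python
import sqlite3


def count_unique_addresses(conn: sqlite3.Connection) -> int:
    """Count distinct addresses that appear as sender or recipient."""
    # UNION (without ALL) removes duplicates across both columns,
    # so an address that both sent and received is counted once.
    row = conn.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT sender AS address FROM transfers
            UNION
            SELECT recipient AS address FROM transfers
        )
        """
    ).fetchone()
    return row[0]


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory table with hypothetical transfer events."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transfers (sender TEXT, recipient TEXT)")
    conn.executemany(
        "INSERT INTO transfers VALUES (?, ?)",
        [
            ("kyve1aaa", "kyve1bbb"),  # made-up addresses
            ("kyve1bbb", "kyve1ccc"),
            ("kyve1aaa", "kyve1ccc"),
        ],
    )
    return conn


# The Viewblock page arithmetic from the text: 419 full pages plus one of 22.
viewblock_total = 419 * 25 + 22

if __name__ == "__main__":
    print(count_unique_addresses(demo_connection()))
    print(viewblock_total)
```

On a real warehouse the same UNION pattern should carry over, but the table and column names will differ.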

Next, I headed to Mintscan, which showed 11917 addresses. Nice! The same question asked three times, and three different numbers. So who is right?

When checking the on-chain API, we get a fourth number: 12739. The on-chain API hits a node within the network directly, so we know that this number is the most trustworthy. It counts all addresses registered in the chain’s Auth module. That’s very close to the number we get from our data warehouse (which also lags a couple of hours behind). The slight difference of 9 addresses is caused by Cosmos SDK modules: some of them might not have been involved in any transaction yet, but they are already installed and have a queryable address on the chain.
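As a rough sketch of what such a call can look like: Cosmos SDK nodes that expose the REST gateway usually serve the Auth module's account list under /cosmos/auth/v1beta1/accounts, and the paginated response can include a total count. I am assuming that standard route here, not quoting the exact call I made. The base URL is a placeholder you would replace with a node you trust, and the helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Standard Cosmos SDK REST route for the Auth module's account list.
ACCOUNTS_PATH = "/cosmos/auth/v1beta1/accounts"


def accounts_url(base_url: str) -> str:
    """Build a URL that asks for one account plus the total count."""
    query = urlencode({"pagination.limit": 1, "pagination.count_total": "true"})
    return f"{base_url.rstrip('/')}{ACCOUNTS_PATH}?{query}"


def parse_total(body: dict) -> int:
    """Read the total from the response; the API returns it as a string."""
    return int(body["pagination"]["total"])


def fetch_account_count(base_url: str) -> int:
    """Query a node directly. base_url is an assumption: use your own node."""
    with urlopen(accounts_url(base_url), timeout=10) as resp:
        return parse_total(json.load(resp))


if __name__ == "__main__":
    # Offline example with a hand-written response in the expected shape.
    sample = {"accounts": [], "pagination": {"next_key": None, "total": "12739"}}
    print(parse_total(sample))
```

Asking for a single account with count_total enabled keeps the response small, since only the total is of interest here.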

So the difference between our number and the on-chain number is not caused by lost data or by any modification on KYVE’s side, and we will most likely see similarly small differences on other chains. However, the gaps between the explorers, and the fact that no one seems to compare data between explorers and other data applications, show that lost data is an actual problem and that a solution is needed that ensures data integrity before the data is stored.
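Comparing sources does not take much. The sketch below puts the four numbers from this article side by side and measures each against the on-chain count. The 1% tolerance is an arbitrary value I picked for illustration, not a KYVE rule, and the function names are made up.

```python
REFERENCE = "on-chain API"

# Address counts quoted in this article.
COUNTS = {
    "on-chain API": 12739,
    "KYVE Data Warehouse": 12730,
    "Mintscan": 11917,
    "Viewblock": 10497,
}


def gaps(counts: dict, reference: str) -> dict:
    """Return how many addresses each source is missing versus the reference."""
    ref = counts[reference]
    return {name: ref - value for name, value in counts.items() if name != reference}


def flag_sources(counts: dict, reference: str, tolerance: float = 0.01) -> list:
    """List sources whose gap exceeds the (hypothetical) relative tolerance."""
    ref = counts[reference]
    return [
        name
        for name, missing in gaps(counts, reference).items()
        if missing / ref > tolerance
    ]


if __name__ == "__main__":
    for name, missing in gaps(COUNTS, REFERENCE).items():
        print(f"{name}: {missing} missing ({missing / COUNTS[REFERENCE]:.2%})")
    print("Flagged:", flag_sources(COUNTS, REFERENCE))
```

With these numbers the warehouse is 9 addresses short, Mintscan 822, and Viewblock 2242, so only the two explorers end up above the tolerance.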

Conclusion

Block explorers are genuinely fantastic tools for the everyday user. Sites like Viewblock and Mintscan offer a look into the blockchain without the need for deep technical know-how.

Building an explorer is no easy task, and with billions of transactions in transit, it’s quite likely that some data slips through the cracks.

However, when you need data you can stake important decisions and critical systems on, using validated data sources like KYVE Data Pools becomes absolutely necessary.

Think of block explorers as your trusty compass: great for pointing you in the right direction. But when you’re somewhere you can’t afford to get lost (like filing your taxes), you’ll want the precision of a GPS, and that’s your KYVE Data Pool. And let’s not forget to tip our hats to the explorers; they make the journey a whole lot easier for everyone.

Note: All data is from 14.02.2024.