
On-Chain Data Series II: Streamlining Blockchain Data Output and Distribution

Welcome to the second part of our On-Chain Data Series. In this segment, we explore our approach to offering users fast, reliable, and straightforward access to both the latest and historical blockchain data. This includes our integration with Nethermind and Geth nodes, as well as leading blockchain data providers such as QuickNode.

  • February 27, 2024
  • Vlad Cealicu

Given the rapid pace of the digital asset sector, the ability to quickly access and analyse data can make the difference between seizing an opportunity and missing it. At CCData, we've developed a sophisticated pipeline not just for ingesting data but also for processing, storing, and distributing it. This is crucial for delivering actionable blockchain insights to our users.

Data Access

For immediate access, the latest blocks—often the most queried data—are stored in Redis. This in-memory database provides sub-millisecond response times, making it an ideal solution for real-time applications such as live trading platforms, where having access to the latest transaction data is critical.

By caching the most recent blocks in Redis, we ensure that requests for current blockchain states can be serviced instantly. This setup supports a range of use cases:

  • Trading algorithms that need the latest block information to make split-second decisions.
  • Risk analysis systems that monitor transactions in real-time to detect fraudulent activity.
  • Smart contract developers who require up-to-date information for dApp functionality.
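As a minimal illustration of this hot-cache pattern, the sketch below keeps a rolling window of the most recent blocks and serves "latest block" lookups from memory. A plain dict stands in for the Redis cluster so the example is runnable without a server, and all key names are assumptions, not our production schema:

```python
import json

class LatestBlockCache:
    """In-memory stand-in for a Redis hot cache of recent blocks."""

    def __init__(self, max_blocks=128):
        self.store = {}  # key -> JSON-serialised block
        self.max_blocks = max_blocks

    def put_block(self, chain, block):
        # Write the block under its height and update the "latest" pointer,
        # mirroring a Redis SET on both keys.
        self.store[f"block:{chain}:{block['height']}"] = json.dumps(block)
        self.store[f"block:{chain}:latest"] = json.dumps(block)
        # Evict the oldest block once the window is full, so only the most
        # recent blocks stay hot in memory.
        heights = sorted(
            int(k.rsplit(":", 1)[1])
            for k in self.store
            if k.startswith(f"block:{chain}:") and k.rsplit(":", 1)[1].isdigit()
        )
        if len(heights) > self.max_blocks:
            del self.store[f"block:{chain}:{heights[0]}"]

    def get_latest(self, chain):
        raw = self.store.get(f"block:{chain}:latest")
        return json.loads(raw) if raw else None

cache = LatestBlockCache()
cache.put_block("ETH", {"height": 19_000_000, "timestamp": 1_706_000_000})
print(cache.get_latest("ETH")["height"])  # -> 19000000
```

In production the same reads and writes go against a Redis cluster, which is what delivers the sub-millisecond latency described above.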

Full Blockchain History

While the latest data is crucial, historical analysis is where long-term strategies are built. By uploading our data to Cloudflare R2, we ensure that every block—from the genesis block to the latest—is stored indefinitely.

Our approach to uploading data to R2 involves:

  • Compression: All data files are gzipped to minimise storage space and facilitate quicker data transfer.
  • Batch Processing: We accumulate blocks of data and upload them in batches, thereby reducing the number of write operations and optimising resource usage.
  • Data Lifecycle: The blocks are methodically moved from high-availability Redis to durable storage in R2, ensuring data longevity and integrity.
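The compression and batching steps above can be sketched as follows. The object-naming convention and the newline-delimited JSON layout are illustrative assumptions; in production the resulting bytes are uploaded to Cloudflare R2 via its S3-compatible API, which this runnable example leaves out:

```python
import gzip
import json

def pack_batch(blocks):
    """Serialise a batch of blocks as newline-delimited JSON and gzip it."""
    body = "\n".join(json.dumps(b, sort_keys=True) for b in blocks)
    return gzip.compress(body.encode("utf-8"))

def object_key(chain, blocks):
    # One object per contiguous range of block heights (illustrative naming).
    return f"{chain}/blocks_{blocks[0]['height']}-{blocks[-1]['height']}.ndjson.gz"

def unpack_batch(blob):
    """Reverse of pack_batch: decompress and parse each line back to a dict."""
    return [json.loads(line) for line in gzip.decompress(blob).decode().splitlines()]

batch = [{"height": h, "tx_count": h % 7} for h in range(100, 110)]
blob = pack_batch(batch)
print(object_key("ETH", batch))      # -> ETH/blocks_100-109.ndjson.gz
print(unpack_batch(blob) == batch)   # -> True
```

Batching many blocks into one compressed object is what reduces the number of write operations against R2, as described above.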

This historical data can be used for:

  • Trend Analysis: Understanding long-term blockchain trends, such as average transaction fees, typical block sizes, and the evolution of smart contract deployment.
  • Backtesting: Traders and developers can backtest strategies or smart contract interactions against historical data to ensure robustness.
  • Academic Research: Scholars can study the historical growth and usage patterns of blockchain technology.
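As one small example of the trend analysis this history enables, average block interval can be computed directly from archived block timestamps (the block records here are made up for illustration):

```python
from statistics import mean

def average_block_interval(blocks):
    """Mean seconds between consecutive blocks, a basic congestion signal."""
    ts = [b["timestamp"] for b in sorted(blocks, key=lambda b: b["height"])]
    return mean(later - earlier for earlier, later in zip(ts, ts[1:]))

history = [
    {"height": 100, "timestamp": 1_000},
    {"height": 101, "timestamp": 1_012},
    {"height": 102, "timestamp": 1_025},
]
print(average_block_interval(history))  # -> 12.5
```

The same pattern extends to average transaction fees, block sizes, or any other per-block metric kept in the archive.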

Streaming Data for Maximum Utility

Real-time data streaming is also an integral part of our architecture. This system allows for the continuous flow of information and enables immediate reaction to new data. Connecting to our WebSocket streaming servers enables the following use cases:

  • Financial institutions can track blockchain transactions as they happen, integrating this data into their risk assessment models.
  • Market analysts can monitor the flow of digital assets in real time, gaining insights into market sentiment and momentum.
  • Regulatory bodies can use the streamed data to ensure compliance with anti-money laundering (AML) regulations.
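A consumer of such a stream typically parses each message and routes it to a handler. The message shape below (`type`, `payload` fields) is a hypothetical example, not our actual stream schema; the dispatch pattern is what the sketch demonstrates:

```python
import json

handlers = {}

def on(message_type):
    """Decorator registering a handler for one stream message type."""
    def register(fn):
        handlers[message_type] = fn
        return fn
    return register

@on("NEW_BLOCK")
def handle_block(payload):
    # In a real client this might update a risk model or a dashboard.
    return f"block {payload['height']} with {payload['tx_count']} txs"

def dispatch(raw):
    """Parse one raw stream message and invoke the matching handler."""
    msg = json.loads(raw)
    handler = handlers.get(msg["type"])
    return handler(msg["payload"]) if handler else None

raw = json.dumps({"type": "NEW_BLOCK",
                  "payload": {"height": 19_000_000, "tx_count": 142}})
print(dispatch(raw))  # -> block 19000000 with 142 txs
```

In practice the raw messages would arrive over a persistent WebSocket connection rather than a local variable.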

We understand that different blockchains can have vastly different data structures. To tackle this, we standardise the block data format across all supported blockchains, such as Ethereum and Binance Smart Chain (BSC). This standardisation includes:

  • Metadata: Block height, timestamp, miner information, and more.
  • Price Information: Real-time value of the native blockchain currency at the time of block mining.
  • Block Times: The intervals at which blocks are mined, which can be vital for analysing network health and congestion.
  • Transactions: Every transaction within a block is recorded, complete with associated events and traces.
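A normalised block record covering the fields listed above might look like the following. This is an illustrative schema, not our production format, which carries more fields and stricter typing:

```python
from dataclasses import dataclass, field

@dataclass
class NormalisedBlock:
    """One chain-agnostic block record (illustrative field set)."""
    chain: str                 # e.g. "ETH", "BSC"
    height: int
    timestamp: int             # Unix seconds
    miner: str                 # miner/validator identifier
    native_price_usd: float    # native asset price at block time
    block_time: float          # seconds since the previous block
    transactions: list = field(default_factory=list)  # txs with events/traces

blk = NormalisedBlock(chain="ETH", height=19_000_000,
                      timestamp=1_706_000_000, miner="0xabc123",
                      native_price_usd=2240.5, block_time=12.1)
print(blk.chain, blk.height)  # -> ETH 19000000
```

Because every chain maps onto the same record shape, downstream consumers can query Ethereum and BSC blocks with identical code.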

Unified Endpoint for Seamless and Reliable Access

At CCData, our mission is to simplify and enhance the way users interact with blockchain data. Our unified API endpoint embodies this mission by offering a single access point for both historical and latest blockchain data, ensuring an efficient and user-friendly experience. To further guarantee the reliability and accuracy of the data we provide, our system intelligently selects the most up-to-date and dependable dataset from our integrations with Nethermind, Geth, and QuickNode. This process includes several key features:

  • Intelligent Data Selection: Our backend algorithms evaluate the data from our Nethermind node, Geth node, and QuickNode in real-time to determine the most current and accurate information. This ensures that users always receive the best possible data for their queries.
  • Graceful Fallback Mechanisms: In the event that one data source encounters issues, our system automatically falls back to alternative sources without any interruption in service. This redundancy ensures continuous access to data, maintaining sub-100 ms response times for historical data (served from R2 blobs) and sub-20 ms for the latest block information (served from our Redis cluster).
  • Seamless User Experience: Despite the complexity of managing multiple data sources and ensuring the highest data integrity, this entire process is transparent to the user. The unified API endpoint delivers a smooth and efficient experience, allowing users to access any block data quickly, without needing to understand the underlying data source selection mechanism.
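The selection-and-fallback behaviour can be sketched like this: query each upstream, skip any that errors, and return the freshest successful answer. The sources here are stub callables; the real implementation is asynchronous, with timeouts and source health scoring, and is not shown:

```python
def select_latest(sources):
    """Return the freshest latest-block answer across redundant sources.

    `sources` is a list of (name, fetch) pairs where fetch() returns a
    block dict or raises on failure. Failed sources are skipped silently,
    which is the graceful-fallback behaviour described above.
    """
    best = None
    for _name, fetch in sources:
        try:
            block = fetch()
        except Exception:
            continue  # unhealthy source: fall back to the next one
        if block and (best is None or block["height"] > best["height"]):
            best = block
    return best

def nethermind_down():
    raise ConnectionError("node unreachable")

sources = [
    ("nethermind", nethermind_down),                      # simulated outage
    ("geth", lambda: {"height": 19_000_000}),
    ("quicknode", lambda: {"height": 19_000_001}),        # one block ahead
]
print(select_latest(sources)["height"])  # -> 19000001
```

Even with the first source down, the caller still receives the most advanced block, without ever seeing the failure.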

Looking Forward

CCData's approach to blockchain data distribution is designed to provide users with fast, reliable, and straightforward access to both historical and latest data through a single, unified endpoint. Our integration with our own Nethermind and Geth nodes as well as leading blockchain data providers like QuickNode, coupled with our advanced data selection algorithms and fallback mechanisms, ensures that our users always have access to the highest quality data. This level of service is crucial for supporting real-time decision-making and comprehensive historical analysis across a wide range of blockchain applications.

In our commitment to pushing the boundaries of blockchain data services, we continuously refine our processes and technologies. The next instalment in our blog series will take a closer look at how our API and streaming services are applied to process Uniswap V3 swaps and liquidity updates, highlighting our role in advancing the usability and analysis of decentralised finance (DeFi) data. Stay tuned to discover more about our innovative solutions and how CCData is leading the way in blockchain data management and distribution.



If you’re interested in learning more about CCData’s market-leading data solutions and indices, please contact us directly.

