Data Sources and Processing

Stereotic currently aggregates Binance trading data.

The aggregated data is provided “as is” without any warranty or guarantee of any kind, either express or implied, including, but not limited to, the implied warranties of merchantability, fitness for a particular purpose, or non-infringement. The creators of this dataset make no representations or warranties regarding the accuracy, completeness, or reliability of the data. Users assume all risks associated with the use of this dataset and are solely responsible for all outcomes, including any damages, errors, or consequences arising from its use.

Circo

Stereotic aggregates crypto data using a scalable framework called Circo. Circo enables fast and efficient processing of large amounts of data and can scale to arbitrary distributed environments; it is aimed at AI workloads.

HC data format

Stereotic uses a specialized, simple binary data format called “hc” for candlestick aggregations. See the example Python notebook on Kaggle for reference.

Data API

The Stereotic Explore app uses a simple API for market data on the top 100 coins. Two of the endpoints used by the app can be fetched directly.

Top 100 asset statistics

The current top 100 asset list with up-to-date statistics can be retrieved using: https://stereotic.com/data/stats/top100_stat.json

Returns: a JSON document containing basic statistics and rank for each asset, along with its symbol ID and name.
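
A minimal sketch of fetching this document with Python's standard library. The names `fetch_top100` and `parse_top100` and the timeout value are illustrative assumptions, not part of the Stereotic API. The layout of the JSON is not documented here, so the code only parses the response and leaves field access to the caller.

```python
import json
from urllib.request import urlopen

# URL taken from the documentation above.
TOP100_URL = "https://stereotic.com/data/stats/top100_stat.json"


def parse_top100(raw: bytes):
    """Parse the raw response body as UTF-8 encoded JSON.

    The schema is not specified in this document, so the parsed
    object is returned unchanged for the caller to inspect.
    """
    return json.loads(raw.decode("utf-8"))


def fetch_top100(url: str = TOP100_URL, timeout: float = 10.0):
    """Download and parse the top 100 statistics document."""
    with urlopen(url, timeout=timeout) as response:
        return parse_top100(response.read())
```

Calling `fetch_top100()` performs the network request; inspecting the returned object (for example with `type(stats)`) is a reasonable first step before relying on any particular key.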

Aggregated candlestick data

This endpoint returns aggregated candlestick data for a single asset, selected by aggregation window, data source, candle width, and start time.

https://stereotic.com/data/aggregation/[length (window size) sec]/[reference id]/[candle width (sec)]/[start utc timestamp]/[ID]

Parameters:

  • length: The duration of the aggregation window in seconds. Possible values are:
    • 3600 (1 hour)
    • 60 (1 minute)
    • 1200 (20 minutes)
    • 400 (6 minutes and 40 seconds)
  • reference id: A unique identifier for the reference data source. Possible value:
    • 6 (the only data source currently available)
  • candle width: The width of each candle in seconds. Possible values are:
    • 60 (1 minute)
    • 300 (5 minutes)
    • 900 (15 minutes)
    • 3600 (1 hour)
    • 14400 (4 hours)
    • 86400 (1 day)
  • start utc timestamp: The start time for the data as a Unix timestamp in seconds (UTC). Examples for different length values:
    • for length = 3600, candles start from 1678320000 (March 09, 2023 00:00:00 GMT)
    • for length = 60, candles start from 1710936000 (March 20, 2024 12:00:00 GMT)
  • ID: The asset ID representing the specific asset being queried, as returned by the asset list endpoint.
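
The start timestamp is a plain Unix timestamp, so it can be computed from a UTC calendar time. The following sketch reproduces the two timestamps listed above; the helper names `to_utc_timestamp` and `from_utc_timestamp` are illustrative and not part of the API.

```python
from datetime import datetime, timezone


def to_utc_timestamp(year, month, day, hour=0, minute=0, second=0) -> int:
    """Return the Unix timestamp (in seconds) of a UTC calendar time."""
    moment = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return int(moment.timestamp())


def from_utc_timestamp(timestamp: int) -> str:
    """Format a Unix timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


print(to_utc_timestamp(2023, 3, 9))       # 1678320000
print(to_utc_timestamp(2024, 3, 20, 12))  # 1710936000
print(from_utc_timestamp(1678320000))     # 2023-03-09T00:00:00+00:00
```

Passing `tzinfo=timezone.utc` explicitly matters: a naive `datetime` would be interpreted in the local time zone and could yield a shifted timestamp.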

Returns:

A .hc file with detailed price and volume statistics of the asset. To decode .hc files, please refer to the example notebook on Kaggle.

Currently, only the length values 3600 and 60 and data source 6 are publicly available.

Example Usage: To fetch aggregated data for asset ID 1 with a 1-hour window, reference ID 6, and 5-minute candle width, starting from timestamp 1690200000 (July 24, 2023 12:00:00 GMT): https://stereotic.com/data/aggregation/3600/6/300/1690200000/1.hc
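
The URL pattern can be assembled programmatically. The sketch below is a hedged illustration: the function names `build_aggregation_url` and `download_hc` are hypothetical, the `.hc` suffix is inferred from the example URL above, and the validation sets only mirror the values this page lists as publicly available. Decoding the downloaded file is not attempted, since the format is described only in the Kaggle notebook.

```python
from urllib.request import urlopen

BASE_URL = "https://stereotic.com/data/aggregation"

# Values this page lists as currently public (assumed to be exhaustive).
PUBLIC_LENGTHS = {3600, 60}
PUBLIC_SOURCES = {6}
CANDLE_WIDTHS = {60, 300, 900, 3600, 14400, 86400}


def build_aggregation_url(length, reference_id, candle_width, start_ts, asset_id):
    """Assemble the aggregation URL, rejecting values the page does not list."""
    if length not in PUBLIC_LENGTHS:
        raise ValueError(f"length {length} is not publicly available")
    if reference_id not in PUBLIC_SOURCES:
        raise ValueError(f"reference id {reference_id} is not available")
    if candle_width not in CANDLE_WIDTHS:
        raise ValueError(f"candle width {candle_width} is not supported")
    return (
        f"{BASE_URL}/{length}/{reference_id}/{candle_width}"
        f"/{start_ts}/{asset_id}.hc"
    )


def download_hc(url, path, timeout=30.0):
    """Save the raw .hc bytes to disk without decoding them."""
    with urlopen(url, timeout=timeout) as response, open(path, "wb") as out:
        out.write(response.read())
    return path


url = build_aggregation_url(3600, 6, 300, 1690200000, 1)
print(url)  # https://stereotic.com/data/aggregation/3600/6/300/1690200000/1.hc
```

Calling `download_hc(url, "asset_1.hc")` would then store the file locally; the output path is an arbitrary choice.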