Process Crypto Market Data Using Time Series

What if you could store time series data side by side with your “normal” data, without the overhead of a separate database?

RavenDB is a NoSQL document database that offers native storage of time series. It’s as if MongoDB had InfluxDB built in! Batteries included.

This 5-minute video explains how RavenDB’s time series support compares to existing products:

The video features ingesting market data and displaying a price chart similar to the popular Robinhood trading app. This article walks through how the sample application is built and should give you an idea of what it’s like to work with time series data in RavenDB.

What is time series data?

“Time series” are typically characterized as time-indexed data points, and are usually high-frequency in nature. Common sources for time series are IoT devices, infrastructure, analytics, and financial data.

What makes it different?

At first, you might be thinking, “Can’t you just store these data points within a document database or as rows in a relational database?”

You can, but it will not scale. Storing data this way will quickly eat up your storage space and force your client code to do more work to aggregate the data.

Before RavenDB supported time series, you would have needed a dedicated product like InfluxDB, which makes the architecture (and the application landscape) more complex.

Instead, RavenDB’s native time series support addresses each of these issues with:

  1. Atomic transactions coordinated across a cluster
  2. Rule-based retention policies and storage optimizations
  3. Server-side querying and aggregation with indexes

Time series are document “extensions”, so they take advantage of all the infrastructure built around document storage and querying. That makes them easier to work with, and they come with built-in optimizations.

Working with market data

I created a demo app that shows time series support in action by tracking the bitcoin trading price, inspired by the popular trading app Robinhood.

All code samples are available on GitHub in both C# and Node.js. A sample database is provided that you can import into your RavenDB instance (or try it for free with RavenDB Cloud).

Market data is sourced from KuCoin, a crypto exchange. There are three parts to the sample architecture:

  • Ingest: a backend function that ingests data from KuCoin into RavenDB
  • Backend: an HTTP endpoint that queries the data from RavenDB
  • Frontend: a web interface that displays the interactive chart

These instructions assume that you have some beginner-level knowledge of RavenDB, and they show how to work with time series data using the Studio and the language SDKs. If you haven’t worked with RavenDB before, this self-guided bootcamp covers all the prerequisites!

Time Series Management in RavenDB Studio

We’ll start by looking at how time series work in the Studio interface.

Add time series to documents

In the sample database there is a document representing the market symbol for Bitcoin (BTC-USDT):

[Screenshot: the BTC-USDT document]

As you can see, there is no time series data in the document itself.

Configure time series groups

Instead, time series are added through the document sidebar in the Studio:

[Screenshot: the market symbol document in the Studio interface]

There is no time series without values. Once a value is appended, the time series is ‘created’. Once the last value is deleted, the time series disappears.

Time series entries are made up of a string name along with a timestamp and the entry’s values:

[Screenshot: adding a time series entry for the market symbol]

The first four values are named (Open, Close, High, Low), but the last value is identified numerically as “Value #4”. By default, values are unnamed and accessed by index, but you can optionally configure named values.

Add Named Value Configuration

Naming the values is a best practice because it makes the code more explicit and less prone to errors.

In the Studio, under Settings and the time series configuration, RavenDB provides an interface to manage the named values of a time series per document collection:

[Screenshot: named values configured for the market symbol time series]

Each name is associated with the value index of the entry and the name of the time series.

Using the time series APIs

Managing time series data in the Studio is a first-class experience, but you will mostly interact with time series data through code.

Append and update time series data

Time series data from the KuCoin API is ingested by a Node.js TypeScript backend function hosted in Azure Functions.

Once a RavenDB session is opened, the time series APIs operate on documents. First, the Bitcoin document is loaded by its ID:

let symbolDoc = await session.load(`MarketSymbols/${marketSymbol}`);

if (!symbolDoc) {
  symbolDoc = {
    symbol: marketSymbol,
    "@metadata": {
      "@collection": "MarketSymbols",
    },
  };
  await session.store(symbolDoc, `MarketSymbols/${marketSymbol}`);
}

Since a time series must be linked to a document, we need to create the document if it doesn’t exist before using the timeSeriesFor API.

const timeSeries = session.timeSeriesFor(symbolDoc, "history");

The timeSeriesFor API takes the entity (document) and the name of the time series. This does not load any data (yet).

The KuCoin API returns data between two dates as “candles”, visualized as:

[Figure: candlestick chart showing the open, high, low, and close prices. Credit: Investopedia]

KuCoin represents each candle as an array, where each index corresponds to a value of the candle.

The code iterates through the candles and uses array destructuring in TypeScript to take the values and append them to the RavenDB time series:

for (const bucket of buckets) {
  const [
    startTime, 
    openPrice, 
    closePrice, 
    highPrice, 
    lowPrice
  ] = bucket;

  const timestamp = dayjs.unix(startTime);

  timeSeries.append(
    timestamp.toDate(),
    [openPrice, closePrice, highPrice, lowPrice]
  );
}

The append method takes the timestamp (converted using the dayjs helper library) and the array of values (stored by index).
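The destructuring step can be made a little more defensive. The following is a minimal sketch, assuming KuCoin delivers each candle as an array of strings in the order start time (Unix seconds), open, close, high, low, followed by extra fields that are ignored here. The names `parseCandle` and `CandleEntry` are hypothetical and do not come from the sample repository.

```typescript
// Assumed candle layout (all strings):
// [startTime (Unix seconds), open, close, high, low, ...extra fields]
type KuCoinCandle = string[];

interface CandleEntry {
  timestamp: Date;
  values: number[]; // [open, close, high, low], in value-index order
}

// Convert one raw candle into a timestamp plus a numeric values array,
// rejecting candles with missing or non-numeric fields.
function parseCandle(candle: KuCoinCandle): CandleEntry {
  const [startTime, open, close, high, low] = candle;
  const seconds = Number(startTime);
  const values = [open, close, high, low].map((v) => Number(v));

  if (!isFinite(seconds) || values.some((v) => !isFinite(v))) {
    throw new Error("Malformed candle: " + JSON.stringify(candle));
  }

  return { timestamp: new Date(seconds * 1000), values };
}
```

The result of `parseCandle(bucket)` could then be passed to `timeSeries.append(entry.timestamp, entry.values)`, so that malformed candles fail loudly instead of being stored as `NaN`.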

await session.store(symbolDoc, `MarketSymbols/${marketSymbol}`);
await session.saveChanges();

Like other document operations, time series changes are not persisted until saveChanges() is called. Time series updates from multiple clients across a cluster do not cause conflicts because RavenDB resolves them automatically.

Once the backend function has ingested the data into RavenDB, we can access it and return the data to the web interface to render the chart.

Eagerly fetching time series

The frontend calls a REST endpoint with query parameters to retrieve a JSON representation and builds the chart using the ApexCharts library. The backend is written in C# and uses the .NET RavenDB SDK.

When the client requests a market symbol, the code loads the document by its ID:

var symbol = await _session.LoadAsync<MarketSymbol>(
  $"MarketSymbols/{marketSymbol}", includes => 
    includes.IncludeTimeSeries("history", 
      from: DateTime.UtcNow.AddDays(-1), to: DateTime.UtcNow));

The IncludeTimeSeries API eagerly fetches time series data when the document is loaded from RavenDB. This reduces network calls to the database and caches the time series in the session. The from and to arguments prevent the entire data set from being loaded into memory.

Once the document is loaded, we can use the session TimeSeriesFor API to access the time series:

var historyTimeSeries = _session.TimeSeriesFor<SymbolPrice>(symbol, "history");

Note that we pass a generic type argument of SymbolPrice. This is the struct that strongly types the values.

Adding named value support with types

I showed you how to manually configure named values in the Studio, but we can do the same in code. This allows you to keep your application code as the source of truth.

Create a struct that names the time series values:

public struct SymbolPrice
{
    [TimeSeriesValue(0)] public double Open;
    [TimeSeriesValue(1)] public double Close;
    [TimeSeriesValue(2)] public double High;
    [TimeSeriesValue(3)] public double Low;
}

The TimeSeriesValue attribute takes the index of the entry value to associate with the field.

Once the DocumentStore is initialized, register the struct with the collection type and the time series name:

store.Initialize();
store.TimeSeries.Register<MarketSymbol, SymbolPrice>("history");

Note: This feature is only available for statically typed language SDKs such as .NET and Java.
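In TypeScript code such as the ingest function, a small mapping helper can give similar readability when values come back as a plain array. This is a sketch only: `SymbolPrice` mirrors the C# struct above, and `toSymbolPrice` is a hypothetical helper, not part of the RavenDB SDK. It assumes the value order Open, Close, High, Low used when the entries were appended.

```typescript
// Mirrors the C# SymbolPrice struct; the index order is an assumption
// that must match the order used when appending values.
interface SymbolPrice {
  open: number;  // value index 0
  close: number; // value index 1
  high: number;  // value index 2
  low: number;   // value index 3
}

// Map a raw values array (accessed by index) to named fields.
// Extra trailing values, if any, are ignored.
function toSymbolPrice(values: readonly number[]): SymbolPrice {
  if (values.length < 4) {
    throw new Error("Expected at least 4 values, got " + values.length);
  }
  const [open, close, high, low] = values;
  return { open, close, high, low };
}
```

Keeping the index-to-name mapping in one place means a change in value order only has to be fixed once.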

Loading raw time series entries

Once the document is loaded, we can get the latest time series entries by date range:

var historyTimeSeries = _session.TimeSeriesFor<SymbolPrice>(symbol, "history");

var latestEntries = await historyTimeSeries.GetAsync(
  from: DateTime.UtcNow.AddDays(-1), to: DateTime.UtcNow);

Without the IncludeTimeSeries hint, GetAsync would result in an additional network call to the database. Instead, the data is loaded from the session cache.

Document loads are never stale, so we can use the most recent entry to return the last traded price:

// Null-conditional guards against a missing series (no entries to read)
var latestEntry = latestEntries?.LastOrDefault();

viewModel.LastUpdated = latestEntry?.Timestamp;
viewModel.LastPrice = latestEntry?.Value.Close ?? 0;

Querying and aggregating time series data

The next step in creating the chart is to aggregate data based on time windows, such as “past day” or “last week.”

In a traditional database or application, we would have to load the entire data set and bucket the data manually, which would force us to load all the time series into memory.

In RavenDB, time series data is indexed and can be aggregated and filtered on the database server before being returned to the client. This takes advantage of everything else that indexes have to offer.

Use the Session Query API to query the collection:

var aggregatedHistoryQueryResult = await _session.Query<MarketSymbol>()
  .Where(c => c.Id == symbolId)

Then you can use the helper functions from RavenQuery.TimeSeries to create a query expression for a time series:

var aggregatedHistoryQueryResult = await _session.Query<MarketSymbol>()
  .Where(c => c.Id == symbolId)
  .Select(c => RavenQuery.TimeSeries<SymbolPrice>(c, "history")
    .Where(s => s.Timestamp > fromDate)
    .GroupBy(groupingAction)
    .Select(g => new
    {
      First = g.First(),
      Last = g.Last(),
      Min = g.Min(),
      Max = g.Max()
    })
    .ToList()
  ).ToListAsync();

There are two variables that we pass to build the time series query: fromDate and groupingAction.

The fromDate is calculated based on the time window requested by the frontend:

var marketTime = GetMarketTime();
var fromDate = aggregation switch
{
  AggregationView.OneDay => marketTime.LastTradingOpen,
  AggregationView.OneWeek => DateTime.UtcNow.AddDays(-7),
  AggregationView.OneMonth => DateTime.UtcNow.AddMonths(-1),
  AggregationView.ThreeMonths => DateTime.UtcNow.AddMonths(-3),
  AggregationView.OneYear => DateTime.UtcNow.AddYears(-1),
  AggregationView.FiveYears => DateTime.UtcNow.AddYears(-5)
};

The code defines the market open/close schedule (naively) and can then return the appropriate date to filter from in the query.
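For readers following along in Node.js, the same window-to-date mapping might look like the sketch below. It assumes the caller supplies the last trading open time (the C# code gets it from GetMarketTime()). The names `AggregationView` and `fromDateFor` are hypothetical. Note one behavioral difference: JavaScript's `setUTCMonth` rolls over on short months, while .NET's `AddMonths` clamps to the last valid day.

```typescript
type AggregationView =
  | "OneDay" | "OneWeek" | "OneMonth"
  | "ThreeMonths" | "OneYear" | "FiveYears";

// Compute the start of the requested time window, in UTC.
// `lastTradingOpen` is supplied by the caller (hypothetical parameter).
function fromDateFor(view: AggregationView, now: Date, lastTradingOpen: Date): Date {
  const d = new Date(now.getTime()); // copy so `now` is not mutated
  switch (view) {
    case "OneDay":
      return new Date(lastTradingOpen.getTime());
    case "OneWeek":
      d.setUTCDate(d.getUTCDate() - 7);
      return d;
    case "OneMonth":
      d.setUTCMonth(d.getUTCMonth() - 1);
      return d;
    case "ThreeMonths":
      d.setUTCMonth(d.getUTCMonth() - 3);
      return d;
    case "OneYear":
      d.setUTCFullYear(d.getUTCFullYear() - 1);
      return d;
    case "FiveYears":
      d.setUTCFullYear(d.getUTCFullYear() - 5);
      return d;
  }
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the mapping easy to test.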

The groupingAction is ultimately what buckets (groups) the time series data points, and it is based on the time window being displayed. The ITimePeriodBuilder API helps build the correct query aggregation expression based on simple units of time:

Action<ITimePeriodBuilder> groupingAction = aggregation switch
{
  AggregationView.OneDay => builder => builder.Minutes(5),
  AggregationView.OneWeek => builder => builder.Minutes(10),
  AggregationView.OneMonth => builder => builder.Hours(1),
  AggregationView.ThreeMonths => builder => builder.Hours(24),
  AggregationView.OneYear => builder => builder.Hours(24),
  AggregationView.FiveYears => builder => builder.Days(7),
};

This grouping action tells RavenDB how to aggregate the data before returning it, by translating it internally into the Raven Query Language (RQL). This offloads all processing to the database server.
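To make the aggregation concrete, here is a rough in-memory sketch of what a grouped first/last/min/max query computes. This illustrates the idea only and is not how RavenDB implements it. The types, the `aggregate` function, and the choice to align buckets to the Unix epoch are all assumptions made for the example.

```typescript
interface PriceEntry {
  timestamp: Date;
  open: number;
  close: number;
  high: number;
  low: number;
}

interface PriceBucket {
  from: Date;    // start of the bucket
  open: number;  // Open of the first entry in the bucket
  close: number; // Close of the last entry in the bucket
  high: number;  // highest High in the bucket
  low: number;   // lowest Low in the bucket
}

// Group entries newer than `fromDate` into fixed-size time buckets.
function aggregate(entries: PriceEntry[], bucketMinutes: number, fromDate: Date): PriceBucket[] {
  const bucketMs = bucketMinutes * 60 * 1000;
  const sorted = entries
    .filter((e) => e.timestamp.getTime() > fromDate.getTime())
    .sort((a, b) => a.timestamp.getTime() - b.timestamp.getTime());

  const result: PriceBucket[] = [];
  for (const e of sorted) {
    // Bucket start, aligned to the Unix epoch (an assumption of this sketch)
    const start = Math.floor(e.timestamp.getTime() / bucketMs) * bucketMs;
    const last = result[result.length - 1];
    if (last && last.from.getTime() === start) {
      last.close = e.close;
      last.high = Math.max(last.high, e.high);
      last.low = Math.min(last.low, e.low);
    } else {
      result.push({ from: new Date(start), open: e.open, close: e.close, high: e.high, low: e.low });
    }
  }
  return result;
}
```

Doing this on the client would require every raw entry to cross the network first, which is exactly the cost the server-side query avoids.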

Once we have the queried data, we build our model and assign the price values for each bucket:

var historyBuckets = new List<MarketSymbolTimeBucket>();
foreach (var seriesAggregation in aggregatedHistory.Results)
{
  historyBuckets.Add(new MarketSymbolTimeBucket()
  {
    Timestamp = seriesAggregation.From,
    OpeningPrice = seriesAggregation.First.Open,
    ClosingPrice = seriesAggregation.Last.Close,
    HighestPrice = seriesAggregation.Max.High,
    LowestPrice = seriesAggregation.Min.Low,
  });
}

Using the named value struct (SymbolPrice), we can access the aggregated values for each pricing bucket. For example, the opening price is the Open value of the first data point in the bucket, and the closing price is the Close value of the last data point.

This model representation is then returned to the client as JSON, and the chart can be rendered from these data points.

Conclusion

RavenDB’s time series support competes with dedicated products like InfluxDB, and there are more features we didn’t cover in this article, such as rollup policies, tagging, custom time series indexes, and “incremental” time series.

Want to dive deeper? All code samples are available on GitHub in both C# and Node.js. A sample database is provided that you can import into your RavenDB instance (or try it for free with RavenDB Cloud).
