SQLAlchemy In-Memory Performance: Faster Backtests with a Pure Python Database
Posted on Sat 10 May 2025 in Software Architecture
Introduction
SQLAlchemy is a powerful ORM for Python, but performance can become a bottleneck, especially for in-memory workloads like backtesting. In this article, we explore the performance differences across SQLAlchemy 1.3, 1.4, and 2.0, and introduce a custom in-memory database written in pure Python that dramatically improves SQLAlchemy in-memory performance for simulations and real-time trading applications.
Experiment Setup
To evaluate performance, we use a realistic scenario: a simple trend-following strategy on intraday futures data.
Backtest Details:
- Event-driven architecture (no vectorization)
- Time range: 2025-01-01 to 2025-03-01
- Instrument: ES (E-mini S&P 500) RTH, 1-minute bars (~15,000 bars)
- Entry logic:
- Go long after 30 consecutive closes above VWAP
- Go short after 30 consecutive closes below VWAP
- Exit logic: static stop-loss and take-profit
Hardware: Intel(R) Core(TM) i5-8350U CPU @ 1.70GHz with 16GB RAM
Note: I intentionally run benchmarks on weaker hardware to expose performance weaknesses and bottlenecks more clearly. Caching is also disabled to stress-test the backtester’s database access pattern (database is queried for open orders/positions on every bar).
Backtesting with SQLAlchemy 1.3 + In‑Memory SQLite
Version 1.3 represents the pre-2.0 SQLAlchemy world, relying heavily on explicit Core/ORM separation and without statement caching.
As expected, performance was the slowest in this configuration.
Backtest runtime: 79.2 seconds
Backtesting with SQLAlchemy 1.4 + In‑Memory SQLite
SQLAlchemy 1.4 was a transitional release, introducing many of the architectural concepts of 2.0. Statement caching and streamlined ORM execution show significant performance benefits here.
Backtest runtime: 19.8 seconds
Backtesting with SQLAlchemy 2.0 + In‑Memory SQLite
Version 2.0 completes the modernization of SQLAlchemy. Unified ORM/Core APIs, native asyncio support, and an overall cleaner execution model reduce boilerplate and runtime overhead even in sync codebases.
Backtest runtime: 17.9 seconds
A Pure In-Memory Python Database for SQLAlchemy Performance
Now let's push beyond SQLite.
sqlalchemy-memory is a pure-Python dialect for SQLAlchemy that implements the full ORM and Core interface in memory.
Why I Built It
In production systems, models like Order
, Position
, and Trade
are stored in a persistent database. But during backtesting, hitting a SQL engine (even SQLite in-memory) adds overhead you don't want. Rewriting your code to use custom classes or lists during backtests is brittle, and maintaining two versions of the logic is error-prone.
sqlalchemy-memory
bridges that gap. It allows you to:
-
Use your real SQLAlchemy models and queries
-
Avoid any SQL driver overhead
-
Run backtests significantly faster
Backtest runtime: 6.4 seconds
SQLAlchemy In-Memory Benchmark Results and Takeaways
Here are the results side-by-side:
Backtest Type | Duration |
---|---|
SQLAlchemy 1.3 + in-memory sqlite | 79.2s |
SQLAlchemy 1.4 + in-memory sqlite | 19.8s |
SQLAlchemy 2.0 + in-memory sqlite | 17.9s |
SQLAlchemy 2.0 + memory dialect | 6.4s |
While SQLAlchemy 2.0 has brought impressive speedups, removing SQL parsing and I/O entirely through a custom memory dialect provides a dramatic boost for performance-critical simulations. You can view the official benchmark here.
Happy coding !