bybit-orderbook-backtester

⚠Review·Scanned 2/17/2026

This skill downloads, processes, and backtests ByBit order book data using scripts/download_orderbook.py, scripts/process_orderbook.py, and scripts/backtest.py, and writes outputs to ./data/raw, ./data/processed, and ./reports. It contains explicit shell commands (e.g., pip install ..., python scripts/...) and accesses the network at https://www.bybit.com/derivatives/en/history-data.

from clawhub.ai·vb428b67·64.4 KB·0 installs

Scanned from 1.0.0 at b428b67

$ vett add clawhub.ai/davidm413/bybit-orderbook-backtester

ByBit Order Book Backtester

End-to-end pipeline: download → process → backtest → report.

Dependencies

pip install undetected-chromedriver selenium pandas numpy pyarrow --break-system-packages

Chrome/Chromium must be installed for Selenium.

Workflow

The pipeline has three stages; the report is written by the backtest stage. Run the stages in order, or skip ahead when the output of an earlier stage already exists.

Stage 1: Download Order Book Data

Prompt the user for:

Symbol (default: BTCUSDT)
Date range (default: last 30 days)

Run scripts/download_orderbook.py:

python scripts/download_orderbook.py \
  --symbol BTCUSDT \
  --start 2024-06-01 --end 2024-06-30 \
  --output ./data/raw

Key details:

Downloads from https://www.bybit.com/derivatives/en/history-data
Automatically chunks into 7-day windows (ByBit's limit)
Uses undetected-chromedriver for Cloudflare bypass
Outputs: ZIP files in ./data/raw/ named {date}_{symbol}_ob500.data.zip
For data format details: see references/bybit_data_format.md
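The 7-day chunking can be pictured with a short sketch. It is not taken from scripts/download_orderbook.py; the function name chunk_date_range and the treatment of the end date as inclusive are assumptions made for illustration.

```python
from datetime import date, timedelta


def chunk_date_range(start, end, max_days=7):
    """Split the inclusive range [start, end] into windows of at most max_days days.

    Hypothetical helper: the real script may split its windows differently.
    """
    windows = []
    cursor = start
    while cursor <= end:
        # Each window covers max_days calendar days, clipped to the requested end.
        window_end = min(cursor + timedelta(days=max_days - 1), end)
        windows.append((cursor, window_end))
        cursor = window_end + timedelta(days=1)
    return windows


windows = chunk_date_range(date(2024, 6, 1), date(2024, 6, 30))
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```

For the June 2024 range used in the command above, this yields four full weeks followed by a two-day remainder.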

If Selenium fails (Cloudflare blocks, UI changes): Instruct the user to manually download from the ByBit page and place ZIPs in ./data/raw/.

Stage 2: Process & Filter to Depth 50

Run scripts/process_orderbook.py:

python scripts/process_orderbook.py \
  --input ./data/raw \
  --output ./data/processed \
  --depth 50 \
  --sample-interval 1s

What it does:

Reads JSONL from ZIPs (each line = full 500-level L2 snapshot)
Filters to top 50 bid/ask levels
Computes derived features: mid_price, spread, volume_imbalance, microprice
Optionally downsamples (e.g., 1s, 5s, 1min) — recommended for faster backtests
Outputs: Parquet files in ./data/processed/
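As a rough illustration of the derived features, the sketch below computes them from one snapshot using common textbook definitions. The exact formulas in scripts/process_orderbook.py are not shown here, so the definitions of volume_imbalance and microprice, the function name derive_features, and the (price, size) tuple layout are all assumptions.

```python
def derive_features(bids, asks, depth=50):
    """Compute per-snapshot features from (price, size) levels, best level first.

    Assumed definitions (may differ from the actual script):
      volume_imbalance = (bid_vol - ask_vol) / (bid_vol + ask_vol) over the top levels
      microprice       = size-weighted average of the best bid and best ask
    """
    bids, asks = bids[:depth], asks[:depth]  # keep only the top `depth` levels
    best_bid_px, best_bid_sz = bids[0]
    best_ask_px, best_ask_sz = asks[0]
    bid_vol = sum(size for _, size in bids)
    ask_vol = sum(size for _, size in asks)
    return {
        "mid_price": (best_bid_px + best_ask_px) / 2,
        "spread": best_ask_px - best_bid_px,
        "volume_imbalance": (bid_vol - ask_vol) / (bid_vol + ask_vol),
        # The best bid is weighted by ask size and vice versa, so the price
        # leans toward the side with less resting liquidity.
        "microprice": (best_bid_px * best_ask_sz + best_ask_px * best_bid_sz)
        / (best_bid_sz + best_ask_sz),
    }


features = derive_features(
    bids=[(100.0, 3.0), (99.5, 1.0)],
    asks=[(101.0, 1.0), (101.5, 1.0)],
)
print(features)
```

With more size resting on the bid than on the ask, the microprice in this toy book sits above the mid price, which is the intended behaviour of that weighting.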

Without downsampling: ~860K snapshots/day, ~300 MB Parquet per day per symbol. With 1s downsampling: ~86K snapshots/day, ~5 MB per day, which is far easier to hold in memory and backtest.
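The snapshot counts are consistent with roughly ten snapshots per second (864,000 per day) being reduced to one per second (86,400 per day). A minimal sketch of that reduction follows, assuming the last snapshot in each interval is kept; whether scripts/process_orderbook.py keeps the first, the last, or an aggregate is not stated, and the function name downsample is hypothetical.

```python
def downsample(snapshots, interval_ms=1000):
    """Keep the last snapshot in each interval_ms bucket.

    `snapshots` is an iterable of (timestamp_ms, payload) pairs sorted by time.
    Keeping the last item per bucket is an assumption, not the script's
    documented behaviour.
    """
    kept = []  # entries are (bucket, timestamp_ms, payload)
    for ts_ms, payload in snapshots:
        bucket = ts_ms // interval_ms
        if kept and kept[-1][0] == bucket:
            kept[-1] = (bucket, ts_ms, payload)  # overwrite with the newer snapshot
        else:
            kept.append((bucket, ts_ms, payload))
    return [(ts_ms, payload) for _, ts_ms, payload in kept]


# Synthetic one-minute stream with one snapshot every 100 ms.
stream = [(ts, {"seq": i}) for i, ts in enumerate(range(0, 60_000, 100))]
sampled = downsample(stream)
print(len(stream), "snapshots reduced to", len(sampled))
```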

Stage 3: Backtest Strategies

Run scripts/backtest.py:

# Run all 10 strategies
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --output ./reports

# Run specific strategies
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --strategies imbalance,breakout,market_making \
  --output ./reports

# Quick test with limited rows
python scripts/backtest.py \
  --input ./data/processed/BTCUSDT_ob50.parquet \
  --max-rows 100000 \
  --output ./reports

Strategy keys: imbalance, breakout, false_breakout, scalping, momentum, reversal, spoofing, optimal_execution, market_making, latency_arb
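The comma-separated --strategies value can be checked against these keys before a long run. The sketch below is a hypothetical helper, not the parsing code in scripts/backtest.py; only the ten key names come from the list above.

```python
STRATEGY_KEYS = [
    "imbalance", "breakout", "false_breakout", "scalping", "momentum",
    "reversal", "spoofing", "optimal_execution", "market_making", "latency_arb",
]


def parse_strategies(arg):
    """Turn a comma-separated string into a validated list of strategy keys.

    An empty or missing value selects every strategy, mirroring the
    "run all 10 strategies" invocation (assumed behaviour).
    """
    if not arg:
        return list(STRATEGY_KEYS)
    requested = [key.strip() for key in arg.split(",") if key.strip()]
    unknown = [key for key in requested if key not in STRATEGY_KEYS]
    if unknown:
        raise ValueError("unknown strategy keys: " + ", ".join(unknown))
    return requested


selected = parse_strategies("imbalance, breakout,market_making")
print(selected)
```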

Outputs in ./reports/:

{SYMBOL}_backtest_report.json — Full results with equity curves
{SYMBOL}_backtest_report.md — Comparison table and detailed metrics

Report metrics per strategy: total trades, winners/losers, win rate, cumulative PnL, Sharpe ratio, max drawdown (absolute and %), avg PnL per trade, avg hold time, profit factor, best/worst trade, equity curve.
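Several of these metrics follow directly from the list of per-trade PnL values. The sketch below shows one conventional way to compute them; it is not the code in scripts/backtest.py. In particular, the Sharpe ratio here is an unannualized per-trade figure and drawdown is measured from a starting equity of zero, both of which are assumptions, as is the function name summarize_trades.

```python
import math


def summarize_trades(trade_pnls):
    """Summarize a non-empty list of per-trade PnL values (illustrative definitions)."""
    wins = [p for p in trade_pnls if p > 0]
    losses = [p for p in trade_pnls if p < 0]

    # Walk the equity curve, tracking the running peak and the largest drop from it.
    equity_curve, equity, peak, max_drawdown = [], 0.0, 0.0, 0.0
    for pnl in trade_pnls:
        equity += pnl
        equity_curve.append(equity)
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, peak - equity)

    n = len(trade_pnls)
    mean = sum(trade_pnls) / n
    variance = sum((p - mean) ** 2 for p in trade_pnls) / (n - 1) if n > 1 else 0.0
    gross_loss = -sum(losses)

    return {
        "total_trades": n,
        "winners": len(wins),
        "losers": len(losses),
        "win_rate": len(wins) / n,
        "cumulative_pnl": equity,
        "avg_pnl_per_trade": mean,
        # Gross profit divided by gross loss; infinite when there are no losing trades.
        "profit_factor": sum(wins) / gross_loss if gross_loss > 0 else float("inf"),
        "max_drawdown": max_drawdown,
        "sharpe_per_trade": mean / math.sqrt(variance) if variance > 0 else 0.0,
        "best_trade": max(trade_pnls),
        "worst_trade": min(trade_pnls),
        "equity_curve": equity_curve,
    }


report = summarize_trades([10.0, -5.0, 20.0, -10.0, 5.0])
print(report)
```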

For strategy logic and tunable parameters: see references/strategies.md

Customization

To modify strategy parameters, edit the __init__ method of any strategy class in scripts/backtest.py. Each strategy's self.params dict contains all tunables.

To add a new strategy:

Subclass Strategy in scripts/backtest.py
Implement on_snapshot(self, row, idx, df) with entry/exit logic
Register in STRATEGY_MAP
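The three steps might look like the sketch below. The Strategy base class and STRATEGY_MAP defined here are stand-ins so that the example runs on its own; in practice the new class would live in scripts/backtest.py and use the real ones. The SpreadFade class, its parameter names, and the convention of returning "buy", "sell", or None from on_snapshot are all hypothetical, so check the existing strategies for the actual contract.

```python
class Strategy:
    """Stand-in for the base class in scripts/backtest.py (assumed shape)."""

    def __init__(self):
        self.params = {}

    def on_snapshot(self, row, idx, df):
        raise NotImplementedError


STRATEGY_MAP = {}  # stand-in for the registry in scripts/backtest.py


class SpreadFade(Strategy):
    """Hypothetical strategy: trade with the book imbalance while the spread is tight."""

    def __init__(self):
        super().__init__()
        # All tunables live in self.params, as described above.
        self.params = {"max_spread": 0.5, "imbalance_entry": 0.3}

    def on_snapshot(self, row, idx, df):
        # `row` is assumed to expose the processed columns by name.
        if row["spread"] > self.params["max_spread"]:
            return None
        if row["volume_imbalance"] > self.params["imbalance_entry"]:
            return "buy"
        if row["volume_imbalance"] < -self.params["imbalance_entry"]:
            return "sell"
        return None


STRATEGY_MAP["spread_fade"] = SpreadFade

strategy = STRATEGY_MAP["spread_fade"]()
signal = strategy.on_snapshot({"spread": 0.1, "volume_imbalance": 0.4}, 0, None)
print(signal)
```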

Troubleshooting

Selenium can't load ByBit page: ByBit uses Cloudflare. Ensure undetected-chromedriver is up to date. Try --no-headless to debug visually. Fall back to manual download.

Out of memory during processing: Use --sample-interval 1s or a larger interval. Process one day at a time.

No trades generated: Strategy thresholds may be too tight for the data period. Relax the parameters (lower thresholds, shorter lookbacks) in the strategy's self.params dict in scripts/backtest.py; references/strategies.md describes each tunable.