bybit-orderbook-backtester
This skill downloads, processes, and backtests ByBit order book data using scripts/download_orderbook.py, scripts/process_orderbook.py, and scripts/backtest.py and writes outputs to ./data/raw, ./data/processed, and ./reports. It contains explicit shell commands (e.g., pip install ..., python scripts/...) and performs network access to https://www.bybit.com/derivatives/en/history-data.
ByBit Order Book Backtester
End-to-end pipeline: download → process → backtest → report.
Dependencies
pip install undetected-chromedriver selenium pandas numpy pyarrow --break-system-packages
Chrome/Chromium must be installed for Selenium.
Workflow
The pipeline has 3 stages. Run them sequentially, or skip to later stages if data is already available.
Stage 1: Download Order Book Data
Prompt the user for:
- Symbol (default: BTCUSDT)
- Date range (default: last 30 days)
Run scripts/download_orderbook.py:
python scripts/download_orderbook.py \
--symbol BTCUSDT \
--start 2024-06-01 --end 2024-06-30 \
--output ./data/raw
Key details:
- Downloads from
https://www.bybit.com/derivatives/en/history-data - Automatically chunks into 7-day windows (ByBit's limit)
- Uses
undetected-chromedriverfor Cloudflare bypass - Outputs: ZIP files in
./data/raw/named{date}_{symbol}_ob500.data.zip - For data format details: see
references/bybit_data_format.md
If Selenium fails (Cloudflare blocks, UI changes): Instruct the user to manually download from the ByBit page and place ZIPs in ./data/raw/.
Stage 2: Process & Filter to Depth 50
Run scripts/process_orderbook.py:
python scripts/process_orderbook.py \
--input ./data/raw \
--output ./data/processed \
--depth 50 \
--sample-interval 1s
What it does:
- Reads JSONL from ZIPs (each line = full 500-level L2 snapshot)
- Filters to top 50 bid/ask levels
- Computes derived features: mid_price, spread, volume_imbalance, microprice
- Optionally downsamples (e.g.,
1s,5s,1min) — recommended for faster backtests - Outputs: Parquet files in
./data/processed/
Without downsampling: ~860K snapshots/day, ~300 MB Parquet per day per symbol. With 1s downsampling: ~86K snapshots/day, ~5 MB per day — much more practical.
Stage 3: Backtest Strategies
Run scripts/backtest.py:
# Run all 10 strategies
python scripts/backtest.py \
--input ./data/processed/BTCUSDT_ob50.parquet \
--output ./reports
# Run specific strategies
python scripts/backtest.py \
--input ./data/processed/BTCUSDT_ob50.parquet \
--strategies imbalance,breakout,market_making \
--output ./reports
# Quick test with limited rows
python scripts/backtest.py \
--input ./data/processed/BTCUSDT_ob50.parquet \
--max-rows 100000 \
--output ./reports
Strategy keys: imbalance, breakout, false_breakout, scalping, momentum, reversal, spoofing, optimal_execution, market_making, latency_arb
Outputs in ./reports/:
{SYMBOL}_backtest_report.json— Full results with equity curves{SYMBOL}_backtest_report.md— Comparison table and detailed metrics
Report metrics per strategy: total trades, winners/losers, win rate, cumulative PnL, Sharpe ratio, max drawdown (absolute and %), avg PnL per trade, avg hold time, profit factor, best/worst trade, equity curve.
For strategy logic and tunable parameters: see references/strategies.md
Customization
To modify strategy parameters, edit the __init__ method of any strategy class in scripts/backtest.py. Each strategy's self.params dict contains all tunables.
To add a new strategy:
- Subclass
Strategyinscripts/backtest.py - Implement
on_snapshot(self, row, idx, df)with entry/exit logic - Register in
STRATEGY_MAP
Troubleshooting
Selenium can't load ByBit page: ByBit uses Cloudflare. Ensure undetected-chromedriver is up to date. Try --no-headless to debug visually. Fall back to manual download.
Out of memory on processing: Use --sample-interval 1s or larger. Process one day at a time.
No trades generated: Strategy thresholds may be too tight for the data period. Relax parameters (lower thresholds, shorter lookbacks) in references/strategies.md.