Depth Data Analysis
Binance provides historical best-bid/ask data, including:
- best_bid_price: Highest buy price
- best_bid_qty: Volume at best bid
- best_ask_price: Lowest sell price
- best_ask_qty: Volume at best ask
- transaction_time: Timestamp
This dataset excludes deeper order book levels. Our analysis focuses on YGG/USDT on August 7th, a day of extreme volatility with over 9 million data points. Key observations:
- Spread (difference between best ask and bid prices) exceeded 1 tick 20% of the time, indicating unusual market stress.
- Order book imbalances between buy/sell sides were frequent, influencing short-term price movements.
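The spread statistic above can be reproduced with a short calculation. The following is a minimal sketch, not the original analysis code: the helper name `wide_spread_share`, the sample quotes, and the tick size of 0.0001 are all illustrative assumptions, and prices are parsed as `Decimal` so that a spread of exactly one tick is not misclassified by floating-point rounding.

```python
from decimal import Decimal


def wide_spread_share(quotes, tick_size):
    """Return the fraction of quotes whose spread is wider than one tick.

    quotes: iterable of (best_bid_price, best_ask_price) pairs given as strings.
    tick_size: minimum price increment as a string (assumed, not from the data).
    """
    tick = Decimal(tick_size)
    total = 0
    wide = 0
    for bid, ask in quotes:
        spread = Decimal(ask) - Decimal(bid)  # best ask minus best bid
        total += 1
        if spread > tick:
            wide += 1
    return wide / total if total else 0.0


# Made-up quotes for illustration only; real rows would come from the dataset.
sample_quotes = [
    ("0.4000", "0.4001"),
    ("0.4000", "0.4002"),
    ("0.4001", "0.4002"),
    ("0.3998", "0.4001"),
    ("0.4000", "0.4001"),
]
share = wide_spread_share(sample_quotes, "0.0001")
print(f"{share:.0%} of quotes have a spread wider than one tick")
```

On the real file the same function would be fed the best_bid_price and best_ask_price columns, with the tick size taken from the exchange's symbol specification.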
Order Book Imbalance and Mid-Price
The imbalance ratio (I) quantifies asymmetry between buy/sell liquidity:
\[
I = \frac{Q_b - Q_a}{Q_b + Q_a}
\]
Where:
- \(Q_b\) = best_bid_qty
- \(Q_a\) = best_ask_qty
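As a quick illustration of the definition, the sketch below computes I for a few hand-picked quantities. The function name `imbalance` and the guard against an empty book are assumptions added for the example.

```python
def imbalance(bid_qty, ask_qty):
    """Order book imbalance I = (Q_b - Q_a) / (Q_b + Q_a), bounded in [-1, 1]."""
    total = bid_qty + ask_qty
    if total <= 0:
        raise ValueError("best bid and ask quantities must sum to a positive value")
    return (bid_qty - ask_qty) / total


print(imbalance(300.0, 100.0))  # bid-heavy book: 0.5
print(imbalance(100.0, 300.0))  # ask-heavy book: -0.5
print(imbalance(250.0, 250.0))  # balanced book: 0.0
```

Values near 1 mean almost all top-of-book liquidity sits on the bid; values near -1 mean it sits on the ask.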
Mid-price (mid) is traditionally calculated as:
\[
mid = \frac{best\_bid\_price + best\_ask\_price}{2}
\]
Empirical data shows a correlation between I and future mid-price changes: higher I values predict upward price momentum.
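One simple way to check this relationship on a quote file is to average the subsequent mid-price change separately for bid-heavy and ask-heavy snapshots. The sketch below assumes a hypothetical helper `mean_future_move_by_sign` and a one-update horizon. The rows are synthetic, with prices in arbitrary units, and were constructed only to show the mechanics; they are not evidence for the claim.

```python
def mid_price(bid, ask):
    """Simple mid-price: average of best bid and best ask."""
    return (bid + ask) / 2


def mean_future_move_by_sign(rows, horizon=1):
    """Average mid-price change over `horizon` updates, split by the sign of I.

    rows: list of (best_bid_price, best_bid_qty, best_ask_price, best_ask_qty).
    Returns a dict with keys "bid_heavy" (I > 0) and "ask_heavy" (I < 0);
    snapshots with I == 0 are skipped.
    """
    mids = [mid_price(bid, ask) for bid, _, ask, _ in rows]
    moves = {"bid_heavy": [], "ask_heavy": []}
    for t in range(len(rows) - horizon):
        _, q_b, _, q_a = rows[t]
        i_t = (q_b - q_a) / (q_b + q_a)
        change = mids[t + horizon] - mids[t]
        if i_t > 0:
            moves["bid_heavy"].append(change)
        elif i_t < 0:
            moves["ask_heavy"].append(change)
    return {k: (sum(v) / len(v) if v else None) for k, v in moves.items()}


# Synthetic snapshots (illustrative only).
rows = [
    (100.0, 900.0, 101.0, 100.0),
    (101.0, 200.0, 102.0, 800.0),
    (100.0, 700.0, 101.0, 300.0),
    (101.0, 500.0, 102.0, 500.0),
    (101.0, 100.0, 102.0, 400.0),
    (100.0, 300.0, 101.0, 300.0),
]
result = mean_future_move_by_sign(rows)
print(result)
```

On real data the two averages would be noisy, so a finer bucketing of I and a longer sample would be needed before drawing conclusions.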
Weighted Mid-Price Optimization
To improve predictions, we define a weighted mid-price incorporating imbalance:
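\[
weighted\_mid = best\_bid\_price \times (1 - w) + best\_ask\_price \times w
\]
Where the weight \(w = \frac{Q_b}{Q_a + Q_b} = \frac{1 + I}{2}\). The ask price is weighted by the bid-side share of liquidity, so a bid-heavy book (I near 1) pulls the estimate toward the ask. Substituting w gives the equivalent form \(weighted\_mid = mid + spread \times \frac{I}{2}\).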
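The sketch below computes the weighted mid-price for one quote and checks it against the equivalent form mid + spread × I/2. It assumes the ask price is weighted by the bid-side share, w = Q_b / (Q_a + Q_b), which is the choice consistent with that form and with higher I implying upward pressure. The function name and the sample numbers are illustrative.

```python
def weighted_mid(bid, bid_qty, ask, ask_qty):
    """Imbalance-weighted mid: bid * (1 - w) + ask * w, with w = Q_b / (Q_a + Q_b)."""
    w = bid_qty / (ask_qty + bid_qty)  # assumed weighting, see lead-in
    return bid * (1 - w) + ask * w


bid, ask = 100.0, 102.0
bid_qty, ask_qty = 300.0, 100.0

wm = weighted_mid(bid, bid_qty, ask, ask_qty)
mid = (bid + ask) / 2
imb = (bid_qty - ask_qty) / (bid_qty + ask_qty)

print(wm)                            # 101.5
print(mid + (ask - bid) * imb / 2)   # 101.5, same value via mid + spread * I / 2
```

With three times more quantity on the bid than on the ask, the estimate sits three quarters of the way from the bid to the ask rather than at the midpoint.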
Findings:
- Weighted mid-price reduces prediction errors compared to simple averaging.
- Non-linear relationships emerge at extreme I values (near -1 or 1), suggesting the need for higher-order adjustments.
Enhanced Model:
A polynomial correction term improves accuracy:
\[
weighted\_mid^* = mid + spread \times \left(\frac{I}{2} + \sum_{k=1}^4 a_{2k-1} I^{2k-1}\right)
\]
Where coefficients \(a_{2k-1}\) are calibrated to market data (e.g., \(N=8\) provides optimal fit).
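The corrected estimate can be evaluated as in the sketch below. Because only odd powers of I appear, the correction flips sign with the imbalance. The function name and the coefficient values are placeholders chosen for illustration; real coefficients would have to be fitted to market data, and the fitting step is not shown.

```python
def corrected_weighted_mid(bid, bid_qty, ask, ask_qty, odd_coeffs):
    """Evaluate mid + spread * (I/2 + sum_k a_{2k-1} * I**(2k-1)).

    odd_coeffs: sequence (a_1, a_3, a_5, a_7) of hypothetical coefficients.
    """
    mid = (bid + ask) / 2
    spread = ask - bid
    imb = (bid_qty - ask_qty) / (bid_qty + ask_qty)
    # k = 1, 2, 3, 4 maps to the odd powers 1, 3, 5, 7.
    correction = sum(a * imb ** (2 * k - 1) for k, a in enumerate(odd_coeffs, start=1))
    return mid + spread * (imb / 2 + correction)


# With all coefficients zero the formula reduces to mid + spread * I / 2.
zero_coeffs = (0.0, 0.0, 0.0, 0.0)
base = corrected_weighted_mid(100.0, 300.0, 102.0, 100.0, zero_coeffs)
print(base)  # 101.5

# Placeholder coefficients (not calibrated): a cubic term that damps the shift.
demo_coeffs = (0.0, -0.5, 0.0, 0.0)
up = corrected_weighted_mid(100.0, 300.0, 102.0, 100.0, demo_coeffs)
down = corrected_weighted_mid(100.0, 100.0, 102.0, 300.0, demo_coeffs)
print(up, down)  # 101.375 100.625, symmetric around the mid of 101.0
```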
Key Takeaways
- Precision Matters: Mid-price models must evolve beyond static averages to capture liquidity dynamics.
- Data Integration: Real-world strategies should incorporate:
  - Multi-level order book data
  - Trade execution logs for validation
- Advanced Techniques: Explore micro-price theory (e.g., Stoikov’s 2017 Markov chain approach) for probabilistic weighting.
FAQ Section
Q1: Why is spread analysis important in high-frequency trading?
A: Spread reflects liquidity and market stress. Wide spreads indicate volatility, impacting execution costs and slippage.
Q2: How does order book imbalance predict price movements?
A: Imbalances signal short-term supply/demand pressure. Thin buy-side liquidity often precedes downward price breaks.
Q3: When should I use weighted mid-price vs. micro-price?
A: Weighted mid-price suits latency-sensitive strategies; micro-price excels in probabilistic execution scenarios.
Q4: Can these models be automated?
A: Yes. These calculations can be coded directly, and Python backtesting libraries can be used to test a strategy on historical data before it is deployed in real time.
Q5: What’s the next frontier in mid-price modeling?
A: Machine learning hybrids that adapt to regime shifts (e.g., liquidity crises or flash crashes).