This guide explains how to download historical market data for several asset classes (Chinese A-shares, Hong Kong stocks, futures, and cryptocurrencies) for strategy backtesting, and how to organize and keep those datasets up to date.
Key Asset Classes and Data Sources
1. Chinese A-Shares (Stocks)
Primary Sources:
- Baostock: Offers extensive historical data, but the finest granularity for stocks is 5-minute bars.
Script file: script/crontab/reboot_sync_a_klines.py
- Goldminer: Supports only 90 days of minute-level data in its latest version.
Script file: script/crontab/reboot_sync_gm_a_klines.py
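When a source serves only a recent window of minute data, it can help to trim each request to that window before calling the API. The following is a minimal sketch, assuming the 90-day limit is counted in calendar days back from today (the guide does not say whether calendar or trading days are meant). The function name `clamp_minute_range` and the constant are hypothetical and not part of any Goldminer API.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

# Assumption: the limit is 90 calendar days back from "today".
MINUTE_HISTORY_DAYS = 90


def clamp_minute_range(start: date, end: date, today: date) -> Optional[Tuple[date, date]]:
    """Trim [start, end] to the window a 90-day minute-data source can serve.

    Returns None when the whole requested range is older than the window.
    """
    earliest = today - timedelta(days=MINUTE_HISTORY_DAYS)
    clamped_start = max(start, earliest)
    clamped_end = min(end, today)
    if clamped_start > clamped_end:
        return None
    return clamped_start, clamped_end


# A request reaching back to January is trimmed to the last 90 days.
print(clamp_minute_range(date(2024, 1, 1), date(2024, 6, 30), today=date(2024, 6, 30)))
```

Running the sync script more often than once per 90 days keeps consecutive windows overlapping, so no minute bars are lost between runs.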
Best Practices:
- Focus on adjusted (backward-adjusted) data to avoid dividend complications.
- Download only necessary tickers/timeframes to optimize storage and speed.
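A small worked example may show why adjusted prices matter. The sketch below uses made-up prices for a hypothetical stock with a 2-for-1 split and assumes the usual definition of backward adjustment: early prices are left as they are, and later prices are multiplied by a cumulative factor. The factors here are invented for illustration; real ones come from the data provider.

```python
def backward_adjust(closes, factors):
    """Multiply each raw close by its cumulative adjustment factor."""
    return [round(c * f, 4) for c, f in zip(closes, factors)]


def simple_returns(prices):
    """Bar-to-bar simple returns, rounded to 4 decimals."""
    return [round(b / a - 1, 4) for a, b in zip(prices, prices[1:])]


# Hypothetical data: a 2-for-1 split takes effect before the third bar.
raw_closes = [10.0, 10.2, 5.2, 5.3]
factors = [1.0, 1.0, 2.0, 2.0]  # cumulative factors (illustrative only)

adjusted = backward_adjust(raw_closes, factors)

print(simple_returns(raw_closes))  # second return is a spurious drop of about 49%
print(simple_returns(adjusted))    # the series stays continuous across the split
```

A backtest run on the raw series would see the split as a crash and might trigger stops or signals that never existed in practice.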
2. Hong Kong Stocks (HKEX)
Primary Source:
- Futu API: Limited by ticker quotas; ideal for selective downloads.
Script file: script/crontab/reboot_sync_hk_klines.py
Limitations:
- Monitor API call restrictions to avoid service interruptions.
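One way to stay under a call quota is to throttle requests on the client side. The sketch below is a generic sliding-window throttle, not part of the Futu API; the class name `CallThrottle` and the limits shown in the usage note are hypothetical, so check the provider's documented quota before choosing values.

```python
import time
from collections import deque


class CallThrottle:
    """Allow at most `max_calls` calls within any sliding `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._stamps = deque()   # times of the most recent calls

    def wait(self):
        """Block until one more call is allowed, then record it."""
        now = self._clock()
        # Forget calls that have left the sliding window.
        while self._stamps and now - self._stamps[0] >= self.period:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            # Sleep until the oldest recorded call expires.
            self._sleep(self.period - (now - self._stamps[0]))
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
```

In a download loop, create one throttle (for example `CallThrottle(max_calls=10, period=30)`, with placeholder numbers) and call `throttle.wait()` immediately before each API request.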
3. Futures Contracts
Primary Sources:
- Tianqin Pro: Requires a paid subscription (15-day free trial available).
Script file: script/crontab/reboot_sync_futures_klines.py
- Goldminer Futures: Similar to stocks, limited to 90-day minute data.
Script file: script/crontab/reboot_sync_gm_futures_klines.py
Pro Tip:
- Use the trial period to download critical historical data for long-term backtesting.
4. Cryptocurrencies (Binance)
Primary Source:
- Binance USDT Perpetual API: Covers all trading pairs.
Script file: script/crontab/reboot_sync_currency_klines.py
Notes:
- The ExchangeBinance class auto-updates the database with new klines.
- Clear existing tables if initial downloads fail due to timestamp conflicts.
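The internals of the ExchangeBinance class are not shown in this guide, so the following is only a sketch of how incremental kline storage can avoid timestamp conflicts in general. It uses the standard-library `sqlite3` module as a stand-in for whatever database the project uses; the table layout and the function names `open_store`, `last_open_time`, and `upsert_klines` are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a kline store keyed by symbol, frequency, and bar time."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS klines (
               symbol    TEXT NOT NULL,
               frequency TEXT NOT NULL,
               open_time INTEGER NOT NULL,  -- bar open time, ms since epoch
               open REAL, high REAL, low REAL, close REAL, volume REAL,
               PRIMARY KEY (symbol, frequency, open_time)
           )"""
    )
    return conn


def last_open_time(conn, symbol, frequency):
    """Return the newest stored bar time, or None if nothing is stored yet."""
    row = conn.execute(
        "SELECT MAX(open_time) FROM klines WHERE symbol = ? AND frequency = ?",
        (symbol, frequency),
    ).fetchone()
    return row[0]


def upsert_klines(conn, symbol, frequency, bars):
    """Insert bars; a bar with an existing key overwrites the stored row.

    bars: iterable of (open_time, open, high, low, close, volume) tuples.
    """
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR REPLACE INTO klines VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            [(symbol, frequency, *bar) for bar in bars],
        )
```

With a composite key like this, re-downloading an overlapping range overwrites rows instead of failing, and `last_open_time` tells the next run where to resume.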
Example Workflow:
# Daily cron job to append fresh data
0 3 * * * /path/to/reboot_sync_currency_klines.py
5. Pre-Packaged Data via QQ Group
Download ready-to-use datasets for local testing:
A. VNPY-Compatible Data
- File: vnpy_mysql_data.zip
- Contents: Select stocks/futures with 1-minute bars since 2019.
B. Cryptocurrency Data
- File: chanlun_currency_data.zip
- Pairs: BTC, ETH, EOS, ETC across 8 timeframes (weekly to 1-minute).
C. Futures Data
- File: chanlun_futures_data.zip
- Contracts: 100+ futures with multi-year 1-minute granularity.
FAQs
Q1: How often should I update my historical data?
Schedule daily downloads for equities/crypto; weekly for less volatile assets.
Q2: Why use backward-adjusted data for stocks?
It automatically accounts for corporate actions (splits/dividends), simplifying backtests.
Q3: Are there free alternatives to Tianqin Pro for futures?
Goldminer's 90-day data is the best free tier, but long-term backtests require paid solutions.
Q4: Can I automate all downloads?
Yes! Use cron jobs (Linux) or Task Scheduler (Windows) to run scripts periodically.
Final Tip: Always verify data completeness and consistency before backtesting.
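A basic check of this kind can be sketched in a few lines. The example below looks for missing bars and for bars whose high/low values contradict their open/close values. The function names and the dictionary layout of a bar are assumptions made for illustration. Note that for markets with trading sessions (stocks, futures), overnight and lunch breaks will also appear as gaps and need to be filtered separately; for 24/7 crypto markets the check applies directly.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step):
    """Return (previous, next) pairs where consecutive bars are more than `step` apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > step]


def invalid_bars(bars):
    """Return bars whose high is below, or low is above, the open/close range.

    bars: iterable of dicts with "open", "high", "low", and "close" keys.
    """
    bad = []
    for bar in bars:
        top = max(bar["open"], bar["close"])
        bottom = min(bar["open"], bar["close"])
        if bar["high"] < top or bar["low"] > bottom:
            bad.append(bar)
    return bad


# Example: the bars for 00:02 and 00:03 are missing from a 1-minute series.
t0 = datetime(2024, 1, 1, 0, 0)
series = [t0, t0 + timedelta(minutes=1), t0 + timedelta(minutes=4)]
print(find_gaps(series, timedelta(minutes=1)))
```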