XGBoost 5-Day Stock Price Predictor for Swing Trading
Author: mohan170802
Models: mohan170802/stock-price-predictor-xgboost
A production-ready XGBoost ensemble trained to predict 5-day-ahead stock price movements for swing trading. Two models per ticker:
- Regression model: predicts the log-return over the next 5 trading days
- Classification model: predicts the probability that the price goes UP over the next 5 trading days
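As an illustration of the two targets, the sketch below builds both from a list of daily closes. It is a dependency-free example, not the code in stock_predictor.py; the function name `make_targets` and the rule that a return of exactly zero is labelled "not up" are assumptions.

```python
import math

def make_targets(closes, horizon=5):
    """Return (log_returns, up_labels) for every day that has a close
    `horizon` trading days later. Hypothetical helper for illustration."""
    log_returns = []
    up_labels = []
    for t in range(len(closes) - horizon):
        # Regression target: log of the price ratio over the horizon.
        r = math.log(closes[t + horizon] / closes[t])
        log_returns.append(r)
        # Classification target: 1 if the price rose, else 0 (assumed tie rule).
        up_labels.append(1 if r > 0 else 0)
    return log_returns, up_labels

closes = [100.0, 101.0, 102.0, 101.5, 103.0, 104.0, 99.0, 105.0]
log_returns, up_labels = make_targets(closes)
```

The last `horizon` days have no target yet, so they can be used for inference but not for training.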
Supported Tickers
| Ticker | Sector | Description |
|---|---|---|
| SPY | Broad Market | S&P 500 ETF |
| QQQ | Tech-heavy | Nasdaq-100 ETF |
| AAPL | Technology | Apple Inc. |
| MSFT | Technology | Microsoft Corp. |
| TSLA | Automotive | Tesla Inc. |
| NVDA | Technology | NVIDIA Corp. |
| AMD | Technology | Advanced Micro Devices |
| META | Technology | Meta Platforms |
| JPM | Financials | JPMorgan Chase |
| XOM | Energy | Exxon Mobil Corp. |
Model Architecture
- Library: XGBoost 3.2.0
- Type: Gradient Boosted Decision Trees (native Booster API)
- Features: 66 engineered features per sample
- Training data: 15 years of daily OHLCV (2011–2026)
- Validation: Expanding-window walk-forward (5 folds, time-aware)
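An expanding-window walk-forward split can be sketched as follows. This is a minimal illustration under stated assumptions: the equal fold sizing and the `gap` parameter (which keeps rows whose 5-day target overlaps the test period out of training) are not confirmed by the source.

```python
def expanding_window_splits(n_samples, n_folds=5, gap=0):
    """Yield (train_range, test_range) pairs in time order.
    The training window always starts at 0 and grows with each fold."""
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        test_start = train_end + gap
        # The final fold absorbs any leftover samples.
        test_end = n_samples if k == n_folds else train_end + fold_size
        yield range(0, train_end), range(test_start, test_end)

splits = list(expanding_window_splits(n_samples=120, n_folds=5, gap=5))
for train_idx, test_idx in splits:
    print(f"train 0..{train_idx.stop - 1}  test {test_idx.start}..{test_idx.stop - 1}")
```

Every test block lies strictly after its training block, so no fold is evaluated on data older than what it was trained on.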
Feature Engineering
| Category | Features |
|---|---|
| Lag features | Close & volume lags at 1,2,3,5,10,20 days |
| Returns | 1d, 2d, 3d, 5d, 10d, 20d returns |
| Moving averages | SMA & EMA at 5,10,20,50 windows + price ratios |
| Momentum | RSI(14), MACD, MACD signal, MACD histogram |
| Volatility | Bollinger Bands (width, position), ATR(14), rolling std dev |
| Volume | OBV, volume SMA, volume ratio |
| Candlestick | Daily range, upper/lower shadow, body size |
| Calendar | Day-of-week, month, quarter, year, month-start/end flags |
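To illustrate how a few of the features in the table can be built from past data only, here is a small dependency-free sketch. The feature names, the use of simple (not log) returns, and the helper `features_at` are assumptions for illustration; the full pipeline lives in stock_predictor.py.

```python
def sma(values, window):
    """Simple moving average of the last `window` values."""
    return sum(values[-window:]) / window

def features_at(closes, t, lags=(1, 2, 3), sma_window=5):
    """Build a feature row for day t from closes[0..t] only,
    so nothing after day t can leak into the features."""
    history = closes[: t + 1]
    row = {}
    for lag in lags:
        row[f"close_lag_{lag}"] = history[-1 - lag]
        row[f"return_{lag}d"] = history[-1] / history[-1 - lag] - 1.0
    row[f"close_to_sma_{sma_window}"] = history[-1] / sma(history, sma_window)
    return row

closes = [100.0, 102.0, 101.0, 103.0, 105.0, 104.0, 200.0]
row = features_at(closes, t=5)  # the 200.0 on day 6 is never read
```

Slicing the history at `t` before computing anything is one simple way to make the "strictly past data" property easy to check.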
Performance Summary
| Metric | Mean ± Std |
|---|---|
| Regression RMSE | 0.0471 ± 0.0203 |
| Regression R² | −0.0116 ± 0.0296 |
| Directional Accuracy | 0.5683 ± 0.0353 |
| Classification Accuracy | 0.4877 ± 0.0210 |
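Key insight: Exact price prediction is extremely hard (R² ≈ 0, so the regression explains essentially none of the variance in 5-day returns). The directional signal at ~57% beats random chance (50%) and is actionable for swing trades when combined with risk management. The classification model's accuracy (~49%), by contrast, is not above chance.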
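The source does not define directional accuracy. One common definition, assumed here, is the fraction of samples where the predicted and realized returns have the same sign. The numbers below are made up for illustration.

```python
def directional_accuracy(y_true, y_pred):
    """Fraction of samples where prediction and outcome agree on 'up' vs 'not up'."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if (t > 0) == (p > 0))
    return hits / len(y_true)

# Hypothetical realized and predicted 5-day log-returns.
realized = [0.012, -0.008, 0.020, -0.015, 0.004]
predicted = [0.003, 0.001, 0.010, -0.002, -0.001]
acc = directional_accuracy(realized, predicted)
```

Under this definition a regression model can score well on direction while its R² stays near zero, because only the sign of each prediction is compared.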
Per-Ticker Breakdown
| Ticker | Directional Acc | Classification Acc | RMSE |
|---|---|---|---|
| SPY | 0.614 | 0.488 | 0.0226 |
| QQQ | 0.564 | 0.506 | 0.0331 |
| AAPL | 0.569 | 0.482 | 0.0306 |
| MSFT | 0.589 | 0.490 | 0.0334 |
| TSLA | 0.543 | 0.512 | 0.0804 |
| NVDA | 0.543 | 0.476 | 0.0497 |
| AMD | 0.585 | 0.494 | 0.0544 |
| META | 0.601 | 0.482 | 0.0414 |
| JPM | 0.569 | 0.507 | 0.0304 |
| XOM | 0.568 | 0.506 | 0.0307 |
Files in this Repository
summary.json # Complete training metrics & feature list
TICKER_reg_5d.json # XGBoost regression model (log-return target)
TICKER_clf_5d.json # XGBoost classification model (direction target)
TICKER_reg_5d.pkl # Pickled Booster (convenience)
TICKER_clf_5d.pkl # Pickled Booster (convenience)
stock_predictor.py # Training & feature engineering code
inference.py # Simple inference script
README.md # This file
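The `TICKER` placeholder in the file names above stands for one of the supported symbols, as in `SPY_reg_5d.json`. A small helper for building these names might look like the sketch below. `model_filename` is a hypothetical function, not part of the repository.

```python
SUPPORTED_TICKERS = ("SPY", "QQQ", "AAPL", "MSFT", "TSLA",
                     "NVDA", "AMD", "META", "JPM", "XOM")

def model_filename(ticker, kind, ext="json"):
    """Build a repository file name such as 'SPY_reg_5d.json'.
    kind is 'reg' (regression) or 'clf' (classification)."""
    ticker = ticker.upper()
    if ticker not in SUPPORTED_TICKERS:
        raise ValueError(f"unsupported ticker: {ticker}")
    if kind not in ("reg", "clf"):
        raise ValueError(f"unknown model kind: {kind}")
    if ext not in ("json", "pkl"):
        raise ValueError(f"unknown extension: {ext}")
    return f"{ticker}_{kind}_5d.{ext}"

name = model_filename("spy", "reg")
```

The resulting string can be passed as the `filename` argument when downloading a single model file from the Hub.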
Usage
Quick inference (single ticker)
import numpy as np
import pandas as pd
import xgboost as xgb
import yfinance as yf
from huggingface_hub import hf_hub_download
# 1. Load the regression model from the Hub
model_path = hf_hub_download(
    repo_id="mohan170802/stock-price-predictor-xgboost",
    filename="SPY_reg_5d.json",
)
model = xgb.Booster()
model.load_model(model_path)
# 2. Fetch latest data & engineer features
# (see stock_predictor.py in source for full feature pipeline)
# ...
# This step must define:
#   latest_features: 2-D array with one row of the 66 engineered features
#   feature_names:   the 66 feature names, in training order (summary.json contains the feature list)
#   current_price:   the most recent close
# 3. Predict the 5-day log-return and convert it to a price
X = xgb.DMatrix(latest_features, feature_names=feature_names)
predicted_log_return = model.predict(X)[0]
predicted_price = current_price * np.exp(predicted_log_return)
Batch inference with the inference script
git clone https://huggingface.co/mohan170802/stock-price-predictor-xgboost
cd stock-price-predictor-xgboost
python inference.py --ticker SPY
Trading Notes
- This is NOT financial advice. Models have ~57% directional accuracy — use strict stop-losses and position sizing.
- Best used as one signal among many (combine with macro analysis, earnings calendar, options flow, etc.).
- Retrain monthly as market regimes evolve; expanding-window CV already simulates this.
- Regression model output is log-return; convert with price * exp(pred).
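As a worked example of the conversion in the last note, the sketch below turns a log-return into a price target and a percentage move. The price and prediction values are hypothetical, not model output.

```python
import math

def price_from_log_return(price, log_return):
    """Convert a predicted log-return into a price target."""
    return price * math.exp(log_return)

current_price = 500.0   # hypothetical latest close
pred = 0.01             # hypothetical 5-day log-return from the regression model

target = price_from_log_return(current_price, pred)
pct_move = (math.exp(pred) - 1.0) * 100.0
print(f"target {target:.2f}, move {pct_move:.2f}%")
```

For small predictions the log-return and the percentage move are nearly equal; they diverge as the predicted move grows.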
Training Details
- Data source: Yahoo Finance via yfinance
- Feature count: 66 (no leakage, all features computed from strictly past data)
- Validation: Expanding-window walk-forward (5 folds)
- Hyperparameters: shallow trees (max_depth=5), strong regularization (α=0.5, λ=1.0), learning_rate=0.03
- Hardware: CPU (no GPU required)
- Training time: ~5 minutes for all 10 tickers
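The hyperparameters listed above map onto XGBoost's native parameter names roughly as sketched below. The objectives (`reg:squarederror`, `binary:logistic`) are assumptions, since the source does not state them, and `make_params` is a hypothetical helper.

```python
def make_params(objective):
    """Parameter dict for XGBoost's native API using the values listed above.
    The objective is an assumption; the source does not name it."""
    return {
        "objective": objective,
        "max_depth": 5,    # shallow trees
        "alpha": 0.5,      # L1 regularization (α)
        "lambda": 1.0,     # L2 regularization (λ)
        "eta": 0.03,       # learning_rate
    }

reg_params = make_params("reg:squarederror")   # log-return target
clf_params = make_params("binary:logistic")    # up/down target
```

A dict like this would be passed as the first argument to `xgboost.train` together with a `DMatrix`. The number of boosting rounds is not given in the source, so it is left out here.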
License
MIT — use at your own risk for research & educational purposes.