XGBoost 5-Day Stock Price Predictor for Swing Trading

Author: mohan170802
Models: mohan170802/stock-price-predictor-xgboost

XGBoost models trained to predict 5-day-ahead stock price movements for swing trading. Each ticker has two models:

  • Regression model: predicts the log-return over the next 5 trading days
  • Classification model: predicts the probability that the price goes UP over the next 5 trading days
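The two targets described above can be sketched as follows. This is a minimal illustration, not the code from stock_predictor.py: the function name `make_targets` and the column names `target_reg` and `target_clf` are hypothetical, and the example prices are made up.

```python
import numpy as np
import pandas as pd

HORIZON = 5  # trading days ahead, as stated above


def make_targets(close: pd.Series, horizon: int = HORIZON) -> pd.DataFrame:
    """Build the forward log-return and the up/down label for each day."""
    # log(P[t+h] / P[t]); the last `horizon` rows have no future price yet
    fwd_log_ret = np.log(close.shift(-horizon) / close)
    # 1.0 if the price rises over the horizon, 0.0 otherwise, NaN where unknown
    up = (fwd_log_ret > 0).astype(float).where(fwd_log_ret.notna())
    return pd.DataFrame({"target_reg": fwd_log_ret, "target_clf": up})


# Made-up closing prices, for illustration only
close = pd.Series([100.0, 101.0, 99.0, 102.0, 103.0, 105.0, 98.0, 104.0])
targets = make_targets(close)
```

Rows whose target is NaN (the most recent 5 days) cannot be used for training, since their outcome is not yet known.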

Supported Tickers

Ticker  Sector        Description
SPY     Broad Market  S&P 500 ETF
QQQ     Tech-heavy    Nasdaq-100 ETF
AAPL    Technology    Apple Inc.
MSFT    Technology    Microsoft Corp.
TSLA    Automotive    Tesla Inc.
NVDA    Technology    NVIDIA Corp.
AMD     Technology    Advanced Micro Devices
META    Technology    Meta Platforms
JPM     Financials    JPMorgan Chase
XOM     Energy        Exxon Mobil Corp.

Model Architecture

  • Library: XGBoost 3.2.0
  • Type: Gradient Boosted Decision Trees (native Booster API)
  • Features: 66 engineered features per sample
  • Training data: 15 years of daily OHLCV (2011–2026)
  • Validation: Expanding-window walk-forward (5 folds, time-aware)
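As an illustration of expanding-window walk-forward validation, the sketch below generates 5 time-ordered folds in which each training set starts at the first row and grows, and each test block lies strictly after it. The function `expanding_window_splits` and the equal-sized blocks are assumptions made for this example; the actual fold boundaries used in stock_predictor.py are not stated here.

```python
import numpy as np


def expanding_window_splits(n_samples: int, n_folds: int = 5):
    """Yield (train_idx, test_idx) pairs where training always precedes testing."""
    # Split the timeline into n_folds + 1 blocks; block 0 is train-only.
    fold_size = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        # The final fold absorbs any leftover rows.
        test_end = n_samples if k == n_folds else train_end + fold_size
        yield np.arange(train_end), np.arange(train_end, test_end)


splits = list(expanding_window_splits(n_samples=120, n_folds=5))
```

With 120 samples, the first fold trains on rows 0 to 19 and tests on rows 20 to 39; the last fold trains on rows 0 to 99 and tests on rows 100 to 119. Because the target looks 5 days ahead, one might also drop the last 5 training rows before each test block; whether the original code does this is not stated.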

Feature Engineering

Category         Features
Lag features     Close & volume lags at 1, 2, 3, 5, 10, 20 days
Returns          1d, 2d, 3d, 5d, 10d, 20d returns
Moving averages  SMA & EMA at 5, 10, 20, 50 windows + price ratios
Momentum         RSI(14), MACD, MACD signal, MACD histogram
Volatility       Bollinger Bands (width, position), ATR(14), rolling std dev
Volume           OBV, volume SMA, volume ratio
Candlestick      Daily range, upper/lower shadow, body size
Calendar         Day-of-week, month, quarter, year, month-start/end flags
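A few of the features in the table can be computed with pandas roughly as shown below. This is a sketch under assumptions, not the pipeline in stock_predictor.py: the function and column names are hypothetical, the MACD spans (12/26/9) and Bollinger settings (20 days, 2 standard deviations) are the conventional defaults rather than confirmed values, and the RSI uses simple rolling means instead of Wilder smoothing. Every feature uses only the current and earlier rows.

```python
import numpy as np
import pandas as pd


def add_example_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add a handful of indicators from the table to a frame with a Close column."""
    out = df.copy()
    close = out["Close"]

    # Lag and return features
    out["close_lag_1"] = close.shift(1)
    out["ret_5d"] = close / close.shift(5) - 1

    # Moving average and price-to-average ratio
    out["sma_20"] = close.rolling(20).mean()
    out["close_to_sma_20"] = close / out["sma_20"]

    # RSI(14), simple-moving-average variant
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(14).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(14).mean()
    out["rsi_14"] = 100 - 100 / (1 + avg_gain / avg_loss)

    # MACD: 12/26 EMA difference, 9-period signal line, histogram
    ema_fast = close.ewm(span=12, adjust=False).mean()
    ema_slow = close.ewm(span=26, adjust=False).mean()
    out["macd"] = ema_fast - ema_slow
    out["macd_signal"] = out["macd"].ewm(span=9, adjust=False).mean()
    out["macd_hist"] = out["macd"] - out["macd_signal"]

    # Bollinger Bands: relative width and position of the close inside the band
    std_20 = close.rolling(20).std()
    upper = out["sma_20"] + 2 * std_20
    lower = out["sma_20"] - 2 * std_20
    out["bb_width"] = (upper - lower) / out["sma_20"]
    out["bb_position"] = (close - lower) / (upper - lower)
    return out


# Synthetic random-walk prices standing in for real OHLCV data
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300)))
demo = pd.DataFrame({"Close": prices})
features = add_example_features(demo)
```

One quick leakage check is to recompute the features on a truncated history and confirm that the earlier rows do not change.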

Performance Summary

Metric                   Mean ± Std
Regression RMSE          0.0471 ± 0.0203
Regression R²            −0.0116 ± 0.0296
Directional Accuracy     0.5683 ± 0.0353
Classification Accuracy  0.4877 ± 0.0210

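Key insight: Predicting the size of the move is extremely hard (R² ≈ 0, slightly negative on average). The directional accuracy of about 57% is above the 50% coin-flip baseline, whereas the standalone classification model (about 49%) is not. The directional signal may be usable for swing trades when combined with risk management.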
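For reference, the regression metrics in the table can be computed along these lines. The sketch assumes that directional accuracy means the fraction of samples where the predicted and realized log-returns have the same sign; the README does not spell out the definition, and the function name and sample values are hypothetical.

```python
import numpy as np


def regression_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """RMSE, R^2, and sign agreement between predicted and realized log-returns."""
    residual = y_true - y_pred
    rmse = float(np.sqrt(np.mean(residual ** 2)))
    # R^2 compares the model against always predicting the mean of y_true
    r2 = float(1 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    directional_acc = float(np.mean(np.sign(y_true) == np.sign(y_pred)))
    return {"rmse": rmse, "r2": r2, "directional_acc": directional_acc}


# Made-up values, for illustration only
y_true = np.array([0.02, -0.01, 0.03, -0.02])
y_pred = np.array([0.01, 0.01, 0.01, -0.01])
metrics = regression_metrics(y_true, y_pred)
```

R² turns negative whenever the squared error of the predictions exceeds that of a constant mean forecast, so a model can get the sign right more often than not and still score at or below zero on R².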

Per-Ticker Breakdown

Ticker  Directional Acc  Classification Acc  RMSE
SPY     0.614            0.488               0.0226
QQQ     0.564            0.506               0.0331
AAPL    0.569            0.482               0.0306
MSFT    0.589            0.490               0.0334
TSLA    0.543            0.512               0.0804
NVDA    0.543            0.476               0.0497
AMD     0.585            0.494               0.0544
META    0.601            0.482               0.0414
JPM     0.569            0.507               0.0304
XOM     0.568            0.506               0.0307

Files in this Repository

summary.json              # Complete training metrics & feature list
TICKER_reg_5d.json        # XGBoost regression model (log-return target)
TICKER_clf_5d.json        # XGBoost classification model (direction target)
TICKER_reg_5d.pkl         # Pickled Booster (convenience)
TICKER_clf_5d.pkl         # Pickled Booster (convenience)
stock_predictor.py        # Training & feature engineering code
inference.py              # Simple inference script
README.md                 # This file

Usage

Quick inference (single ticker)

import numpy as np
import pandas as pd
import xgboost as xgb
import yfinance as yf
from huggingface_hub import hf_hub_download

# 1. Load model from Hub
model_path = hf_hub_download(
    repo_id="mohan170802/stock-price-predictor-xgboost",
    filename="SPY_reg_5d.json",
)
model = xgb.Booster()
model.load_model(model_path)

# 2. Fetch latest data & engineer features
# (see stock_predictor.py in source for full feature pipeline)
# This step must define the names used in step 3:
#   latest_features: the most recent feature row, shape (1, 66)
#   feature_names:   the 66 feature names, in training order
#   current_price:   the price the prediction is applied to (e.g. the latest close)
# ...

# 3. Predict the 5-day log-return and convert it to a price
X = xgb.DMatrix(latest_features, feature_names=feature_names)
predicted_log_return = model.predict(X)[0]
predicted_price = current_price * np.exp(predicted_log_return)

Batch inference with the inference script

git clone https://huggingface.co/mohan170802/stock-price-predictor-xgboost
cd stock-price-predictor-xgboost
python inference.py --ticker SPY

Trading Notes

  • This is NOT financial advice. Models have ~57% directional accuracy — use strict stop-losses and position sizing.
  • Best used as one signal among many (combine with macro analysis, earnings calendar, options flow, etc.).
  • Retrain monthly as market regimes evolve; the expanding-window validation mimics this by refitting on a growing history at each fold.
  • Regression model output is log-return; convert with price * exp(pred).
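As a worked example of the log-return conversion, the sketch below turns a predicted log-return into a price target and a simple percentage move. The helper names and the input values (a price of 500 and a prediction of 0.01) are hypothetical.

```python
import math


def log_return_to_price(current_price: float, log_return: float) -> float:
    """Apply a predicted log-return to the current price."""
    return current_price * math.exp(log_return)


def log_return_to_pct(log_return: float) -> float:
    """Convert a log-return to a simple percentage return."""
    return math.expm1(log_return) * 100.0


# Hypothetical price and prediction, for illustration only
target_price = log_return_to_price(500.0, 0.01)
pct_move = log_return_to_pct(0.01)
```

For small values the log-return and the simple return nearly coincide: a log-return of 0.01 corresponds to a simple return of about 1.005%.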

Training Details

  • Data source: Yahoo Finance via yfinance
  • Feature count: 66 (no leakage, all features computed from strictly past data)
  • Validation: Expanding-window walk-forward (5 folds)
  • Hyperparameters: shallow trees (max_depth=5), L1/L2 regularization (α=0.5, λ=1.0), learning_rate=0.03
  • Hardware: CPU (no GPU required)
  • Training time: ~5 minutes for all 10 tickers

License

MIT — use at your own risk for research & educational purposes.
