Model Methodology
The model predicts fair market capitalization from financial statement data using machine learning. It learns historical relationships between company fundamentals (revenue, profits, debt, cash flows) and market valuations across thousands of stocks.
Mispricing signals are relative, not absolute. A stock showing 20% mispricing means the model's predicted fair value, based on fundamentals alone, is 20% below the current market cap. This indicates overvaluation: investors are paying more than fundamentals suggest, which could reflect growth expectations, brand value, or other intangibles not captured by financial statements.
A separate model is trained for each quarter, so companies are only compared with others in the same quarter. This means:
Algorithm: XGBoost (Gradient Boosted Decision Trees)
Target: log(market_cap) (log-transformed for numerical stability)
n_estimators: 200
max_depth: 5
learning_rate: 0.1
subsample: 0.8
colsample_bytree: 0.8
objective: reg:absoluteerror
Fixed parameters ensure consistency across quarters. No hyperparameter tuning is performed.
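The fixed configuration can be written down as a small sketch. The dictionary holds the values listed above, and the two helper functions (hypothetical names, not taken from the tool) show the log transform of the target and its inverse. The sketch stays on the standard library, so no model is actually fitted here.

```python
import math

# Hyperparameters as listed above; held constant for every quarter.
XGB_PARAMS = {
    "n_estimators": 200,
    "max_depth": 5,
    "learning_rate": 0.1,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
    "objective": "reg:absoluteerror",
}


def to_target(market_cap: float) -> float:
    """Map a market cap in currency units to the log-space training target."""
    return math.log(market_cap)


def from_target(log_market_cap: float) -> float:
    """Map a log-space prediction back to currency units."""
    return math.exp(log_market_cap)


# With xgboost installed, the dictionary could be passed as
# xgboost.XGBRegressor(**XGB_PARAMS); that call is left out so the sketch
# runs without third-party packages.
```

Because the objective is absolute error on the log target, the model is penalized for proportional rather than dollar-sized misses, which keeps mega-caps from dominating the fit.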
The model uses repeated K-fold cross-validation to generate prediction distributions. This approach prevents data leakage and provides uncertainty estimates.
CV Repeats: 10
Folds per Repeat: 5
Predictions per Stock: 50
Each row shows one fold. Blue = training data, Red = test data (held-out).
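The splitting scheme can be illustrated with a minimal standard-library sketch. The function name, the universe size, and the seed are assumptions for illustration. With 10 repeats of 5 folds the sketch produces 50 train/test splits, and each stock lands in the held-out fold once per repeat. How the tool assembles its per-stock prediction distribution from those fits is not spelled out here, so the sketch only counts splits and held-out appearances.

```python
import random
from collections import Counter


def repeated_kfold(n_items, n_folds=5, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated K-fold CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_items))
        rng.shuffle(order)  # a fresh shuffle per repeat changes fold membership
        for fold in range(n_folds):
            test_idx = order[fold::n_folds]  # every n_folds-th item forms one fold
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield repeat, fold, train_idx, test_idx


n_stocks = 23  # hypothetical universe size
splits = list(repeated_kfold(n_stocks))
held_out_counts = Counter(i for _, _, _, test in splits for i in test)

print(len(splits))                    # 50 train/test splits
print(set(held_out_counts.values()))  # {10}: each stock is held out once per repeat
```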
Features are extracted from quarterly financial statements. The model uses a combination of raw fundamentals and financial ratios.
| Feature | Category | Transform | Fill Strategy |
|---|---|---|---|
| Total Revenue | Fundamentals | log1p | Required |
| Gross Profit | Fundamentals | log1p | Zero |
| EBITDA | Fundamentals | log1p | Median |
| Net Income | Fundamentals | - | Zero |
| Total Debt | Balance Sheet | log1p | Zero |
| Total Cash | Balance Sheet | log1p | Zero |
| Free Cash Flow | Cash Flow | - | Zero |
| Profit Margin | Ratio | - | Median |
| Debt-to-Equity | Ratio | log | Median |
| ROE / ROA | Ratio | - | Median |
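As a rough sketch of how the transform and fill columns might combine, the example below prepares three of the features from the table. Several details are assumptions rather than documented behavior: that "Required" means snapshots missing the feature are dropped, that filling happens before the transform, that the median is taken within the batch being prepared, and that log1p inputs are non-negative. The field names and the `prepare` function are hypothetical.

```python
import math
from statistics import median

# name: (transform, fill strategy), taken from three rows of the table
FEATURES = {
    "total_revenue": ("log1p", "required"),
    "gross_profit": ("log1p", "zero"),
    "profit_margin": (None, "median"),
}


def prepare(rows):
    """Drop rows missing a required feature, then fill and transform the rest."""
    required = [name for name, (_, fill) in FEATURES.items() if fill == "required"]
    kept = [r for r in rows if all(r.get(name) is not None for name in required)]
    prepared = [{} for _ in kept]
    for name, (transform, strategy) in FEATURES.items():
        observed = [r[name] for r in kept if r.get(name) is not None]
        # Assumption: the median is computed over the rows that survived the drop.
        fill_value = median(observed) if strategy == "median" else 0.0
        for src, dst in zip(kept, prepared):
            value = src.get(name)
            if value is None:
                value = fill_value
            dst[name] = math.log1p(value) if transform == "log1p" else value
    return prepared


rows = [  # hypothetical snapshots
    {"total_revenue": 1000.0, "gross_profit": 400.0, "profit_margin": 0.10},
    {"total_revenue": 500.0, "gross_profit": None, "profit_margin": None},
    {"total_revenue": None, "gross_profit": 50.0, "profit_margin": 0.30},
    {"total_revenue": 250.0, "gross_profit": 100.0, "profit_margin": 0.20},
]
prepared = prepare(rows)
```

In this example the third snapshot is dropped for lacking revenue, and the second receives a zero gross profit and the median profit margin of the remaining rows.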
Feature availability across ~32,000 quarterly snapshots:
mispricing = (actual_mcap - predicted_mcap) / actual_mcap
Positive mispricing: current market cap exceeds the model's predicted fair value. Suggests potential overvaluation, with investors paying beyond fundamentals.
Negative mispricing: current market cap is below the model's predicted fair value. Suggests potential undervaluation based on fundamentals.
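A short worked example makes the sign and the denominator concrete. The function mirrors the formula above; the market caps are hypothetical figures in billions.

```python
def mispricing(actual_mcap: float, predicted_mcap: float) -> float:
    """Return (actual - predicted) / actual, as in the formula above."""
    if actual_mcap <= 0:
        raise ValueError("actual_mcap must be positive")
    return (actual_mcap - predicted_mcap) / actual_mcap


over = mispricing(10.0, 8.0)    # 0.2: market cap sits above the predicted value
under = mispricing(10.0, 12.5)  # -0.25: market cap sits below the predicted value
```

Because the denominator is the actual market cap, a reading of 0.2 means the predicted value is 80% of the market cap, which is the same as the market cap being 25% above the prediction.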
Raw mispricing exhibits a systematic size effect: smaller companies tend to show positive mispricing while larger companies show negative mispricing. This reflects the historical "size premium" where smaller companies trade at higher multiples.
size_neutral_mispricing = raw_mispricing - size_premium(market_cap)
The size premium is estimated by fitting a smooth curve (spline or polynomial) to the mispricing vs. market cap relationship. This correction isolates stock-specific mispricing from the systematic size effect.
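The correction can be sketched with the simplest polynomial, a straight line. This is an illustration under stated assumptions, not the tool's actual curve: the fit is done against log market cap, it uses degree one rather than a spline or higher-order polynomial, and the function names and data are hypothetical.

```python
import math
from statistics import fmean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def size_neutralize(market_caps, raw_mispricing):
    """Subtract the fitted size premium from each raw mispricing value."""
    log_caps = [math.log(m) for m in market_caps]  # assumption: fit in log space
    slope, intercept = fit_line(log_caps, raw_mispricing)
    premium = [slope * x + intercept for x in log_caps]
    return [raw - p for raw, p in zip(raw_mispricing, premium)]


market_caps = [2e8, 1e9, 5e9, 2e10, 1e11]  # hypothetical, small to large
raw = [0.30, 0.15, 0.05, -0.10, -0.20]     # positive for small, negative for large
neutral = size_neutralize(market_caps, raw)
```

After the subtraction the residuals average to zero across the cross-section, so what remains is each stock's deviation from what its size alone would predict.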
relative_std = prediction_std / actual_mcap
Higher relative standard deviation indicates less confident predictions. Stocks with unusual financial profiles or sparse comparable data will have higher uncertainty.
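As a minimal sketch of the uncertainty measure, the function below divides the spread of a stock's predictions by its actual market cap. It assumes the predictions have already been converted back to currency units and uses the sample standard deviation; whether the tool uses the sample or population form is not stated. The figures are hypothetical, in billions.

```python
from statistics import stdev


def relative_std(predicted_mcaps, actual_mcap):
    """Sample standard deviation of the predictions, scaled by actual market cap."""
    return stdev(predicted_mcaps) / actual_mcap


tight = relative_std([9.0, 10.0, 11.0], 10.0)  # predictions cluster together
wide = relative_std([5.0, 10.0, 15.0], 10.0)   # predictions disagree strongly
```

The second stock has five times the relative spread of the first, so its mispricing estimate deserves less weight.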
Backtest results measure whether historical mispricing signals predicted future price movements.
IC = correlation(mispricing_signal, future_return)
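Because positive mispricing suggests overvaluation, a working signal should be followed by lower returns, which would make the raw correlation negative. On the dashboard, IC is displayed such that positive = good signal (mispricing predicted subsequent returns correctly).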
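One way to read the display convention is a simple sign flip of the raw correlation, sketched below. Both the flip and the choice of Pearson correlation are assumptions; a rank correlation is a common alternative for IC, and the source only says "correlation". The data is hypothetical.

```python
import math
from statistics import fmean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def displayed_ic(mispricing_signal, future_return):
    """Assumed convention: negate so overvalued-then-fell scores positive."""
    return -pearson(mispricing_signal, future_return)


signal = [0.3, 0.1, -0.1, -0.3]       # hypothetical mispricing values
returns = [-0.04, 0.01, 0.00, 0.05]   # hypothetical forward returns
ic = displayed_ic(signal, returns)    # positive: overvalued names lagged
```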
hit_rate = % of stocks where mispricing direction matched return direction
A hit rate above 50% indicates the signal has some directional predictive power. However, magnitude of returns matters more than hit rate for portfolio construction.
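The hit rate can be sketched as a simple count. The definition of a hit used here is an assumption: an overvalued stock (positive mispricing) that subsequently fell, or an undervalued stock that subsequently rose. Observations with a zero signal or zero return are skipped, which is also an assumption.

```python
def hit_rate(mispricing_signal, future_return):
    """Share of stocks whose return moved the way the signal implied."""
    hits = 0
    counted = 0
    for m, r in zip(mispricing_signal, future_return):
        if m == 0 or r == 0:
            continue  # no direction to compare
        counted += 1
        if (m > 0) != (r > 0):  # overvalued fell, or undervalued rose
            hits += 1
    if counted == 0:
        raise ValueError("no observations with a defined direction")
    return hits / counted


signal = [0.3, 0.1, -0.1, -0.3]      # hypothetical mispricing values
returns = [-0.04, 0.01, 0.02, 0.05]  # hypothetical forward returns
rate = hit_rate(signal, returns)     # 0.75: three of four directions were right
```

The count ignores how large each return was, which is why a high hit rate alone says little about the profit a portfolio built on the signal would have made.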
P-values are corrected using the Benjamini-Hochberg procedure to control the false discovery rate when testing multiple hypotheses (horizons × sectors/indices).
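For readers unfamiliar with the procedure, here is a compact standard-library sketch of the Benjamini-Hochberg adjustment. The function name and the input p-values are hypothetical; the tool's own implementation may differ in detail.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw_p = [0.01, 0.04, 0.03, 0.20]  # hypothetical p-values, one per test
adj_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adj_p]
```

In this example only the smallest p-value survives at a 5% false discovery rate. The two middle values are adjusted to about 0.053, even though both were below 0.05 before correction.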
Backtests are run across multiple forward-looking horizons (e.g., 5, 10, 21, 63, 126 trading days) to understand signal persistence and decay. Shorter horizons capture momentum effects while longer horizons reflect fundamental mean reversion.
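The horizons translate into forward returns as in the sketch below, which assumes a list of daily closing prices indexed by trading day. The function name and the price series are hypothetical, and the horizon tuple simply repeats the example values above.

```python
HORIZONS = (5, 10, 21, 63, 126)  # trading days, from the examples above


def forward_return(prices, start, horizon):
    """Simple return from day `start` to `start + horizon`, or None if unavailable."""
    end = start + horizon
    if end >= len(prices):
        return None  # not enough future data for this horizon
    return prices[end] / prices[start] - 1.0


prices = [100.0 + day for day in range(130)]  # hypothetical steadily rising series
by_horizon = {h: forward_return(prices, 0, h) for h in HORIZONS}
too_late = forward_return(prices, 10, 126)    # None: the series ends first
```

Signals issued near the end of the sample have no 126-day return yet, so longer horizons are evaluated on fewer observations than shorter ones.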
This tool is for research and educational purposes only. The mispricing signals should not be used as the sole basis for investment decisions. Always consult with a qualified financial advisor and conduct your own due diligence.