Start with a baseline
If the signal cannot beat a simple baseline out of sample, stop there. A ridge regression or a plain linear regression gives you a clean reference point. If the fancy model wins only in-sample, or by a margin smaller than the fold-to-fold noise, it has not found anything the baseline did not already capture.
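
A minimal sketch of that comparison, using only the standard library. It assumes a single feature, fits ridge in closed form, and treats "beats the baseline" as a relative cut in out-of-sample mean squared error. The function names, the `alpha` penalty, and the 5% `min_gain` margin are illustrative assumptions, not recommended values.

```python
from statistics import fmean


def fit_ridge_1d(x, y, alpha=1.0):
    """Closed-form ridge fit for one feature: returns (slope, intercept).

    The slope is shrunk toward zero by alpha; the intercept is not penalized.
    """
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / (sxx + alpha)
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def beats_baseline(candidate_mse, baseline_mse, min_gain=0.05):
    """True only if the candidate cuts error by at least min_gain (relative)."""
    return candidate_mse <= baseline_mse * (1.0 - min_gain)
```

Both error figures should come from the same held-out period. A candidate that clears the bar in-sample but not there has not beaten anything.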
Check chronology
Use walk-forward validation or at least a chronological split. Random shuffles let the model train on data from after the test period, so the score looks better than it would live. If performance collapses as time moves forward, the signal was probably keyed to one regime.
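
One way to sketch a walk-forward split, assuming the rows are already sorted by time. The train window expands and every test block sits strictly after its training data. `walk_forward_splits` and its parameters are hypothetical names for illustration.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) as ranges over time-ordered rows.

    The train window always starts at row 0 and grows by one test block
    per fold, so no test row ever precedes a training row.
    """
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        test_end = train_end + test_size
        yield range(0, train_end), range(train_end, test_end)


if __name__ == "__main__":
    for train_idx, test_idx in walk_forward_splits(10, n_folds=3, min_train=4):
        print(list(train_idx), "->", list(test_idx))
```

If scikit-learn is available, `sklearn.model_selection.TimeSeriesSplit` provides a similar expanding-window split. Score each fold separately and look at the sequence: a score that decays fold by fold is the regime problem showing up.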
Read the score, not just the average
A useful signal should rank better opportunities higher, not just print one good summary number. Check the score distribution, the hit rate in the top buckets, and whether the result stays stable across folds.
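
A rough sketch of the bucket check. It assumes a hit means a strictly positive realized return and that higher scores are meant to be better. The function name and the choice of five buckets are illustrative, not a standard.

```python
def hit_rate_by_bucket(scores, outcomes, n_buckets=5):
    """Return the hit rate per score bucket, highest-scoring bucket first.

    Observations are sorted by score (descending) and cut into n_buckets
    groups of equal size; any remainder goes into the last bucket.
    """
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    size = len(ranked) // n_buckets
    if size == 0:
        raise ValueError("fewer observations than buckets")
    rates = []
    for b in range(n_buckets):
        start = b * size
        end = (b + 1) * size if b < n_buckets - 1 else len(ranked)
        chunk = ranked[start:end]
        hits = sum(1 for _, outcome in chunk if outcome > 0)
        rates.append(hits / len(chunk))
    return rates
```

Run it once per validation fold. A signal that ranks well should show hit rates that fall roughly from the first bucket to the last, and the pattern should hold in most folds rather than in one.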
Then test the trade rule
A score becomes a signal only when it crosses a threshold and turns into a trade. That cutoff should be tied to the trade frequency you actually want, then checked again after costs and slippage are applied.
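
A minimal sketch of both steps, under loud assumptions: the cutoff is picked so that a target fraction of observations trade, and friction is modeled as one flat cost per trade covering fees and slippage. Real cost models are usually richer. All names here are hypothetical.

```python
def threshold_for_frequency(scores, target_fraction):
    """Return the score cutoff at which about target_fraction of rows trade.

    Tied scores at the cutoff will all trade, so the realized frequency
    can come out slightly above the target.
    """
    ranked = sorted(scores, reverse=True)
    n_trades = max(1, round(len(ranked) * target_fraction))
    return ranked[n_trades - 1]


def net_return(scores, gross_returns, threshold, cost_per_trade):
    """Return (mean net return per trade, number of trades).

    A trade is taken whenever score >= threshold; cost_per_trade is a flat
    deduction standing in for fees and slippage.
    """
    traded = [r for s, r in zip(scores, gross_returns) if s >= threshold]
    if not traded:
        return 0.0, 0
    net = [r - cost_per_trade for r in traded]
    return sum(net) / len(net), len(traded)
```

Pick the cutoff on training scores and apply it unchanged to the test period. Then rerun `net_return` with a few different cost assumptions: an edge that disappears under a modest increase in cost was never much of an edge.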
The blunt rule
If the signal only works when the split, threshold, or cost model is flattering it, the signal is not ready.
Common mistakes
- Judging only the headline score. That hides weak rank order and fragile thresholds.
- Skipping the baseline. If a simpler model gets the same result, use the simpler model.
- Ignoring friction. Gross performance is not tradable performance.