Feature Engineering for OHLCV Trading Models

Raw OHLCV bars flowing into rolling features, a model score, and an activation threshold before trades are taken. — Features come first. The threshold and the trade rule only matter after the inputs stop cheating.

Start with simple bars

OHLCV gives you enough to be dangerous. The useful inputs are usually not exotic: returns, momentum, realized volatility, volume z-scores, range measures, and calendar effects. They are boring because they are real.

What usually works

Returns and momentum. Short-horizon drift and persistence.
Volatility and range. How violent the bar has been lately.
Volume features. Whether participation is unusual.
Calendar encodings. Intraday, day-of-year, or week-of-year structure.

Keep leakage out

The easiest way to cheat is to let a rolling statistic peek into the future. Every feature should be computable from data that existed before the trade decision. If you cannot explain the lag in one sentence, it is probably too loose.

Winsorization and standardization are fine when they are applied only on the training window. Apply them to the full series and you are already leaning on information you should not have.

Threshold calibration comes after scoring

The feature set gives the model a score. The threshold decides whether the score is worth trading. That cutoff should be calibrated to the trade frequency you actually want, then rechecked as the score distribution shifts in walk-forward runs.

The blunt rule

If a feature only works because it was built with hindsight, throw it away. The market will do that for you later.

A small worked set

A practical feature set can be as plain as 5-bar and 20-bar returns, 20-bar realized volatility, a rolling volume z-score, and a simple overnight gap flag. That is usually enough to test whether the model has something real to learn before you start adding more moving parts.

If a simple set already captures the pattern, adding ten more features is usually a distraction, not progress.

Common mistakes

Looking forward by accident. Any rolling statistic must stop at the trade decision.
Standardizing on the full sample. That leaks later data into the past.
Adding features before checking the label. A bad target makes good features look useless.