periods (notice that the scale of the y axis, which is measuring profitability, is 100 times larger in
Figure 6(b) than in Figure 6(a)). Ideally, we would like to find some compromise horizon, which is
long enough to allow prices to evolve sufficiently in order to beat the spread, but short enough for
microstructure features to be informative of directional movements.
In order to reduce the influence of any long-term directional price drift, we can adjust the learning
algorithm to account for it. Instead of evaluating the total profit per share or return from a buy action
in a given state, we monitor the relative profitability of buying in that state versus buying in every
possible state. For example, suppose a buy action in state s yields 0.03 cents per trade on average.
While that number is positive, suppose that always buying (that is, in every state) generates 0.07 cents
per trade on average (presumably because the price went up over the course of the entire period), which
makes state s relatively less advantageous for buying. We would then assign −0.04 = 0.03 − 0.07
as the value of buying in that state. Conversely, there may be a state-action pair that has negative
payout associated with it over the training period, but if this action is even more unprofitable when
averaged across all states (again, presumably due to a long-term price drift), this state-action pair would be
assigned a positive value. Conceptually, such “adjustment for average value” allows us to filter out the
price trend and home in on microstructure aspects, which also makes learned policies perform more
robustly out-of-sample. Empirically, the resulting learned policies recover the desired symmetry, where
if one extremal state learns to buy, the opposite extremal state learns to sell – notice the transformation
from Figure 6(c) to Figure 6(d), where we once again witness mean reversion.
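
As a minimal sketch of this adjustment (not the authors' implementation), assume training experience is available as (state, profit) pairs for the buy action, with profit in cents per trade. The function name `adjusted_buy_values`, the input format, and the state labels are hypothetical. The baseline is the average profit over all trades, which is what buying in every state would have earned.

```python
from collections import defaultdict


def adjusted_buy_values(observations):
    """Return per-state average buy profit minus the always-buy average.

    observations: iterable of (state, profit) pairs, one per buy trade
    (hypothetical input format; profit in cents per trade).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, profit in observations:
        totals[state] += profit
        counts[state] += 1

    n_trades = sum(counts.values())
    if n_trades == 0:
        return {}

    # Average profit of buying in every state: this captures the price drift.
    baseline = sum(totals.values()) / n_trades

    # Value of buying in a state, relative to always buying.
    return {s: totals[s] / counts[s] - baseline for s in totals}


# Toy data: state "s" earns 0.03 on average, while buying everywhere earns 0.07.
toy = [("s", 0.03), ("other", 0.11)]
values = adjusted_buy_values(toy)
```

With this toy data the value for state "s" comes out at about 0.03 − 0.07 = −0.04, matching the example in the text, while the other state receives about +0.04.
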
While we clearly see patterns in the short-term price formation process and are demonstrably
successful in identifying state variables that help predict future returns, profiting from this predictability
is far from trivial. It should be clear from the figures in this section that the magnitude of our
predictions is in fractions of a penny, whereas the tightest spread in liquid US stocks is one cent.
So in no way should the results be interpreted as a recipe for profitability: even if all the features
we enumerate here are true predictors of future returns, and even if all of them line up just right for
maximum profit margins, one still cannot justify trading aggressively and paying the bid-ask spread,
since the magnitude of predictability is not sufficient to cover transaction costs. 12
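
The arithmetic can be made concrete with a small sketch. It assumes, as a simplification not stated in the text, that predicted profits are measured against the mid-price and that an aggressive round trip (entering and exiting with market orders) costs one full spread. The function name and the per-state figures are illustrative only.

```python
def net_profit_cents(predicted_cents, spread_cents=1.0):
    """Predicted profit per share after paying the bid-ask spread.

    Assumption: a round trip with aggressive orders costs one full spread.
    """
    return predicted_cents - spread_cents


# Hypothetical predicted profits per share, in cents, for a few states.
predicted = {"state_a": 0.03, "state_b": 0.07, "state_c": -0.04}

net = {state: net_profit_cents(p) for state, p in predicted.items()}
tradable = [state for state, value in net.items() if value > 0]
```

Under these assumptions every entry of `net` is negative and `tradable` is empty: a prediction would have to exceed the one-cent spread before aggressive trading could pay.
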
So what can be done? We see essentially three possibilities. First, as we have suggested earlier,
we could hold our positions longer, so that price changes are larger than spreads, giving us higher
margins. However, as we have seen, the longer the holding period, the less directly informative
market microstructure aspects seem to become, thus making prediction more difficult. Second, we
could trade with limit orders, hoping to avoid paying the spread. This is definitely a fruitful direction,
where one jointly estimates future returns and the probability of getting filled, which then must be
weighed against adverse selection (probability of executing only when predictions turn out to be
wrong). This is a much harder problem, well outside of the scope of this article. And finally, a
third option is to find or design better features that will bring about greater predictability, sufficient to
overcome transaction costs.
It should be clear now that the overarching theme of these suggested directions is that the machine
learning approach offers no easy paths to profitability. Markets are competitive, and finding sources
of true profitability is extremely difficult. That being said, what we have covered in this section is
a framework for how to look for sources of potential profits in a principled way – by defining state
spaces, examining potential features and their interplay, using training-test set methodology, imposing
sensible value functions, etc. – that should be a part of the arsenal of a quantitative professional, so
that we can at least discuss these problems in a common language.