Why is Machine Learning in finance so hard?

Information presented here is for educational purposes only and does not intend to make an offer or solicitation for the sale or purchase of any specific securities, investments, or investment strategies. Investments involve risk and there are no guarantees of any kind. Be sure to first consult with a qualified financial adviser and/or tax professional before implementing any strategy discussed herein. Past performance is not indicative of future performance.

Hardik

Changing Data Distributions

What does a typical quant pipeline look like?

  • Start with a bunch of alphas
  • Go long on the top 10
  • Go short on the bottom 10
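The three steps above can be sketched in a few lines. This is only an illustration: it assumes the alphas have already been combined into one score per asset, and the tickers, scores, and equal weighting are made up for the example rather than taken from the talk.

```python
def long_short_weights(scores, n=10):
    """Equal-weight long the n highest-scoring assets and short the n lowest."""
    if len(scores) < 2 * n:
        raise ValueError("need at least 2 * n assets")
    ranked = sorted(scores, key=scores.get)  # asset names, ascending by score
    weights = {name: 0.0 for name in scores}
    for name in ranked[-n:]:   # top n scores: long
        weights[name] = 1.0 / n
    for name in ranked[:n]:    # bottom n scores: short
        weights[name] = -1.0 / n
    return weights

# Hypothetical combined alpha scores for 30 made-up tickers (all distinct).
scores = {f"ASSET{i:02d}": (i * 7 % 30) / 30.0 for i in range(30)}
weights = long_short_weights(scores, n=10)

longs = [name for name, w in weights.items() if w > 0]
shorts = [name for name, w in weights.items() if w < 0]
print(len(longs), "longs,", len(shorts), "shorts")
```

The resulting book is dollar-neutral by construction: the long and short weights sum to zero.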

But, how does Machine Learning come into the picture?

CIFAR10 is an image classification dataset

"Understanding deep learning requires rethinking generalization" paper from Google Brain

Is there a way to solve this generalization problem?

  • Walk-forward Optimization
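A minimal sketch of how walk-forward splits might be generated, assuming a fixed-length rolling training window that slides forward by one test window at a time. The function name and window sizes are hypothetical; an expanding window is an equally common variant.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs for rolling walk-forward windows.

    Each model is fit only on data that precedes its test window, so no
    future information leaks into training.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # slide forward by one test window

splits = list(walk_forward_splits(n_samples=10, train_size=4, test_size=2))
for train, test in splits:
    print("train", list(train), "-> test", list(test))
```

Refitting on each new window is what lets the model track a distribution that keeps changing.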

Low Predictive Power

  • Low accuracies compared to other domains.
  • The partially observable (POMDP) nature of the problem puts an implicit limit on what can be predicted.

So, what can we do here?

  • Recognize that this is a different domain and adjust your expectations.
  • While general prediction problems like return prediction are very hard, niche problems are easier to deal with.
  • Focus on detecting regimes.

Low signal-to-noise ratio

If you see a pattern in the dataset, it's more likely to be noise than signal
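One way to see why: search enough candidate signals and some will correlate with returns by chance alone. The sketch below uses purely synthetic Gaussian noise for both the "returns" and the "signals", so any correlation it finds is spurious by construction. The sample sizes and the seed are arbitrary choices for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
n_days, n_signals = 50, 200

# Both the returns and every candidate signal are independent noise.
returns = [rng.gauss(0, 1) for _ in range(n_days)]
candidates = [[rng.gauss(0, 1) for _ in range(n_days)] for _ in range(n_signals)]

best_abs_corr = max(abs(pearson(sig, returns)) for sig in candidates)
print(f"best |correlation| among {n_signals} pure-noise signals: {best_abs_corr:.2f}")
```

The best of the 200 candidates looks like a usable signal in-sample even though none of them carries any information.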

Model interpretability

  • There is almost always a chance of overfitting if you aren't able to answer tough interpretability questions.
  • The challenge is to find the source of the model signal.

Model accuracy is not correlated with the utility function

Model accuracy improvement might not lead to better portfolio returns

  • Evidence for this can be found at all scales in financial markets: from high-frequency trading to long-term investing.
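A toy illustration of the gap between accuracy and returns, using made-up numbers: model A predicts the direction of four small moves correctly but misses the one large move, while model B is right less often but catches the large move. The returns and predictions are hypothetical and chosen only to make the effect visible.

```python
def accuracy(preds, returns):
    """Fraction of periods where the predicted sign matches the realized sign."""
    hits = sum(1 for p, r in zip(preds, returns) if p * r > 0)
    return hits / len(returns)

def pnl(preds, returns):
    """Return from holding one unit in the predicted direction each period."""
    return sum(p * r for p, r in zip(preds, returns))

# Hypothetical per-period returns: four small moves and one large drop.
returns = [0.01, -0.01, 0.01, -0.01, -0.10]

model_a = [1, -1, 1, -1, 1]    # right on the small moves, wrong on the big one
model_b = [-1, 1, 1, -1, -1]   # wrong twice, but right on the big one

for name, preds in [("A", model_a), ("B", model_b)]:
    print(name, "accuracy:", accuracy(preds, returns),
          "pnl:", round(pnl(preds, returns), 4))
```

Accuracy weights every period equally, whereas the portfolio cares about the size of each move, which is why the more accurate model can still lose money.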

Wait, why can't we directly optimize the utility function?

  1. Write down the utility function in terms of the raw data.
  2. Use quadratic solvers or gradient descent to find the weights.
  • This is a hard problem, often leading to simplification of the utility function.
  • The optimization process is unstable and unreliable.
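As a rough sketch of the two steps above, the code below maximizes a mean-variance utility U(w) = w·mu - (lambda/2) w'Σw by plain gradient ascent. The mean-variance form is itself one of the simplifications mentioned above and is an assumption of this example, not the talk's utility function. The expected returns, covariance matrix, risk aversion, and step size are all made-up values.

```python
def utility(w, mu, cov, risk_aversion):
    """Mean-variance utility: expected return minus a variance penalty."""
    n = len(w)
    ret = sum(w[i] * mu[i] for i in range(n))
    var = sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))
    return ret - 0.5 * risk_aversion * var

def gradient(w, mu, cov, risk_aversion):
    """dU/dw = mu - risk_aversion * cov @ w (cov is assumed symmetric)."""
    n = len(w)
    return [mu[i] - risk_aversion * sum(cov[i][j] * w[j] for j in range(n))
            for i in range(n)]

def maximize(mu, cov, risk_aversion, lr=5.0, steps=500):
    """Plain gradient ascent starting from zero weights."""
    w = [0.0] * len(mu)
    for _ in range(steps):
        g = gradient(w, mu, cov, risk_aversion)
        w = [wi + lr * gi for wi, gi in zip(w, g)]
    return w

# Hypothetical inputs for two assets.
mu = [0.02, 0.03]
cov = [[0.04, 0.01],
       [0.01, 0.09]]
risk_aversion = 2.0

w = maximize(mu, cov, risk_aversion)
print("weights:", [round(x, 4) for x in w],
      "utility:", round(utility(w, mu, cov, risk_aversion), 6))
```

The step size was picked by hand for these numbers, and too large a value would diverge. Even in this toy case the answer depends entirely on estimated means and covariances, which hints at why the process is fragile on real data.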

Reinforcement Learning is another option to directly optimize the utility function

  • Doesn't suffer from oversimplification issues.
  • RL state can be made sufficiently complex.
  • But the state doesn't have enough information to guide the agent in the right direction.
  • Lack of enough predictive information often causes the agents to wander in random directions.

Questions?

Challenges and Solutions: Machine Learning in Finance

By Hardik Patel
