Machine Learning Applications in Finance: the Why, How, and What Now

Zissis Poulos
 

Credit to everyone at FinHub

Does Finance benefit from ML?

Big surge in some sectors

1. Risk Modeling/Management

2. Portfolio Management

3. Algorithmic Trading

4. Loan/insurance underwriting

5. Derivatives pricing 

Slow adoption in others

1. Empirical asset pricing/forecasting

2. AML/CFT and fraud detection

3. Robo-advisors

A timeline

early 60s - late 90s: quantitative methods behind the veil of corporations

2011: The theory of Hard and Soft Information taken seriously

Role of language and communication in finance gains attention

Early 2000s: Hedge Funds wake up (AQR, Two Sigma, Renaissance), High-frequency Trading

 

2019-20: Risk-aware reinforcement learning for options hedging

2017-18: Pricing derivatives with deep NNs | "Deep Hedging" by JP Morgan and universities

2016+: Academia wakes up

1987/1998: Black Monday and the LTCM crash. 

"random fluctuations of the market viewed as random movements of particles"

linear/logistic regression, decision trees, basic NNs for forecasting

2020: Basel Committee places more focus on risk, ignites new research

2022+: Data-driven models of volatility | Market simulators | Advanced NLP for asset pricing

Market Simulators

It's not only about price movements

Restricted Boltzmann Machine

Conditional VAE

Temporal Convolutions

Conditional GAN

Is it any good??

Conditional VAE

Two-sample tests on distributions

TSNE/PCA

Two-side forecasting

These are just sanity checks!!
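
A minimal sketch of one such sanity check, a two-sample test on marginal return distributions. It assumes SciPy's Kolmogorov-Smirnov test as the test of choice; the "real" and "generated" samples are hypothetical synthetic stand-ins, not market data.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical stand-ins: heavy-tailed "real" returns vs. two candidate generators.
real = 0.01 * rng.standard_t(df=3, size=n)
good_fake = 0.01 * rng.standard_t(df=3, size=n)   # same distribution as "real"
bad_fake = rng.normal(0.0, 0.01, size=n)          # Gaussian: tails are too thin

def same_distribution_pvalue(a, b):
    """p-value of the two-sample KS test (H0: both samples share one distribution)."""
    return ks_2samp(a, b).pvalue

p_good = same_distribution_pvalue(real, good_fake)
p_bad = same_distribution_pvalue(real, bad_fake)
print(f"p-value, matching generator: {p_good:.3f}")
print(f"p-value, Gaussian generator: {p_bad:.2e}")
```

A high p-value only says the marginals are hard to tell apart; it says nothing about time dynamics, which is why this remains a sanity check.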

Stylized Facts

  1. Heavy tails (non-Gaussian)
  2. Volatility clustering
  3. The leverage effect
  4. No autocorrelation of returns
  5. Decreasing autocorrelation in absolute returns
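
A rough sketch of how some of these facts can be measured on a return series. The data come from a textbook GARCH(1,1) recursion with hypothetical parameters, used here only as a stand-in for real or generated returns. A symmetric GARCH does not produce the leverage effect, so fact 3 is not checked.

```python
import numpy as np

def simulate_garch(n, omega=1e-6, alpha=0.1, beta=0.85, seed=0):
    """Simulate GARCH(1,1) returns (hypothetical parameters, illustration only)."""
    rng = np.random.default_rng(seed)
    r = np.empty(n)
    var = omega / (1.0 - alpha - beta)          # start at the unconditional variance
    for t in range(n):
        r[t] = np.sqrt(var) * rng.standard_normal()
        var = omega + alpha * r[t] ** 2 + beta * var
    return r

def autocorr(x, lag):
    """Sample autocorrelation of x at a positive lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

def excess_kurtosis(x):
    """Sample kurtosis minus 3 (zero for a Gaussian)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)

returns = simulate_garch(20_000)
facts = {
    "excess_kurtosis": excess_kurtosis(returns),        # fact 1: heavy tails
    "acf_returns_lag1": autocorr(returns, 1),           # fact 4: close to zero
    "acf_abs_lag1": autocorr(np.abs(returns), 1),       # fact 2: clustering
    "acf_abs_lag50": autocorr(np.abs(returns), 50),     # fact 5: decays with lag
}
for name, value in facts.items():
    print(f"{name}: {value:.4f}")
```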

Did you really learn the DGP?? No...

Price movements are a small piece of the puzzle

The Implied Volatility Surface

European Options

Some key points

  1. Smile/skew/smirk
  2. Different structure per asset
  3. IVS time dynamics vary
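
Each point on the surface is the volatility that makes a pricing model match an observed option price. A minimal sketch, assuming the Black-Scholes model for a European call on a non-dividend-paying asset and SciPy's `brentq` root finder; the numbers in the usage lines are hypothetical.

```python
from math import erf, exp, log, sqrt
from scipy.optimize import brentq

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r):
    """Volatility at which the Black-Scholes price equals the observed price."""
    return brentq(lambda s: bs_call(S, K, T, r, s) - price, 1e-6, 5.0)

# Hypothetical quote: recover the volatility used to generate the price.
quote = bs_call(S=100.0, K=110.0, T=0.5, r=0.02, sigma=0.25)
iv = implied_vol(quote, S=100.0, K=110.0, T=0.5, r=0.02)
print(f"implied volatility: {iv:.4f}")
```

Repeating the inversion over a grid of strikes and maturities gives one snapshot of the surface.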

VAEs are good at modelling "static" IVS

Complete partially observed surfaces

 

Better than PCA/interpolation

 

Can generate entirely synthetic surfaces

 

Arbitrage-free

Maxime Bergeron, Nicholas Fung, John Hull, Zissis Poulos and Andreas Veneris, "Variational Autoencoders: A Hands-Off Approach to Volatility", The Journal of Financial Data Science, Spring 2022, 4 (2) 125-138.

Proof of Concept - Currency Pairs

What you need is Joint Models

Most recent work

Functional PC

Neural SDEs

Choudhary, Vedant and Jaimungal, Sebastian and Bergeron, Maxime, FuNVol: A Multi-Asset Implied Volatility Market Simulator using Functional Principal Components and Neural SDEs (March 3, 2023)

Open Questions

  1. Standardized measures of generated-data quality are required
  2. Data-driven (ML) vs. Parametric model comparison
    • Parametric models are also improving now* 
  3. Simulators have not been rigorously evaluated on downstream tasks
    • Portfolio optimization
    • Derivatives hedging
    • Algorithmic trading
  4. No good simulators for intra-day data

*Francois, Pascal and Galarneau-Vincent, Rémi and Gauthier, Genevieve and Godin, Frédéric, Joint Dynamics For The Underlying Asset and Its Implied Volatility Surface: A New Methodology For Option Risk Management (January 7, 2023). Available at SSRN: https://ssrn.com/abstract=4319972

Deep Hedging

What is hedging of options?

  • Traders are required to protect portfolios against unfavourable market movements

  • Sensitivity to market movements is measured using the "Greeks"

    • Delta: first derivative of option value wrt asset price

      • Sensitivity to small moves of underlying

    • Gamma: second derivative of option value wrt asset price

      • Sensitivity to wider range of moves

    • Vega: first derivative of option value wrt asset volatility

  • In complete markets with zero frictions and continuous hedging we have a toolbox in mathematical finance to find the OPTIMAL hedge.
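
As an illustration of the Greeks listed above, here is a sketch under the Black-Scholes model for a European call on a non-dividend-paying asset. The choice of model is an assumption; the slides do not fix one. The inputs in the usage lines are hypothetical.

```python
from math import erf, exp, log, pi, sqrt

def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def bs_d1(S, K, T, r, sigma):
    return (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = bs_d1(S, K, T, r, sigma)
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def bs_call_greeks(S, K, T, r, sigma):
    """Delta, gamma and vega of the call under Black-Scholes."""
    d1 = bs_d1(S, K, T, r, sigma)
    return {
        "delta": norm_cdf(d1),                          # dV/dS
        "gamma": norm_pdf(d1) / (S * sigma * sqrt(T)),  # d2V/dS2
        "vega": S * norm_pdf(d1) * sqrt(T),             # dV/dsigma
    }

greeks = bs_call_greeks(S=100.0, K=100.0, T=0.25, r=0.01, sigma=0.2)
print(greeks)
```

In this idealized setting, holding minus delta units of the underlying against a long call offsets small price moves; gamma and vega cannot be offset with the underlying alone and need positions in other options.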

 

REALITY: markets are incomplete, there are frictions (costs, price impact, illiquidity) and hedging is discrete

What do traders do then?

  • Partial hedging

  • Based on heuristics, intuition and experience

  • Calls for automation

 

BUT

  • Searching the space of partial hedge policies is intractable

    • Stochastic environment 

    • Decisions made on daily basis or more often

    • Every decision affects future decisions 

    • Hedging horizons can be long (30/60/90/120 days)

Sequential decision making in stochastic environments?

Standard RL objective:

"maximize expected future return" 

State: market variables + current position

Action: position to take in hedging instrument

Reward: Profit & Loss (P&L) - transaction costs
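
A toy sketch of one environment step under this formulation. Everything named here is hypothetical: the `HedgeState` container, the proportional `cost_rate`, and the assumption that the hedging instrument is the underlying and that the change in option book value is supplied by a simulator.

```python
from dataclasses import dataclass

@dataclass
class HedgeState:
    price: float      # market variable: price of the hedging instrument
    position: float   # current holding in the hedging instrument

def step(state, action, next_price, option_value_change, cost_rate=0.001):
    """Move to the target position `action` and return (next_state, reward).

    reward = P&L - transaction costs, where P&L is the gain on the hedge
    position plus the change in value of the option book over the step.
    """
    trade = action - state.position
    cost = cost_rate * abs(trade) * state.price              # proportional cost (assumed)
    pnl = action * (next_price - state.price) + option_value_change
    return HedgeState(price=next_price, position=action), pnl - cost

state = HedgeState(price=100.0, position=0.0)
next_state, reward = step(state, action=0.5, next_price=101.0, option_value_change=-0.6)
print(next_state, reward)
```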

Does this objective work in a finance setting?

NEVER!

Trading desks care about

  1. Mean-Variance
  2. Value-at-Risk (VaR)
  3. Conditional Value-at-Risk (CVaR)
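
A small sketch of how these three quantities can be estimated from a sample of P&L outcomes. The conventions are assumptions: losses are negative P&L, Value-at-Risk is the `alpha` quantile of the loss, Conditional Value-at-Risk is the mean loss at or beyond that quantile, and mean-variance is taken as mean minus `risk_aversion` times variance. Desks use several variants of each.

```python
import numpy as np

def risk_measures(pnl, alpha=0.95, risk_aversion=1.0):
    """Empirical mean-variance score, Value-at-Risk and Conditional Value-at-Risk."""
    pnl = np.asarray(pnl, dtype=float)
    loss = -pnl
    value_at_risk = float(np.quantile(loss, alpha))
    conditional_var = float(loss[loss >= value_at_risk].mean())
    mean_variance = float(pnl.mean() - risk_aversion * pnl.var())
    return {
        "mean_variance": mean_variance,
        "value_at_risk": value_at_risk,
        "conditional_value_at_risk": conditional_var,
    }

# Hypothetical P&L sample, standard normal for easy checking.
sample_pnl = np.random.default_rng(0).standard_normal(100_000)
measures = risk_measures(sample_pnl)
print(measures)
```

An expected-return objective looks only at the mean of this sample; the measures above depend on its spread and its left tail.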

Standard RL formulation does not work here

Distributional Reinforcement Learning

If you can estimate the future return distribution, you can take actions that optimize the risk-return tradeoff

Distributions and actions (policy) approximated via NNs

D4PG-QR
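
A minimal sketch of the quantile-regression idea, assuming the "QR" suffix refers to representing the return distribution by a set of quantiles fitted with the quantile (pinball) loss. This is the generic loss on a fixed sample, not the D4PG-QR algorithm itself; the sample, learning rate and step count are hypothetical.

```python
import numpy as np

def pinball_loss(quantiles, samples, taus):
    """Average quantile-regression loss of predicted quantiles against samples."""
    diff = samples[None, :] - quantiles[:, None]            # shape (n_taus, n_samples)
    t = taus[:, None]
    return float(np.mean(np.maximum(t * diff, (t - 1.0) * diff)))

def fit_quantiles(samples, taus, lr=0.05, steps=2000):
    """Fit one value per tau by subgradient descent on the pinball loss."""
    q = np.zeros_like(taus)
    for _ in range(steps):
        diff = samples[None, :] - q[:, None]
        # Subgradient w.r.t. q: -tau where the sample lies above q, (1 - tau) otherwise.
        grad = np.mean(np.where(diff > 0, -taus[:, None], 1.0 - taus[:, None]), axis=1)
        q -= lr * grad
    return q

taus = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
future_returns = np.random.default_rng(1).standard_normal(5000)   # hypothetical sample
fitted = fit_quantiles(future_returns, taus)
print(np.round(fitted, 3))
```

In an RL agent the fitted values would be outputs of a neural network conditioned on state and action, and a risk-aware objective could be computed from them, for example by averaging the lowest quantiles.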

Cao, Jay and Chen, Jacky and Farghadani, Soroush and Hull, John C. and Poulos, Zissis and Wang, Zeyu and Yuan, Jun, Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning, Frontiers in Artificial Intelligence, Sec. Artificial Intelligence in Finance, Volume 6 (2023)

Outperforming the basic automated strategies

Open Questions

  1. Robustness to model misspecification?
  2. Realistic synthetic data for training
    • Market simulators can probably help
  3. Different transaction cost models
  4. Dynamic risk measures and time-consistent policies
    • Active field of research
  5. The multi-agent case: market makers and market takers

ML and Soft Information

Hard vs. Soft Information in Finance

Hard Information

 

  • Price action (returns, volatility)
  • Company earnings
  • Company 10K filings
  • Analyst Forecasts
    • "Earnings Surprise"

Easy to quantify and externally verify

Soft Information

 

  • Company press release
  • Management communication
  • Earnings calls
  • News and social media
  • Etc.

Qualitative

Literature agrees: earnings surprise only explains 10% of variance in asset prices

What else drives price formation and how??

From static dictionaries to ML

Loughran & McDonald (2011)

 

"When is a liability not a liability?"

The LM Lexicon
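
A toy sketch of static-dictionary scoring, showing why a finance-specific lexicon matters: words such as "liability", "tax" and "cost" are flagged as negative by general-purpose dictionaries but are routine in financial text. Both word lists below are hypothetical miniatures for illustration, not the actual LM lexicon or any published dictionary.

```python
import re

# Hypothetical toy word lists, for illustration only.
GENERIC_NEGATIVE = {"liability", "loss", "decline", "tax", "cost"}
FINANCE_NEGATIVE = {"loss", "decline", "impairment", "litigation"}

def negative_share(text, lexicon):
    """Fraction of tokens in `text` that appear in `lexicon`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in lexicon for token in tokens) / len(tokens)

sentence = "The company recorded a tax liability and a deferred cost."
generic_score = negative_share(sentence, GENERIC_NEGATIVE)
finance_score = negative_share(sentence, FINANCE_NEGATIVE)
print(generic_score, finance_score)
```

The same neutral accounting sentence looks negative under the generic list and neutral under the finance-specific one.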

Garcia and Rohrer (2022)

 

"The colour of finance words"

Multinomial Inverse Regression

From basic ML to advanced NLP

...and beyond sentiment, to semantics

FinBERT

https://huggingface.co/ProsusAI/finbert

Things we achieved with such models

Sentiment

  • Explains short-term market reactions
  • Does not explain macro variables well
  • Depends on source
    • Sentiment of analysts matters more than sentiment of managers!

Semantics

  • Stronger than sentiment
  • Information Retrieval models quantify what analysts say

Uncertainty ("hedging" in linguistics)

 

  • Explains sector variables better than sentiment: volatility, housing market etc.

Economics go deeper

 

  • Framework of "disagreement of opinions"
  • When hard and soft disagree in polarity
    • Markets under-react
    • Price drift
    • Uncertainty takes time to be resolved
  • When hard and soft agree
    • Overreaction

To be released soon

Interesting open questions

  1. "Do analysts do their jobs?"
  2. "What is the role of central bank communication in reducing market uncertainty?"
  3. "Does sentiment explain changes in bond credit spread?"
  4. "Are markets efficient at resolving disagreement?"

Contact: zissis.poulos@rotman.utoronto.ca

ESC301 Guest Lecture

By zpoulos

This is the slide deck that I use for a quick introduction to the Decentralized Finance class.
