Machine Learning Applications in Finance: the Why, How, and What Now

Zissis Poulos
Excellence Club Discussion

Does Finance benefit from ML?

Big surge in some sectors

1. Risk Modeling/Management

2. Portfolio Management

3. Algorithmic Trading

4. Loan/insurance underwriting

5. Derivatives pricing

Slow adoption in others

1. Empirical asset pricing/forecasting

2. AML/CFT and fraud detection

3. Robo-advisors

A timeline

Early 60s - late 90s: quantitative methods behind the veil of corporations

1987/1998: Black Monday and the LTCM crash

"random fluctuations of the market viewed as random movements of particles"

linear/logistic regression, decision trees, basic NNs for forecasting

Early 2000s: Hedge funds wake up (AQR, Two Sigma, Renaissance), high-frequency trading

2011: The theory of Hard and Soft Information taken seriously

Role of language and communication in finance gains attention

2016+: Academia wakes up

2017-18: Pricing derivatives with deep NNs | "Deep Hedging" by JP Morgan and universities

2019-20: Risk-aware reinforcement learning for options hedging

2020: Basel Committee places more focus on risk, ignites new research

2022+: Data-driven models of volatility | Market simulators | Advanced NLP for asset pricing

Market Simulators

It's not only about price movements

Restricted Boltzmann Machine

Conditional VAE

Temporal Convolutions

Conditional GAN

Is it any good??

Conditional VAE

Two-sample tests on distributions

TSNE/PCA

Two-side forecasting

These are just sanity checks!!
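As a rough illustration of the two-sample check listed above, the sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical CDFs) for real versus simulated returns. The choice of the KS statistic, the function name and the Gaussian toy data are all assumptions made for illustration; the slides do not say which test was used.

```python
import bisect
import random


def ks_statistic(real, synthetic):
    """Largest absolute gap between the two empirical CDFs."""
    a, b = sorted(real), sorted(synthetic)
    gap = 0.0
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


# Toy data (hypothetical): "real" daily returns and two candidate simulators.
rng = random.Random(0)
real = [rng.gauss(0.0, 0.01) for _ in range(2000)]
good_sim = [rng.gauss(0.0, 0.01) for _ in range(2000)]  # same distribution
bad_sim = [rng.gauss(0.0, 0.03) for _ in range(2000)]   # volatility too high

ks_good = ks_statistic(real, good_sim)
ks_bad = ks_statistic(real, bad_sim)
print(f"KS good simulator: {ks_good:.3f}, KS bad simulator: {ks_bad:.3f}")
```

A small statistic only says the marginal distributions look alike; it says nothing about time dependence, which is why such checks remain sanity checks.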

Stylized Facts

Heavy tails (non-Gaussian)
Volatility clustering
The leverage effect
No autocorrelation of returns
Decreasing autocorrelation in absolute returns
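A minimal sketch of how some of these stylized facts could be checked on a return series. The toy GARCH(1,1) generator and its parameters are assumptions used only to produce data with volatility clustering; they are not the simulators discussed in the talk. A symmetric GARCH has no leverage effect, so that diagnostic is expected to be near zero here.

```python
import math
import random


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return cov / var


def excess_kurtosis(x):
    """Kurtosis minus 3; positive values indicate heavier-than-Gaussian tails."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3.0


def leverage_corr(x, lag=1):
    """Correlation between today's return and the absolute return lag days later."""
    a, b = x[:-lag], [abs(v) for v in x[lag:]]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    sa = math.sqrt(sum((u - ma) ** 2 for u in a))
    sb = math.sqrt(sum((v - mb) ** 2 for v in b))
    return cov / (sa * sb)


def simulate_garch(n, omega=1e-6, alpha=0.10, beta=0.85, seed=0):
    """Toy GARCH(1,1) returns with Gaussian shocks (hypothetical parameters)."""
    rng = random.Random(seed)
    var = omega / (1.0 - alpha - beta)  # start at the long-run variance
    out = []
    for _ in range(n):
        r = math.sqrt(var) * rng.gauss(0.0, 1.0)
        out.append(r)
        var = omega + alpha * r * r + beta * var
    return out


returns = simulate_garch(20000)
abs_returns = [abs(r) for r in returns]
report = {
    "excess_kurtosis": excess_kurtosis(returns),      # heavy tails
    "acf_returns_lag1": autocorr(returns, 1),         # no autocorrelation
    "acf_abs_lag1": autocorr(abs_returns, 1),         # volatility clustering
    "acf_abs_lag50": autocorr(abs_returns, 50),       # decaying with lag
    "leverage_lag1": leverage_corr(returns, 1),       # leverage effect
}
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```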

Did you really learn the DGP?? No...

Price movements are a small piece of the puzzle

The Implied Volatility Surface

European Options

Some key points

Smile/skew/smirk
Different structure per asset
IVS time dynamics vary
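For readers new to the term, each point on an implied volatility surface is the volatility that makes a pricing model reproduce a quoted option price. The sketch below assumes the Black-Scholes formula for a European call and a simple bisection search; the function names and numerical inputs are illustrative assumptions, not taken from the slides.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, maturity):
    """Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def implied_vol(price, spot, strike, rate, maturity, lo=1e-4, hi=5.0, tol=1e-8):
    """Bisection on volatility; relies on the call price increasing in vol."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, rate, mid, maturity) > price:
            hi = mid  # model price too high, so volatility is too high
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip: price an option at 20% volatility, then recover that volatility.
quote = bs_call(spot=100.0, strike=110.0, rate=0.01, vol=0.20, maturity=0.5)
recovered = implied_vol(quote, spot=100.0, strike=110.0, rate=0.01, maturity=0.5)
print(f"quote={quote:.4f}, implied vol={recovered:.4f}")
```

Repeating this over a grid of strikes and maturities gives the surface; the smile or skew is the fact that market quotes do not map back to a single flat volatility.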

VAEs are good at modelling "static" IVS

Complete partially observed surfaces

Better than PCA/interpolation

Can generate entirely synthetic surfaces

Arbitrage-free

Maxime Bergeron, Nicholas Fung, John Hull, Zissis Poulos and Andreas Veneris, "Variational Autoencoders: A Hands-Off Approach to Volatility", The Journal of Financial Data Science, Spring 2022, 4 (2), 125-138.

Proof of Concept - Currency Pairs

What you need is Joint Models

Most recent work

Functional PC

Neural SDEs

Choudhary, Vedant and Jaimungal, Sebastian and Bergeron, Maxime, FuNVol: A Multi-Asset Implied Volatility Market Simulator using Functional Principal Components and Neural SDEs (March 3, 2023)

Open Questions

Standardization for quality of generated data is required
Data-driven (ML) vs. Parametric model comparison
- Parametric models are also improving now*
Simulators have not been rigorously evaluated on downstream tasks
- Portfolio optimization
- Derivatives hedging
- Algorithmic trading
No good simulators for intra-day data

*Francois, Pascal and Galarneau-Vincent, Rémi and Gauthier, Genevieve and Godin, Frédéric, Joint Dynamics For The Underlying Asset and Its Implied Volatility Surface: A New Methodology For Option Risk Management (January 7, 2023). Available at SSRN: https://ssrn.com/abstract=4319972

Deep Hedging

What is hedging of options?

Traders are required to protect portfolios against unfavourable market movements
Sensitivity to market movements is measured using the "Greeks"
- Delta: first derivative of option value wrt asset price
  - Sensitivity to small moves of underlying
- Gamma: second derivative of option value wrt asset price
  - Sensitivity to wider range of moves
- Vega: first derivative of option value wrt asset volatility
In complete markets with zero frictions and continuous hedging we have a toolbox in mathematical finance to find the OPTIMAL hedge.
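To make the Greeks concrete, here is a minimal sketch that evaluates delta, gamma and vega for a European call. It assumes the Black-Scholes model with no dividends, which the slides do not name explicitly; the function names are illustrative.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_greeks(spot, strike, rate, vol, maturity):
    """Return (delta, gamma, vega) of a European call under Black-Scholes."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    delta = norm_cdf(d1)                          # dV/dS
    gamma = norm_pdf(d1) / (spot * vol * sqrt_t)  # d2V/dS2
    vega = spot * norm_pdf(d1) * sqrt_t           # dV/dsigma
    return delta, gamma, vega


delta, gamma, vega = call_greeks(spot=100.0, strike=100.0, rate=0.0, vol=0.2, maturity=1.0)
print(f"delta={delta:.4f}, gamma={gamma:.5f}, vega={vega:.3f}")
```

In the frictionless setting, holding `-delta` units of the underlying against a long call neutralizes small moves; gamma and vega can only be offset with other options.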

REALITY: markets are incomplete, there are frictions (costs, price impact, illiquidity) and hedging is discrete

What do traders do then?

Partial hedging
Based on heuristics, intuition and experience
Calls for automation

BUT

Searching the space of partial hedge policies is intractable
- Stochastic environment
- Decisions made on daily basis or more often
- Every decision affects future decisions
- Hedging horizons can be long (30/60/90/120 days)

Sequential decision making in stochastic environments?

Standard RL objective:

"maximize expected future return"

State

Action

Reward

market variables + current position

position to take in hedging instrument

Profit&Loss (P&L) - Transaction Costs
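The state/action/reward mapping above can be sketched as a single hedging step. Everything here is an illustrative assumption: the class and field names, the use of the underlying asset as the hedging instrument, and a transaction cost proportional to the traded value.

```python
from dataclasses import dataclass


@dataclass
class HedgeState:
    asset_price: float     # market variable
    option_value: float    # value of the option position being hedged
    hedge_position: float  # units of the hedging instrument currently held


def step_reward(state, action, next_asset_price, next_option_value, cost_rate=0.001):
    """Reward for one step: P&L of the hedged book minus transaction costs.

    `action` is the position to hold in the hedging instrument over the step.
    """
    trade = action - state.hedge_position
    cost = cost_rate * abs(trade) * state.asset_price
    pnl = (next_option_value - state.option_value) + action * (
        next_asset_price - state.asset_price
    )
    return pnl - cost


# Short one call worth 5.0; buy 0.5 shares; the asset then moves from 100 to 101.
state = HedgeState(asset_price=100.0, option_value=-5.0, hedge_position=0.0)
reward = step_reward(state, action=0.5, next_asset_price=101.0, next_option_value=-5.5)
print(f"reward={reward:.4f}")
```

In this example the hedge offsets the option loss exactly, so the reward is just the cost of putting the hedge on.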

Does this objective work in a finance setting?

NEVER!

Trading desks care about

Mean-Variance
Value-at-Risk (VaR)
Conditional VaR (CVaR)
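A minimal sketch of these three measures, estimated empirically from a sample of hedging losses. The sign convention (losses are positive), the 95% level, the mean-plus-variance form of the objective and the Gaussian toy data are all assumptions; desks and papers differ on each.

```python
import math
import random
import statistics


def value_at_risk(losses, level=0.95):
    """Empirical VaR: the loss that is not exceeded with probability `level`."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, math.ceil(level * len(ordered)) - 1)
    return ordered[index]


def conditional_value_at_risk(losses, level=0.95):
    """Empirical CVaR: average of the losses at or beyond the VaR."""
    var = value_at_risk(losses, level)
    tail = [x for x in losses if x >= var]
    return sum(tail) / len(tail)


def mean_variance(losses, risk_aversion=1.0):
    """One common mean-variance objective (to be minimized)."""
    return statistics.fmean(losses) + risk_aversion * statistics.pvariance(losses)


# Toy losses (hypothetical): negative of simulated hedging P&L.
rng = random.Random(0)
losses = [-rng.gauss(0.1, 1.0) for _ in range(5000)]

var_95 = value_at_risk(losses)
cvar_95 = conditional_value_at_risk(losses)
print(f"mean-variance={mean_variance(losses):.3f}, VaR={var_95:.3f}, CVaR={cvar_95:.3f}")
```

None of these is an expectation of summed rewards, which is why the standard RL objective does not fit directly.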

Standard RL formulation does not work here

Distributional Reinforcement Learning

If you can estimate the future return distribution, you can take actions that optimize the risk-return tradeoff

Distributions and actions (policy) approximated via NNs
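One way to estimate a return distribution is to learn a set of its quantiles by quantile regression, which is presumably what the "QR" in the method name below refers to. The sketch shows only that building block: the quantile (pinball) loss and a stochastic subgradient fit of three quantiles to sampled returns. The function names, learning rate and Gaussian toy data are assumptions; in a deep RL agent the quantiles would be outputs of a neural network rather than free scalars.

```python
import random


def pinball_loss(prediction, sample, tau):
    """Quantile-regression loss of one sample at quantile level tau."""
    error = sample - prediction
    return tau * error if error >= 0 else (tau - 1.0) * error


def fit_quantiles(samples, taus, lr=0.01, epochs=20, seed=0):
    """Fit one scalar per quantile level by stochastic subgradient descent."""
    rng = random.Random(seed)
    estimates = [0.0 for _ in taus]
    for _ in range(epochs * len(samples)):
        x = rng.choice(samples)
        for i, tau in enumerate(taus):
            # Negative subgradient of the pinball loss w.r.t. the prediction:
            # move up by tau if the sample is above, down by (1 - tau) otherwise.
            estimates[i] += lr * (tau if x > estimates[i] else tau - 1.0)
    return estimates


# Toy "future returns" (hypothetical): standard normal draws.
rng = random.Random(1)
returns = [rng.gauss(0.0, 1.0) for _ in range(2000)]
q10, q50, q90 = fit_quantiles(returns, taus=[0.1, 0.5, 0.9])
print(f"q10={q10:.2f}, q50={q50:.2f}, q90={q90:.2f}")
```

With the quantiles in hand, tail measures such as VaR or CVaR can be read off or averaged from the learned values, which is what makes risk-aware action selection possible.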

D4PG-QR

Cao, Jay and Chen, Jacky and Farghadani, Soroush and Hull, John C. and Poulos, Zissis and Wang, Zeyu and Yuan, Jun, Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning, Frontiers in Artificial Intelligence, Sec. Artificial Intelligence in Finance, Volume 6 (2023)

Outperforming the basic automated strategies

Open Questions

Robustness to model misspecification?
Realistic synthetic data for training
- Market simulators can probably help
Adaptive agents that model their own uncertainty
Dynamic risk measures and time-consistent policies
- Active field of research

ML and Soft Information

Hard vs. Soft Information in Finance

Hard Information

Price action (returns, volatility)
Company earnings
Company 10K filings
Analyst Forecasts
- "Earnings Surprise"

Easy to quantify and externally verify

Soft Information

Company press release
Management communication
Earnings calls
News and social media
Etc.

Qualitative

Literature agrees: earnings surprise only explains 10% of variance in asset prices

What else drives price formation and how??

From static dictionaries to ML

Loughran & McDonald (2011)

"When is a liability not a liability?"

The LM Lexicon
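A toy sketch of the static-dictionary approach and of why a finance-specific word list matters. Both word lists below are tiny hypothetical stand-ins invented for illustration; they are not the actual LM lexicon or any general-purpose dictionary.

```python
import re

# Hypothetical toy lexicons, for illustration only.
GENERAL_NEGATIVE = {"liability", "tax", "cost", "loss", "decline"}
FINANCE_NEGATIVE = {"loss", "decline", "impairment", "litigation"}


def negative_share(text, lexicon):
    """Fraction of tokens in the text that appear in the negative word list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return sum(token in lexicon for token in tokens) / len(tokens)


sentence = "The company recorded a tax liability and lower cost of capital."
general_score = negative_share(sentence, GENERAL_NEGATIVE)
finance_score = negative_share(sentence, FINANCE_NEGATIVE)
print(f"general={general_score:.3f}, finance={finance_score:.3f}")
```

Words such as "liability", "tax" and "cost" are routine in financial text, so a general-purpose list flags a neutral sentence as negative while the finance-specific list does not.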

Garcia and Rohrer (2022)

"The colour of finance words"

Multinomial Inverse Regression

From basic ML to advanced NLP

...and beyond sentiment, to semantics

FinBERT

https://huggingface.co/ProsusAI/finbert

Things we achieved with such models

Sentiment

Explains short-term market reactions
Does not explain macro variables well
Depends on source
- Sentiment of analysts matters more than sentiment of managers!

Semantics

Stronger than sentiment
Information Retrieval models quantify what analysts say

Uncertainty ("hedging" in linguistics)

Explains sector variables better than sentiment: volatility, housing market etc.

Economics go deeper

Framework of "disagreement of opinions"
When hard and soft disagree in polarity
- Markets under-react
- Price drift
- Uncertainty takes time to be resolved
When hard and soft agree
- Overreaction

To be released soon

Interesting open questions

"Do analysts do their jobs?"
"What is the role of central bank communication in reducing market uncertainty?"
"Does sentiment explain changes in bond credit spread?"
"Are markets efficient at resolving disagreement?"

Contact: zissis.poulos@rotman.utoronto.ca

AI Collusion

The Problem: Emergent Collusion from Learning Agents

Learning agents (RL) experiment, observe profits, and adapt.
In repeated competitive markets, they can converge to high-price strategies.
No communication, no intent
- just learning dynamics drifting into tacit collusion

Market Price

Profits

Antitrust assumes collusion requires communication or intent:
- AI breaks that assumption.
Pricing algorithms can sustain high-price equilibria that are economically indistinguishable from cartel behavior.
Undercutting gets punished automatically, because agents learn to retaliate.
Regulators have no clear legal framework for algorithmic tacit collusion.

Repeated interactions → agents learn the consequences of price moves.
Low exploration → they settle into stable high-price patterns.
Price Trigger: Deviations (one agent lowering price) trigger learned punishment.
Market-level signals (prices, profits, demand) act as implicit communication channels.
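The ingredients above (repeated interaction, decaying exploration, state built from observed prices) can be put together in a very small experiment. The sketch below is an assumption-laden toy, not the setup behind the results in this talk: two tabular Q-learners, a hypothetical linear demand curve, a four-point price grid and a state equal to last period's price pair. Whether prices settle above the competitive level depends on the parameters and the seed.

```python
import math
import random

# Hypothetical price grid. For the demand below, the one-shot Nash price is 1.0
# and the joint-profit-maximizing price is 1.5.
PRICES = [1.0, 1.25, 1.5, 1.75]


def profit(own_price, rival_price):
    """Per-period profit with zero cost and demand 3 - 2*own + rival."""
    demand = max(0.0, 3.0 - 2.0 * own_price + rival_price)
    return own_price * demand


def simulate(steps=20000, alpha=0.1, gamma=0.95, decay=5e-4, seed=0):
    """Run two epsilon-greedy Q-learners; return the average late-stage price."""
    rng = random.Random(seed)
    n = len(PRICES)
    states = [(i, j) for i in range(n) for j in range(n)]
    q_tables = [{s: [0.0] * n for s in states} for _ in range(2)]
    state = (0, 0)  # last period's price indices act as the shared signal
    prices_seen = []
    for t in range(steps):
        epsilon = math.exp(-decay * t)  # exploration fades over time
        actions = []
        for table in q_tables:
            if rng.random() < epsilon:
                actions.append(rng.randrange(n))
            else:
                values = table[state]
                actions.append(values.index(max(values)))
        next_state = (actions[0], actions[1])
        for k, table in enumerate(q_tables):
            reward = profit(PRICES[actions[k]], PRICES[actions[1 - k]])
            target = reward + gamma * max(table[next_state])
            table[state][actions[k]] += alpha * (target - table[state][actions[k]])
        state = next_state
        prices_seen.append(0.5 * (PRICES[actions[0]] + PRICES[actions[1]]))
    tail = prices_seen[-1000:]
    return sum(tail) / len(tail)


avg_price = simulate()
print(f"average price over the last 1000 periods: {avg_price:.3f}")
```

Neither agent sees the other's Q-table or sends a message; the only link between them is the observed price pair.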

Δ - cartel "score": 1 means perfect cartel, 0 means competitive Nash

Noise level (retail traders)
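A score with the endpoints described above is commonly defined as a normalized profit gain: the realized profit measured relative to the competitive Nash profit, divided by the gap between the cartel and Nash profits. The slides do not spell out the formula, so this definition, the function name and the example numbers are assumptions.

```python
def collusion_index(avg_profit, nash_profit, cartel_profit):
    """Normalized profit gain: 0 at the Nash profit, 1 at the cartel profit."""
    return (avg_profit - nash_profit) / (cartel_profit - nash_profit)


# Hypothetical per-firm profits: Nash 2.0, perfect cartel 2.25.
halfway = collusion_index(avg_profit=2.125, nash_profit=2.0, cartel_profit=2.25)
print(f"delta={halfway:.2f}")
```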

Open Questions

1. Is over-pruning bias the real culprit?

Does collusion disappear if we remove function-approximation pathologies?
Or do agents still gravitate toward high-price equilibria even with “perfect” value estimates?

2. What if agents are risk-averse, not pure profit-maximizers?

Does CVaR/RL risk aversion push agents toward competition or toward stabilizing collusion?
How do volatility, uncertainty, and tail-risk perception reshape strategies?

3. Can mitigation be embedded directly into the RL agent?

Reward shaping, exploration policies, fairness penalties.
Can we design “competition-preserving” RL?

4. How much observability is enough to trigger collusion?

Price only? Profits? Competitor actions?
What is the minimal information channel required for tacit coordination?

5. Does algorithmic homogeneity amplify collusion?

If every firm uses similar RL models, architectures, and data pipelines, is collusion inevitable?
Does diversity of algorithms restore competitive behavior?