Machine Learning Applications in Finance: the Why, How, and What Now
Zissis Poulos
Excellence Club Discussion
Does Finance benefit from ML?
Big surge in some sectors
2. Portfolio Management
3. Algorithmic Trading
4. Loan/insurance underwriting
5. Derivatives pricing
Slow adoption in others
1. Empirical asset pricing/forecasting
2. AML/CFT and fraud detection
3. Robo-advisors
A timeline
early 60s - late 90s: quantitative methods behind the veil of corporations
2011: The theory of Hard and Soft Information taken seriously
Role of language and communication in finance gains attention
Early 2000s: Hedge Funds wake up (AQR, Two Sigma, Renaissance), High-frequency Trading
2019-20: Risk-aware reinforcement learning for options hedging
2017-18: Pricing derivatives with deep NNs | "Deep Hedging" by JP Morgan and universities
2016+: Academia wakes up
1987/1998: Black Monday and the LTCM crash.
"random fluctuations of the market viewed as random movements of particles"
linear/logistic regression, decision trees, basic NNs for forecasting
2020: Basel Committee places more focus on risk, ignites new research
2022+: Data-driven models of volatility | Market simulators | Advanced NLP for asset pricing
Market Simulators
It's not only about price movements
Restricted Boltzmann Machine
Conditional VAE
Temporal Convolutions
Conditional GAN
Is it any good??
Conditional VAE
Two-sample tests on distributions
t-SNE/PCA
Two-side forecasting
These are just sanity checks!!
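A two-sample test of the kind listed above can be sketched in a few lines. This is only an illustration, assuming the Kolmogorov-Smirnov statistic as the test and Gaussian toy returns as data; neither choice comes from the slides, and the names `ks_statistic`, `real`, `good_fake` and `bad_fake` are hypothetical.

```python
import random

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Move both pointers past every observation <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

# Toy data (assumption): "real" returns vs. two hypothetical simulators.
rng = random.Random(0)
real = [rng.gauss(0.0, 0.01) for _ in range(2000)]
good_fake = [rng.gauss(0.0, 0.01) for _ in range(2000)]  # same distribution
bad_fake = [rng.gauss(0.0, 0.03) for _ in range(2000)]   # volatility too high

d_good = ks_statistic(real, good_fake)
d_bad = ks_statistic(real, bad_fake)
print(d_good, d_bad)
```

A small statistic only says the marginal distributions look alike; it says nothing about dynamics, which is why this remains a sanity check.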
Stylized Facts
Did you really learn the DGP?? No...
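Two commonly cited stylized facts of returns are heavy tails and volatility clustering. A rough check on a simulated series could look like the sketch below; the choice of these two facts and the helper names `excess_kurtosis` and `autocorr` are assumptions for illustration, not taken from the slides.

```python
def excess_kurtosis(x):
    """Excess kurtosis (0 for a Gaussian); positive values indicate heavy tails."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / var ** 2 - 3.0

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    m = sum(x) / n
    var = sum((v - m) ** 2 for v in x)
    cov = sum((x[t] - m) * (x[t + lag] - m) for t in range(n - lag))
    return cov / var

def volatility_clustering(returns, lag=1):
    """Autocorrelation of absolute returns; typically positive in real markets."""
    return autocorr([abs(r) for r in returns], lag)
```

Matching these statistics is necessary but not sufficient: a simulator can reproduce them without having learned the data-generating process.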
Price movements are a small piece of the puzzle
The Implied Volatility Surface
European Options
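Each point on an implied volatility surface is the volatility that makes a pricing model match an observed option price. A minimal sketch, assuming the Black-Scholes model for a European call with no dividends and a simple bisection solver (both assumptions, as are the function names):

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Bisection on sigma; works because the call price is increasing in sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip: price an option at 25% volatility, then recover that volatility.
price = bs_call(100.0, 100.0, 1.0, 0.02, 0.25)
print(implied_vol(price, 100.0, 100.0, 1.0, 0.02))
```

Repeating this inversion across strikes and maturities gives the grid of values that makes up the surface.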
Some key points
VAEs are good at modelling "static" IVS
Complete partially observed surfaces
Better than PCA/interpolation
Can generate entirely synthetic surfaces
Arbitrage-free
Maxime Bergeron, Nicholas Fung, John Hull, Zissis Poulos and Andreas Veneris, "Variational Autoencoders: A Hands-Off Approach to Volatility",
The Journal of Financial Data Science, Spring 2022, 4 (2) 125-138
Proof of Concept - Currency Pairs
What you need is Joint Models
Most recent work
Functional PC
Neural SDEs
Choudhary, Vedant and Jaimungal, Sebastian and Bergeron, Maxime, FuNVol: A Multi-Asset Implied Volatility Market Simulator using Functional Principal Components and Neural SDEs (March 3, 2023)
Open Questions
*Francois, Pascal and Galarneau-Vincent, Rémi and Gauthier, Genevieve and Godin, Frédéric, Joint Dynamics For The Underlying Asset and Its Implied Volatility Surface: A New Methodology For Option Risk Management (January 7, 2023). Available at SSRN: https://ssrn.com/abstract=4319972
Deep Hedging
What is hedging of options?
Traders are required to protect portfolios against unfavourable market movements
Sensitivity to market movements is measured using the "Greeks"
Delta: first derivative of option value wrt asset price
Sensitivity to small moves of underlying
Gamma: second derivative of option value wrt asset price
Sensitivity to wider range of moves
Vega: first derivative of option value wrt asset volatility
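For a concrete feel of the three Greeks above, here is a sketch that evaluates them in closed form. It assumes the Black-Scholes model and a European call with no dividends, which the slides do not specify; the function names are illustrative.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_greeks(S, K, T, r, sigma):
    """Delta, gamma and vega of a European call under Black-Scholes."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    delta = norm_cdf(d1)                               # dV/dS
    gamma = norm_pdf(d1) / (S * sigma * math.sqrt(T))  # d2V/dS2
    vega = S * norm_pdf(d1) * math.sqrt(T)             # dV/dsigma
    return delta, gamma, vega

delta, gamma, vega = bs_call_greeks(S=100.0, K=100.0, T=0.25, r=0.0, sigma=0.2)
print(delta, gamma, vega)
```

Delta can be neutralized by trading the underlying asset alone, while gamma and vega require trading other options, which is where transaction costs start to matter.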
REALITY: markets are incomplete, there are frictions (costs, price impact, illiquidity) and hedging is discrete
What do traders do then?
Partial hedging
Based on heuristics, intuition and experience
Calls for automation
BUT
Searching the space of partial hedge policies is intractable
Stochastic environment
Decisions made on a daily basis or more often
Every decision affects future decisions
Hedging horizons can be long (30/60/90/120 days)
Sequential decision making in stochastic environments?
Standard RL objective:
"maximize expected future return"
State: market variables + current position
Action: position to take in hedging instrument
Reward: Profit & Loss (P&L) - Transaction Costs
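The state/action/reward mapping above can be made concrete with a one-step reward function. This is a sketch under stated assumptions: transaction costs are proportional to the traded notional, the hedge is rebalanced at the current price before the market moves, and all names (`HedgeState`, `step_reward`, `cost_rate`) are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class HedgeState:
    asset_price: float     # market variable
    option_value: float    # value of the option position (negative if short)
    hedge_position: float  # current holding in the hedging instrument

def step_reward(state, new_hedge_position, new_asset_price, new_option_value,
                cost_rate=0.001):
    """Reward = one-period P&L minus proportional transaction costs (assumed form)."""
    trade = new_hedge_position - state.hedge_position
    cost = cost_rate * abs(trade) * state.asset_price
    pnl = (new_option_value - state.option_value) \
        + new_hedge_position * (new_asset_price - state.asset_price)
    return pnl - cost

# Short option worth -5, buy 0.5 units of the asset, then the asset moves 100 -> 102.
s = HedgeState(asset_price=100.0, option_value=-5.0, hedge_position=0.0)
print(step_reward(s, new_hedge_position=0.5, new_asset_price=102.0,
                  new_option_value=-6.0))
```

In this example the hedge gain offsets the option loss exactly, so the reward is just the negative of the trading cost.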
Does this objective work in a finance setting?
NEVER!
Trading desks care about
Standard RL formulation does not work here
Distributional Reinforcement Learning
If you can estimate the future return distribution, you can take actions that optimize the risk-return tradeoff
Distributions and actions (policy) approximated via NNs
D4PG-QR
Cao, Jay and Chen, Jacky and Farghadani, Soroush and Hull, John C. and Poulos, Zissis and Wang, Zeyu and Yuan, Jun,
Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning, Frontiers in Artificial Intelligence, Sec. Artificial Intelligence in Finance Volume 6 (2023)
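Once a return distribution is available per action, for instance as a set of equally weighted quantiles, choosing an action by a risk-aware score is straightforward. The sketch below is illustrative only: the mean-minus-std and CVaR scores, the quantile values and all names are assumptions, not the objective or outputs of the cited paper.

```python
import math

def mean_std_score(quantiles, c=1.5):
    """Mean return penalized by c standard deviations."""
    n = len(quantiles)
    m = sum(quantiles) / n
    std = math.sqrt(sum((q - m) ** 2 for q in quantiles) / n)
    return m - c * std

def cvar_score(quantiles, alpha=0.25):
    """Average of the worst alpha-fraction of equally weighted return quantiles."""
    q = sorted(quantiles)
    k = max(1, int(round(alpha * len(q))))
    return sum(q[:k]) / k

def expected_return(quantiles):
    """The standard RL criterion: the mean of the distribution."""
    return sum(quantiles) / len(quantiles)

def pick_action(quantiles_by_action, score):
    """Choose the action whose estimated return distribution scores highest."""
    return max(quantiles_by_action, key=lambda a: score(quantiles_by_action[a]))

# Hypothetical quantile estimates for two candidate actions.
estimates = {"hedge": [-1.0, 0.0, 1.0, 2.0], "no_hedge": [-10.0, 0.0, 5.0, 15.0]}
print(pick_action(estimates, expected_return))  # risk-neutral choice
print(pick_action(estimates, mean_std_score))   # risk-aware choice
print(pick_action(estimates, cvar_score))       # tail-risk-aware choice
```

The risk-neutral criterion prefers the action with the higher mean, while both risk-aware scores prefer the action with the narrower distribution, which is the behaviour a trading desk typically wants.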
Outperforming the basic automated strategies
Open Questions
ML and Soft Information
Hard vs. Soft Information in Finance
Hard Information
Easy to quantify and externally verify
Soft Information
Qualitative
Literature agrees: earnings surprise only explains 10% of variance in asset prices
What else drives price formation and how??
From static dictionaries to ML
Loughran & McDonald (2011)
"When is a liability not a liability?"
The LM Lexicon
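A static-dictionary tone score is simple to compute: count the words that appear in a positive list and in a negative list, then normalize by document length. The sketch below uses tiny made-up word lists purely for illustration; they are not the LM Lexicon, and the scoring formula is one common convention assumed here.

```python
import re

# Hypothetical mini word lists (NOT the actual Loughran-McDonald lists).
POSITIVE = {"gain", "improve", "strong", "growth"}
NEGATIVE = {"loss", "impairment", "litigation", "decline"}

def tone(text):
    """(positive count - negative count) / total tokens; 0.0 for empty text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

print(tone("Strong growth offset the loss"))
```

A fixed list cannot see context, so a word like "liability" is scored the same way whether it signals bad news or is routine accounting language, which is the weakness that motivates finance-specific lexicons and later ML approaches.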
Garcia and Rohrer (2022)
"The colour of finance words"
Multinomial Inverse Regression
From basic ML to advanced NLP
...and beyond sentiment, to semantics
FIN
https://huggingface.co/ProsusAI/finbert
Things we achieved with such models
Sentiment
Semantics
Uncertainty ("hedging" in linguistics)
Economics go deeper
To be released soon
Interesting open questions
Contact: zissis.poulos@rotman.utoronto.ca
AI Collusion
The Problem: Emergent Collusion from Learning Agents
Learning agents (RL) experiment, observe profits, and adapt.
In repeated competitive markets, they can converge to high-price strategies.
No communication, no intent
just learning dynamics drifting into tacit collusion
Market Price
Profits
Antitrust assumes collusion requires communication or intent.
AI breaks that assumption.
Pricing algorithms can sustain high-price equilibria that are economically indistinguishable from cartel behavior.
Undercutting gets punished automatically, because agents learn to retaliate.
Regulators have no clear legal framework for algorithmic tacit collusion.
Repeated interactions → agents learn the consequences of price moves.
Low exploration → they settle into stable high-price patterns.
Price Trigger: Deviations (one agent lowering price) trigger learned punishment.
Market-level signals (prices, profits, demand) act as implicit communication channels.
Δ - cartel "score": 1 means perfect cartel, 0 means competitive Nash
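A score with these endpoints is often computed by normalizing observed average profit between the competitive and the fully collusive benchmark. The sketch below assumes that normalization; the slides do not give the formula, and the function and argument names are hypothetical.

```python
def cartel_score(avg_profit, nash_profit, cartel_profit):
    """Delta = (avg - Nash) / (cartel - Nash): 0 at competitive Nash, 1 at perfect cartel."""
    if cartel_profit == nash_profit:
        raise ValueError("benchmarks must differ for the score to be defined")
    return (avg_profit - nash_profit) / (cartel_profit - nash_profit)

print(cartel_score(avg_profit=3.0, nash_profit=2.0, cartel_profit=4.0))
```

Values between 0 and 1 indicate partial collusion, and values below 0 would mean agents earn less than the competitive benchmark.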
Noise level (retail traders)
Open Questions
1. Is over-pruning bias the real culprit?
Does collusion disappear if we remove function-approximation pathologies?
Or do agents still gravitate toward high-price equilibria even with “perfect” value estimates?
2. What if agents are risk-averse, not pure profit-maximizers?
Does CVaR/RL risk aversion push agents toward competition or toward stabilizing collusion?
How do volatility, uncertainty, and tail-risk perception reshape strategies?
3. Can mitigation be embedded directly into the RL agent?
Reward shaping, exploration policies, fairness penalties.
Can we design “competition-preserving” RL?
4. How much observability is enough to trigger collusion?
Price only? Profits? Competitor actions?
What is the minimal information channel required for tacit coordination?
5. Does algorithmic homogeneity amplify collusion?
If every firm uses similar RL models, architectures, and data pipelines, is collusion inevitable?
Does diversity of algorithms restore competitive behavior?