Machine Learning Applications in Finance: the Why, How, and What Now
Zissis Poulos
Excellence Club Discussion

Does Finance benefit from ML?
Big surge in some sectors
1. Risk Modeling/Management
2. Portfolio Management
3. Algorithmic Trading
4. Loan/insurance underwriting
5. Derivatives pricing
Slow adoption in others
1. Empirical asset pricing/forecasting
2. AML/CFT and fraud detection
3. Robo-advisors
A timeline
early 60s - late 90s: quantitative methods behind the veil of corporations
1987/1998: Black Monday and the LTCM crash.
"random fluctuations of the market viewed as random movements of particles"
linear/logistic regression, decision trees, basic NNs for forecasting
Early 2000s: Hedge Funds wake up (AQR, Two Sigma, Renaissance), High-frequency Trading
2011: The theory of Hard and Soft Information taken seriously
Role of language and communication in finance gains attention
2016+: Academia wakes up
2017-18: Pricing derivatives with deep NNs | "Deep Hedging" by JP Morgan and universities
2019-20: Risk-aware reinforcement learning for options hedging
2020: Basel Committee places more focus on risk, ignites new research
2022+: Data-driven models of volatility | Market simulators | Advanced NLP for asset pricing
Market Simulators
It's not only about price movements


Restricted Boltzmann Machine

Conditional VAE


Temporal Convolutions

Conditional GAN



Is it any good??

Conditional VAE


Two-sample tests on distributions
t-SNE/PCA
Two-side forecasting
These are just sanity checks!!
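One concrete form of the two-sample check is sketched below: the Kolmogorov-Smirnov statistic, i.e. the largest gap between the empirical CDFs of real and generated returns. The KS statistic is just one possible choice (the slides do not name a specific test), and all three samples are simulated placeholders rather than market data.

```python
import bisect
import random

def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the empirical CDFs of two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:
        # Empirical CDF at x = fraction of observations <= x.
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

random.seed(0)
# Placeholder data: "real" returns and two hypothetical generators.
real = [random.gauss(0.0, 0.01) for _ in range(2000)]
good_fake = [random.gauss(0.0, 0.01) for _ in range(2000)]  # same distribution
bad_fake = [random.gauss(0.0, 0.03) for _ in range(2000)]   # volatility too high

print("KS(real, good_fake) =", round(ks_statistic(real, good_fake), 3))
print("KS(real, bad_fake)  =", round(ks_statistic(real, bad_fake), 3))
```

A small statistic only says the marginal distributions look alike. It says nothing about time dependence, which is why such checks are necessary but not sufficient.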
Stylized Facts
- Heavy tails (non-Gaussian)
- Volatility clustering
- The leverage effect
- No autocorrelation of returns
- Decreasing autocorrelation in absolute returns
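The stylized facts above can be turned into simple summary statistics. The sketch below is a minimal version using only the standard library. The return series comes from a toy GARCH(1,1) process with made-up parameters, chosen only because it produces volatility clustering. It is an assumption of this example, not a model from the talk.

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def lagged_corr(x, y, lag):
    """Correlation between x[t] and y[t + lag], for lag >= 1."""
    return corr(x[:-lag], y[lag:])

def excess_kurtosis(x):
    """Sample kurtosis minus 3 (zero for a Gaussian)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2) - 3.0

def stylized_facts(returns, lag=1, far_lag=50):
    abs_r = [abs(r) for r in returns]
    return {
        "excess_kurtosis": excess_kurtosis(returns),                # heavy tails: > 0
        "acf_returns": lagged_corr(returns, returns, lag),          # expected near 0
        "acf_abs_returns": lagged_corr(abs_r, abs_r, lag),          # clustering: > 0
        "acf_abs_returns_far": lagged_corr(abs_r, abs_r, far_lag),  # expected to decay
        "leverage": lagged_corr(returns, abs_r, lag),               # < 0 in equity data
    }

# Toy GARCH(1,1) returns; omega, alpha, beta are illustrative values only.
random.seed(1)
omega, alpha, beta = 1e-6, 0.15, 0.80
variance = omega / (1 - alpha - beta)  # start at the unconditional variance
returns = []
for _ in range(5000):
    r = math.sqrt(variance) * random.gauss(0.0, 1.0)
    returns.append(r)
    variance = omega + alpha * r * r + beta * variance

facts = stylized_facts(returns)
for name, value in facts.items():
    print(f"{name}: {value:.3f}")
```

The toy process is symmetric, so it has no built-in leverage effect and the "leverage" entry should hover around zero. A generator can match every number here on real data and still fail to capture the true dynamics.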
Did you really learn the data-generating process (DGP)?? No...
Price movements are a small piece of the puzzle
The Implied Volatility Surface

European Options


Some key points
- Smile/skew/smirk
- Different structure per asset
- IVS time dynamics vary
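Each point on an implied volatility surface is the volatility that makes a pricing formula reproduce a quoted option price. The sketch below shows this inversion for a single European call, assuming the Black-Scholes formula with no dividends and a simple bisection search. Function names and the numeric inputs are illustrative, not taken from the talk.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, maturity, rate, vol):
    """Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def implied_vol(price, spot, strike, maturity, rate, lo=1e-4, hi=5.0, tol=1e-8):
    """Bisection works because the call price is increasing in volatility."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(spot, strike, maturity, rate, mid) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip on made-up inputs: price a call at 25% vol, then recover the vol.
quote = bs_call(spot=100.0, strike=110.0, maturity=0.5, rate=0.01, vol=0.25)
print(round(implied_vol(quote, 100.0, 110.0, 0.5, 0.01), 6))
```

Repeating the inversion over a grid of strikes and maturities gives the surface. If Black-Scholes held exactly, the surface would be flat, and the smile/skew is the market's departure from that.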
VAEs are good at modelling "static" IVS

Complete partially observed surfaces
Better than PCA/interpolation
Can generate entirely synthetic surfaces
Arbitrage-free
Maxime Bergeron, Nicholas Fung, John Hull, Zissis Poulos and Andreas Veneris, "Variational Autoencoders: A Hands-Off Approach to Volatility",
The Journal of Financial Data Science, Spring 2022, 4 (2) 125-138.
Proof of Concept - Currency Pairs


What you need is Joint Models


Most recent work

Functional PC
Neural SDEs

Choudhary, Vedant and Jaimungal, Sebastian and Bergeron, Maxime, FuNVol: A Multi-Asset Implied Volatility Market Simulator using Functional Principal Components and Neural SDEs (March 3, 2023)
Open Questions
- Standardization for quality of generated data is required
- Data-driven (ML) vs. parametric model comparison (parametric models are also improving now*)
- Simulators have not been rigorously evaluated on downstream tasks: portfolio optimization, derivatives hedging, algorithmic trading
- No good simulators for intra-day data
*Francois, Pascal and Galarneau-Vincent, Rémi and Gauthier, Genevieve and Godin, Frédéric, Joint Dynamics For The Underlying Asset and Its Implied Volatility Surface: A New Methodology For Option Risk Management (January 7, 2023). Available at SSRN: https://ssrn.com/abstract=4319972
Deep Hedging
What is hedging of options?
- Traders are required to protect portfolios against unfavourable market movements
- Sensitivity to market movements is measured using the "Greeks"
- Delta: first derivative of option value wrt asset price (sensitivity to small moves of the underlying)
- Gamma: second derivative of option value wrt asset price (sensitivity to a wider range of moves)
- Vega: first derivative of option value wrt asset volatility
- In complete markets with zero frictions and continuous hedging we have a toolbox in mathematical finance to find the OPTIMAL hedge.
REALITY: markets are incomplete, there are frictions (costs, price impact, illiquidity) and hedging is discrete
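For a concrete feel of the Greeks, the sketch below evaluates delta, gamma and vega of a European call in closed form. It assumes the Black-Scholes model with no dividends, which is the frictionless, continuous-hedging setting described above. The function name and inputs are illustrative.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_greeks(spot, strike, maturity, rate, vol):
    """Black-Scholes delta, gamma and vega of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    delta = norm_cdf(d1)                          # first derivative wrt asset price
    gamma = norm_pdf(d1) / (spot * vol * sqrt_t)  # second derivative wrt asset price
    vega = spot * norm_pdf(d1) * sqrt_t           # first derivative wrt volatility
    return delta, gamma, vega

delta, gamma, vega = bs_call_greeks(spot=100.0, strike=100.0, maturity=1.0, rate=0.0, vol=0.2)
print(f"delta={delta:.4f} gamma={gamma:.4f} vega={vega:.4f}")
```

A trader who has sold this call would hold roughly `delta` units of the underlying to neutralize small price moves. Neutralizing gamma or vega requires trading another option, since the underlying itself has zero gamma and vega.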
What do traders do then?
- Partial hedging
- Based on heuristics, intuition and experience
- Calls for automation
BUT
- Searching the space of partial hedge policies is intractable
- Stochastic environment
- Decisions made on a daily basis or more often
- Every decision affects future decisions
- Hedging horizons can be long (30/60/90/120 days)
- Sequential decision making in stochastic environments?


Standard RL objective:
"maximize expected future return"
State: market variables + current position
Action: position to take in hedging instrument
Reward: Profit & Loss (P&L) minus transaction costs
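A one-step reward of this form might look like the sketch below. It assumes the trader is short one option and hedges with the underlying asset, with a proportional transaction cost. The function name, argument names and `cost_rate` are hypothetical and chosen for illustration only.

```python
def step_reward(old_position, new_position, price_now, price_next,
                option_now, option_next, cost_rate=0.01):
    """Reward for one hedging step = P&L minus transaction costs.

    Assumes a short position in one option, hedged by holding
    `new_position` units of the underlying over the step.
    """
    # Cost of rebalancing the hedge, proportional to the traded value.
    trade_cost = cost_rate * abs(new_position - old_position) * price_now
    # Gain on the hedge minus the change in value of the option we are short.
    pnl = new_position * (price_next - price_now) - (option_next - option_now)
    return pnl - trade_cost

# Made-up numbers: the hedge gain offsets the option loss, leaving only the cost.
print(step_reward(0.0, 0.5, 100.0, 102.0, 5.0, 6.0))
```

Summing these rewards over the hedging horizon gives the return that a standard RL agent would maximize in expectation.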
Does this objective work in a finance setting?
NEVER!

Trading desks care about
- Mean-Variance
- Value-at-Risk (VaR)
- Conditional VaR (CVaR)
Standard RL formulation does not work here
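To see why the expected value alone is not enough, the sketch below computes empirical versions of the three criteria on two made-up P&L samples with the same mean. The estimators are simple sample-based versions, and `risk_aversion` is a hypothetical parameter.

```python
import math

def value_at_risk(pnl, level=0.95):
    """Empirical VaR: loss at the given quantile level (losses are positive)."""
    losses = sorted(-x for x in pnl)
    index = min(len(losses) - 1, math.ceil(level * len(losses)) - 1)
    return losses[index]

def conditional_var(pnl, level=0.95):
    """Empirical CVaR: average loss at or beyond the VaR threshold."""
    threshold = value_at_risk(pnl, level)
    tail = [-x for x in pnl if -x >= threshold]
    return sum(tail) / len(tail)

def mean_variance(pnl, risk_aversion=1.0):
    """Mean P&L penalized by its variance."""
    mean = sum(pnl) / len(pnl)
    variance = sum((x - mean) ** 2 for x in pnl) / len(pnl)
    return mean - risk_aversion * variance

# Two illustrative strategies with identical expected P&L.
pnl_a = [1.0] * 100                 # steady small gain
pnl_b = [2.0] * 90 + [-8.0] * 10    # bigger gain, occasional large loss

for name, pnl in (("A", pnl_a), ("B", pnl_b)):
    print(name, sum(pnl) / len(pnl), mean_variance(pnl),
          value_at_risk(pnl), conditional_var(pnl))
```

Both strategies have a mean P&L of 1, so an agent that maximizes expected return is indifferent between them. Mean-variance, VaR and CVaR all flag strategy B as the riskier one.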
Distributional Reinforcement Learning

If you can estimate the future return distribution, you can take actions that optimize the risk-return tradeoff
Distributions and actions (policy) approximated via NNs
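The idea can be illustrated without any neural network. In the sketch below, each action comes with a handful of estimated return quantiles (in practice these would be produced by a learned critic), and the agent ranks actions either by the mean or by a tail-risk score. The action names, quantile values and `tail_fraction` are invented for illustration.

```python
def mean_from_quantiles(quantiles):
    """Risk-neutral score: average of equally weighted quantile estimates."""
    return sum(quantiles) / len(quantiles)

def cvar_from_quantiles(quantiles, tail_fraction=0.2):
    """Risk-aware score: average of the worst tail_fraction of the quantiles."""
    ordered = sorted(quantiles)
    k = max(1, int(round(tail_fraction * len(ordered))))
    return sum(ordered[:k]) / k

def pick_action(quantiles_by_action, score):
    """Choose the action whose return distribution has the highest score."""
    return max(quantiles_by_action, key=lambda a: score(quantiles_by_action[a]))

# Hypothetical return quantiles (levels 0.1, 0.3, 0.5, 0.7, 0.9) for two actions.
estimates = {
    "under_hedge": [-10.0, -1.0, 2.0, 4.0, 8.0],
    "full_hedge": [-1.5, -0.5, 0.0, 0.5, 1.0],
}

print("risk-neutral choice:", pick_action(estimates, mean_from_quantiles))
print("CVaR-based choice:  ", pick_action(estimates, cvar_from_quantiles))
```

With these numbers the mean favours "under_hedge" while the tail-based score favours "full_hedge". Having the whole distribution lets the same agent be tuned to either objective.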
D4PG-QR

Cao, Jay and Chen, Jacky and Farghadani, Soroush and Hull, John C. and Poulos, Zissis and Wang, Zeyu and Yuan, Jun,
Gamma and Vega Hedging Using Deep Distributional Reinforcement Learning, Frontiers in Artificial Intelligence, Sec. Artificial Intelligence in Finance, Volume 6 (2023)
Outperforming the basic automated strategies


Open Questions
- Robustness to model misspecification?
- Realistic synthetic data for training (market simulators can probably help)
- Adaptive agents that model their own uncertainty
- Dynamic risk measures and time-consistent policies (an active field of research)
ML and Soft Information
Hard vs. Soft Information in Finance
Hard Information
- Price action (returns, volatility)
- Company earnings
- Company 10-K filings
- Analyst Forecasts
- "Earnings Surprise"
Easy to quantify and externally verify
Soft Information
- Company press release
- Management communication
- Earnings calls
- News and social media
- Etc.
Qualitative
Literature agrees: earnings surprise only explains 10% of variance in asset prices
What else drives price formation and how??
From static dictionaries to ML
Loughran & McDonald (2011)
"When is a liability not a liability?"
The LM Lexicon
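A dictionary-based tone score is easy to sketch. The example below counts the share of words that appear in a negative word list and shows how a general-purpose list can misread financial text. Both word lists are tiny toy sets invented for this example. They are not the actual LM lexicon or any published dictionary.

```python
import re

# Toy word lists for illustration only (NOT the real LM or general dictionaries).
GENERAL_NEGATIVE = {"liability", "tax", "cost", "loss", "decline"}
FINANCE_NEGATIVE = {"loss", "decline", "impairment", "litigation"}

def negative_tone(text, lexicon):
    """Share of words in the text that appear in the negative lexicon."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    return sum(word in lexicon for word in words) / len(words)

sentence = "The tax liability and cost of capital were unchanged"
print("general list:", round(negative_tone(sentence, GENERAL_NEGATIVE), 3))
print("finance list:", round(negative_tone(sentence, FINANCE_NEGATIVE), 3))
```

The sentence is neutral accounting language, yet the general list scores a third of its words as negative. That mismatch is the motivation for a finance-specific lexicon, and later for learned models.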

Garcia and Rohrer (2022)
"The colour of finance words"
Multinomial Inverse Regression
From basic ML to advanced NLP
...and beyond sentiment, to semantics



FinBERT
https://huggingface.co/ProsusAI/finbert
Things we achieved with such models
Sentiment
- Explains short-term market reactions
- Does not explain macro variables well
- Depends on source
- Sentiment of analysts matters more than sentiment of managers!
Semantics
- Stronger than sentiment
- Information Retrieval models quantify what analysts say
Uncertainty ("hedging" in linguistics)
- Explains sector variables better than sentiment: volatility, housing market etc.
Economics go deeper
- Framework of "disagreement of opinions"
- When hard and soft disagree in polarity: markets under-react, prices drift, and uncertainty takes time to be resolved
- When hard and soft agree: overreaction
To be released soon

Interesting open questions
- "Do analysts do their jobs?"
- "What is the role of central bank communication in reducing market uncertainty?"
- "Does sentiment explain changes in bond credit spread?"
- "Are markets efficient at resolving disagreement?"
Contact: zissis.poulos@rotman.utoronto.ca
AI Collusion
The Problem: Emergent Collusion from Learning Agents
- Learning agents (RL) experiment, observe profits, and adapt.
- In repeated competitive markets, they can converge to high-price strategies.
- No communication, no intent: just learning dynamics drifting into tacit collusion


[Plots: Market Price; Profits]
- Antitrust assumes collusion requires communication or intent: AI breaks that assumption.
- Pricing algorithms can sustain high-price equilibria that are economically indistinguishable from cartel behavior.
- Undercutting gets punished automatically, because agents learn to retaliate.
- Regulators have no clear legal framework for algorithmic tacit collusion.

?
- Repeated interactions → agents learn the consequences of price moves.
- Low exploration → they settle into stable high-price patterns.
- Price Trigger: Deviations (one agent lowering price) trigger learned punishment.
- Market-level signals (prices, profits, demand) act as implicit communication channels.
Δ - cartel "score": 1 means perfect cartel, 0 means competitive Nash
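One common way to build a score with these endpoints is to normalize average profit between the competitive (Nash) and fully collusive (monopoly) benchmarks. The slides do not give the exact formula, so the linear normalization below is an assumption, and the function and argument names are illustrative.

```python
def collusion_index(avg_profit, nash_profit, monopoly_profit):
    """Delta = (avg - Nash) / (monopoly - Nash): 0 at Nash, 1 at perfect cartel."""
    if monopoly_profit <= nash_profit:
        raise ValueError("monopoly profit must exceed Nash profit")
    return (avg_profit - nash_profit) / (monopoly_profit - nash_profit)

# Made-up profit levels, only to show the scale.
print(collusion_index(avg_profit=1.0, nash_profit=1.0, monopoly_profit=2.0))
print(collusion_index(avg_profit=1.5, nash_profit=1.0, monopoly_profit=2.0))
print(collusion_index(avg_profit=2.0, nash_profit=1.0, monopoly_profit=2.0))
```

Values strictly between 0 and 1 indicate partial collusion, with agents earning more than the competitive benchmark but less than a perfect cartel.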

[Two plots; axis label: Noise level (retail traders)]
Open Questions
1. Is over-pruning bias the real culprit?
- Does collusion disappear if we remove function-approximation pathologies?
- Or do agents still gravitate toward high-price equilibria even with “perfect” value estimates?
2. What if agents are risk-averse, not pure profit-maximizers?
- Does CVaR/RL risk aversion push agents toward competition or toward stabilizing collusion?
- How do volatility, uncertainty, and tail-risk perception reshape strategies?
3. Can mitigation be embedded directly into the RL agent?
- Reward shaping, exploration policies, fairness penalties.
- Can we design “competition-preserving” RL?
4. How much observability is enough to trigger collusion?
- Price only? Profits? Competitor actions?
- What is the minimal information channel required for tacit coordination?
5. Does algorithmic homogeneity amplify collusion?
- If every firm uses similar RL models, architectures, and data pipelines, is collusion inevitable?
- Does diversity of algorithms restore competitive behavior?
ESC301 Guest Lecture
By zpoulos
This is the slide deck that I use for a quick introduction to the Decentralized Finance class.