Did Trading Bots Resurrect the CAPM?

Andreas Park and Jinhua Wang


               
SAFE Microstructure Conference 2020
August 18, 2020

 

Equity market structure and institutions have changed fundamentally over the last 20 years

Big question for Microstructure

  • more marketplaces

  • new rules

  • trading technology advances

  • new players

Summary of implications

  • "higher market quality"

  • more liquidity (shares to trade)

  • lower transaction costs

  • higher price efficiency

both long-term and at the margin for rule changes (e.g., automated quotes)

\(\Rightarrow\) Does any of this translate into asset returns?

Big question for Microstructure

changes in fundamental risk

changes in institutions that produce prices

Rosu, Sojli, Tham (JFQA 2020)

Trading is now dominated by bots

  • operate autonomously

  • show/have no feelings

  • can implement complex strategies across many markets and assets

\(\Rightarrow\) are bots better/faster/stronger at "aligning" prices?

Bot trading and price "alignment"

  • common in microstructure: prices incorporate "new" information
    • contained in order flow or external news 
    • well-studied \(\Rightarrow\) not our focus
  • asset pricing:
    • prices align across assets because of systematic risk
    • \(\Rightarrow\) bots good at figuring out these relationships
  • estimate a high-frequency market model
    • alignment \(=\) % of return variation explained by market returns

\(R_{it}=\alpha_i+\beta_iR_{mt}+\epsilon_{it} ~~\Rightarrow~~~\mathbb{R}^2\)

% of stock return variation explained by market returns (\(\mathbb{R}^2\))

Back to bots

Back to bots - our thesis: they cause the increase

correlation of monthly series: .56

The paper has four parts

Introduce a "new" (not quite, but not done before) measure to capture impact of bots

Document the substantial change in the measure across time

Develop a machine learning tool to further underline the causal relationship ("instrumental causal random forests")

Show with traditional means that the shift in the measure was caused by bots

(which explains its biblical length, sorry)

Data, sample, variable construction

  • TAQ monthly data 2003-2014
  • all stocks (not ETFs)
  • market = ETF IWM \(\Rightarrow\) Russell 2000 (also used SPY for S&P 500)
  • CRSP for MCAP, daily returns etc
  • COMPUSTAT for VIX
  • (in progress: Ravenpack for "external" news)
  • WRDS Intraday Indicators for #quotes, #transactions, $-volume, market fragmentation (we use 1/x), spreads, etc

Data, sample, variable construction

  • variable of interest: adjusted \(\mathbb{R}^2\) of the per-security regression for each security \(i\)

\(R_{it}=\alpha_i+\beta_iR_{mt}+\epsilon_{it} .\)

  • run for each day \(d\) in the sample to get \({\mathbb{R}^2}_{id}\)
  • returns \(R_{it}\): NBBO midquote returns, for all stocks and for the Russell 2000 \((R_{mt})\), measured at the
    • end of 5-second intervals
    • end of 5-minute intervals
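A minimal sketch of the per-day computation, assuming one regressor and plain OLS. The function names (`adjusted_r2`, `midquote_returns`) and the synthetic data are illustrative only and are not taken from the paper's code.

```python
import random

def midquote_returns(bids, asks):
    """Simple returns of the NBBO midquote sampled at interval ends."""
    mids = [(b + a) / 2 for b, a in zip(bids, asks)]
    return [mids[k] / mids[k - 1] - 1 for k in range(1, len(mids))]

def adjusted_r2(stock_returns, market_returns):
    """Adjusted R^2 of the OLS fit R_it = alpha_i + beta_i * R_mt + eps_it."""
    n = len(stock_returns)
    if n != len(market_returns) or n < 3:
        raise ValueError("need two equally long series with at least 3 observations")
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    s_mm = sum((m - mean_m) ** 2 for m in market_returns)
    s_ss = sum((s - mean_s) ** 2 for s in stock_returns)
    s_sm = sum((s - mean_s) * (m - mean_m)
               for s, m in zip(stock_returns, market_returns))
    if s_mm == 0 or s_ss == 0:
        return float("nan")  # no variation (e.g. all-zero returns): undefined
    r2 = s_sm ** 2 / (s_mm * s_ss)  # one regressor: R^2 = squared correlation
    return 1 - (1 - r2) * (n - 1) / (n - 2)  # penalize the single slope

# Synthetic day: 78 five-minute intervals (an assumed 6.5-hour session).
rng = random.Random(0)
market = [rng.gauss(0, 0.001) for _ in range(78)]
stock = [1.2 * m + rng.gauss(0, 0.001) for m in market]
r2_id = adjusted_r2(stock, market)
```

The `nan` branch reflects one of the practical issues with high-frequency returns: a stock whose midquote never moves within the day has no return variation to explain.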

Disclaimers

  1. We do not try to run a full asset pricing estimation model ("just the first step in the two-step estimation")
  2. We used other return horizons (1-minute, 15-minute)
  3. Many issues can arise:
    • asynchronous changes of stock vs. market returns
    • many zeros, lack of (two-sided) quotes
    • price grid for stocks too coarse to reflect market movements
    • not all market movements are created equal
    • \(\ldots\) [pick your favourite] \(\Rightarrow\) this list can be continued endlessly \(\ldots\)

Still: the increase in R-squared over time is persistent and robust across all kinds of subsamples and splits of the data

e.g., by market cap quartiles

Algo trading

  • many algo/bot strategies:
    • cross-market/cross-venue arbitrage
    • market-making
    • order-flow-detection
    • execution algos (VWAPs, IS-minimizers)
  • algo/bot measures
    • #quotes
    • quotes-per-trade
    • quotes-per-$100 of volume
  • common "consequences" of bot trading (see Menkveld 2016 for a survey)
    • tighter spreads
    • more market fragmentation
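The three algo/bot measures listed above can be sketched as simple ratios. This assumes daily totals per stock as inputs; the function name `bot_proxies` and the example numbers are hypothetical.

```python
def bot_proxies(n_quotes, n_trades, dollar_volume):
    """Quote-based proxies for bot activity from one stock-day of totals."""
    nan = float("nan")
    return {
        "quotes": n_quotes,
        # quote updates per executed trade
        "quotes_per_trade": n_quotes / n_trades if n_trades else nan,
        # quote updates per $100 of traded volume
        "quotes_per_100_dollars": (
            n_quotes / (dollar_volume / 100) if dollar_volume else nan
        ),
    }

# Hypothetical stock-day: 50,000 quote updates, 2,000 trades, $4m volume.
example = bot_proxies(50_000, 2_000, 4_000_000)
```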

a quick look at correlations ...


Simple OLS regressions

So bot trading and \(\mathbb{R}^2\) are correlated, but is there a causal relationship?

\[\begin{array}{rcl} \textit{DV}_{it}&=&\beta_1\times\textit{EV}_{it}+\sum_{j=1}^3\beta_{j+1}\textit{controls}_{jit}+\delta_i+\epsilon_{it}\\\\ \textit{DV}_{it}&=&\sum_{j=1}^3\beta_j\cdot\textit{EV}_{it}\times\textit{Phase $j$}_t+\sum_{j=1}^3\beta_{j+3}\cdot\textit{controls}_{jit}+\sum_{j=1}^2\alpha_j\cdot\textit{Phase $j$}_t+\delta_i+\epsilon_{it} \end{array}\]
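A minimal sketch of what the stock fixed effect \(\delta_i\) does in the first specification. It assumes a single explanatory variable and no controls, and the name `within_slope` and the toy data are illustrative only. Demeaning each stock's observations removes \(\delta_i\) before the slope is estimated.

```python
def within_slope(stock_ids, x, y):
    """Slope of y on x after removing stock fixed effects (within estimator)."""
    groups = {}
    for sid, xi, yi in zip(stock_ids, x, y):
        groups.setdefault(sid, []).append((xi, yi))
    num = den = 0.0
    for obs in groups.values():
        mx = sum(xi for xi, _ in obs) / len(obs)
        my = sum(yi for _, yi in obs) / len(obs)
        # deviations from the stock's own mean wipe out delta_i
        num += sum((xi - mx) * (yi - my) for xi, yi in obs)
        den += sum((xi - mx) ** 2 for xi, _ in obs)
    return num / den

# Toy panel: both stocks have slope 2 but very different intercepts.
ids = ["A", "A", "A", "B", "B", "B"]
ev = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
dv = [12.0, 14.0, 16.0, -28.0, -26.0, -24.0]
beta_1 = within_slope(ids, ev, dv)  # 2.0
```

A pooled regression on the same toy data would be distorted by the difference in intercepts, which is the reason for including \(\delta_i\).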

Establishing causality

  • Need an exogenous increase/decrease in bot trading ...
  • Idea: index exclusion/inclusion
  • Why?
    • market-making bots often trade fixed sets of securities (experience from Canada: all TSX60, all TSX Composite, all cross-listed, etc.)
    • ETF market makers (= bots) trade only index securities
    • \(\Rightarrow\) index events lead to discrete changes in the set of bots that trade a stock
    • index inclusion does nothing to fundamentals
      \(\Rightarrow\) nothing should change in \(R_i\) or \(\beta_i\)
  • Why not?
    • long literature on index changes finds all kinds of funky stuff

Establishing causality

  • What do we do?
  • Find a matched security and do diff-in-diff
    • present results matched on price and market cap (like Davis & Kim JFM 2005)
    • tried a large number of other matches, with largely the same effects
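The matching and diff-in-diff steps can be sketched as follows. The distance used in `closest_match` (sum of relative differences in price and market cap) is an assumption made for illustration, and all names and numbers are hypothetical.

```python
def closest_match(target, candidates):
    """Pick the candidate closest to the target in price and market cap."""
    def distance(c):
        return (abs(c["price"] / target["price"] - 1)
                + abs(c["mcap"] / target["mcap"] - 1))
    return min(candidates, key=distance)

def mean(values):
    return sum(values) / len(values)

def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change for the index-event stock minus change for its match."""
    return ((mean(treated_post) - mean(treated_pre))
            - (mean(control_post) - mean(control_pre)))

# Hypothetical event stock and candidate controls.
event_stock = {"ticker": "EVT", "price": 20.0, "mcap": 1.0e9}
pool = [
    {"ticker": "AAA", "price": 21.0, "mcap": 1.1e9},
    {"ticker": "BBB", "price": 40.0, "mcap": 1.0e9},
    {"ticker": "CCC", "price": 19.0, "mcap": 2.0e9},
]
match = closest_match(event_stock, pool)

# Hypothetical daily R^2 values before and after the index event.
effect = diff_in_diff([0.10, 0.12], [0.18, 0.20], [0.11, 0.13], [0.13, 0.15])
```

In this toy example the event stock's average rises by 0.08 and the match's by 0.02, so the diff-in-diff estimate is 0.06.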

Does something happen?

Entry

Exit

effect strongest later in sample

Does something happen?

Entry

Exit

effect strongest later in sample

statistically not significant

Regression Analysis: Two-stage least squares; stage 1

\[\begin{array}{rcl} \textit{QA}_{it} &=& \beta_1 \textit{event}_{t} + \beta_2\textit{VIX}_t + \textit{controls}_{it} + \delta_i+\epsilon_{it} \\ \textit{DV}_{it}&=& \alpha_1 \widehat{\textit{QA}_{it}}+ \alpha_2\textit{VIX}_t + \textit{controls}_{it} +\delta_i + \epsilon_{it} \end{array}\]

what's the deal with exits?

  • exits may be different
  • trade & quote based measures can be iffy when activity is high
  • ETF market makers will disappear almost immediately
    \(\Rightarrow\) fragmentation declines
  • many funds (including ETFs) will have to trade out of positions
    \(\Rightarrow\) elevated levels of quotes and trades remain for a while

Regression Analysis: Two-stage least squares; stage 2

\[\begin{array}{rcl} \textit{QA}_{it} &=& \beta_1 \textit{event}_{t} + \beta_2\textit{VIX}_t + \textit{controls}_{it} + \delta_i+\epsilon_{it} \\ \textit{DV}_{it}&=& \alpha_1 \widehat{\textit{QA}_{it}}+ \alpha_2\textit{VIX}_t + \textit{controls}_{it} +\delta_i + \epsilon_{it} \end{array}\]
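A stripped-down sketch of the two stages on synthetic data. It assumes a single binary event instrument and omits VIX, controls, and fixed effects. The data-generating process, including the unobserved `confounder` and the true \(\alpha_1 = 0.5\), is invented for illustration.

```python
import random

def ols(x, y):
    """Intercept and slope of a one-regressor OLS fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

rng = random.Random(1)
n = 5000
event = [rng.randint(0, 1) for _ in range(n)]
confounder = [rng.gauss(0, 1) for _ in range(n)]  # drives both QA and DV
qa = [1 + 2 * z + u + rng.gauss(0, 0.5) for z, u in zip(event, confounder)]
dv = [0.5 * q + u + rng.gauss(0, 0.5) for q, u in zip(qa, confounder)]

# Stage 1: project quote activity on the event indicator.
a0, b1 = ols(event, qa)
qa_hat = [a0 + b1 * z for z in event]

# Stage 2: regress the dependent variable on fitted quote activity.
_, alpha_1 = ols(qa_hat, dv)

# Naive OLS for comparison; it picks up the confounder and is biased upward.
_, naive = ols(qa, dv)
```

Because the event indicator is independent of the confounder here, `alpha_1` lands close to 0.5 while `naive` does not. Standard errors from a manual second stage would need the usual 2SLS correction, which this sketch leaves out.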

Regression Analysis: Mediation analysis

\[\begin{array}{rcl} \textit{QA}_{it} &=& \beta_1 \textit{event}_{t} + \beta_2\textit{VIX}_t + \textit{controls}_{it} + \epsilon_{it}\\ \textit{DV}_{it}&=& \alpha_1 \textit{event}_{t}+ \alpha_2 \textit{QA}_{it}+ \alpha_3\textit{VIX}_t + \textit{controls}_{it} + \epsilon_{it} \end{array}\]

Idea: Mediation analysis allows both a direct and an indirect effect
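In the linear specification above, and assuming no interaction between the event and QA, the two channels reduce to the standard product-of-coefficients form. The labels "direct effect" and "total effect" are added here for illustration; the mediated part is the ACME:

\[\begin{array}{rcl} \textit{ACME} &=& \beta_1\cdot\alpha_2 \\ \textit{direct effect} &=& \alpha_1 \\ \textit{total effect} &=& \alpha_1+\beta_1\cdot\alpha_2 \end{array}\]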

* ACME = Average Causal Mediation Effect (the mediated effect)

Establishing causality: Causal random Forest

 

  • Linear models rely on a correctly specified functional form.

  • Linear causal-effect models can only estimate the average treatment effect at the aggregate level and ignore heterogeneity

  • Causal forests: unbiasedness without requiring a “correct” functional form
  • ML approach that
    • constructs the model with training data,
    • minimizes the error in a validation set, and
    • estimates the treatment effects on a held-out dataset.
  • Preserves treatment effect heterogeneity

  • We use a variant of causal forests called instrumental forests, which uses an instrumental variable to approximate the treatment effects
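To illustrate why heterogeneity matters, here is a sketch that computes an instrumental-variable (Wald) estimate of the treatment effect separately for two subgroups. This is not an instrumental forest: no trees are grown, and the subgroup split (`large_cap`) is fixed by hand rather than learned. The data and effect sizes (0.2 and 0.8) are synthetic.

```python
import random

def wald_late(y, w, z):
    """Wald estimator: (E[y|z=1] - E[y|z=0]) / (E[w|z=1] - E[w|z=0])."""
    def mean_where(values, flag):
        picked = [v for v, zi in zip(values, z) if zi == flag]
        return sum(picked) / len(picked)
    return ((mean_where(y, 1) - mean_where(y, 0))
            / (mean_where(w, 1) - mean_where(w, 0)))

rng = random.Random(2)
rows = []
for _ in range(8000):
    large_cap = rng.randint(0, 1)      # covariate that drives heterogeneity
    z = rng.randint(0, 1)              # instrument (e.g. an index event)
    u = rng.gauss(0, 1)                # unobserved confounder
    w = z + u + rng.gauss(0, 0.5)      # treatment (e.g. quote activity)
    tau = 0.8 if large_cap else 0.2    # true effect differs by subgroup
    y = tau * w + u + rng.gauss(0, 0.5)
    rows.append((large_cap, z, w, y))

def late_for(subset):
    return wald_late([r[3] for r in subset],
                     [r[2] for r in subset],
                     [r[1] for r in subset])

late_small = late_for([r for r in rows if r[0] == 0])
late_large = late_for([r for r in rows if r[0] == 1])
late_pooled = late_for(rows)
```

The pooled estimate falls between the two subgroup estimates and hides the difference. A forest-based method searches for such splits across many covariates instead of relying on a hand-picked one.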

Causal random Forest

 

NB: methods and tools developed here are all available on Jinhua's GitHub

Causal random Forest: Variable importance for estimating heterogeneous effects (inclusion events)

 

The numbers in the table are the percentages of splits on a particular covariate. The higher the percentage, the darker the shading of the cell, and the more important the variable is in the causal forest.
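A sketch of this importance measure, assuming it is the plain unweighted share of splits per covariate. Forest libraries may additionally weight splits by tree depth, which is not modelled here. The function name and the example split list are hypothetical.

```python
from collections import Counter

def split_shares(split_variables):
    """Percentage of all splits that use each covariate."""
    counts = Counter(split_variables)
    total = sum(counts.values())
    return {name: 100 * c / total for name, c in counts.items()}

# Hypothetical covariates chosen at four splits across a forest.
shares = split_shares(["mcap", "mcap", "spread", "vix"])
```

Here `mcap` accounts for 50% of the splits and `spread` and `vix` for 25% each.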

Causal random Forest: Local Average Treatment Effect (LATE) inclusion events

 

Summary

Introduce a "new" (not quite, but not done before) measure to capture impact of bots

Document the substantial change in the measure across time

Develop a machine learning tool to further underline the causal relationship ("instrumental causal random forests")

Show with traditional means that the shift in the measure was caused by bots

@financeUTM

andreas.park@rotman.utoronto.ca

slides.com/ap248

sites.google.com/site/parkandreas/

youtube.com/user/andreaspark2812/

Summary

identified a shift in "explainability of intra-day market model"

linked this shift to bot trading, with indications of causality from index inclusions

explainability seems a prerequisite for any "real" asset pricing model

still open questions and many avenues for possible future research

Back to bots

Hendershott, Jones, Menkveld (JF 2011)

