Overview
Introduce Estimation Framework
- Generalizes OLS
- Allows for Nonparametric Effects
- Inherently compositional, even under regularization
Hypothesis: The Right to Counsel assists those who are currently housed at the cost of those who are unhoused
Results: Exploiting the ongoing roll-out of the policy across Connecticut, we find little evidence to support this hypothesis, which suggests that the policy scales better than previously understood
Assess the Effects of the Right to Counsel
Right to Counsel: Provides low income households with legal representation in an eviction case
Regularizing the Forward Pass
Non-Parametric Clustering
A "global" approach to local sampling corrections
Inherently Compositional
Even under data-dependent regularization
Nests OLS & Supervised Learning
Allows for fine control
over the hypothesis space
Clear/Intuitive/Mathematical --> Inductive Bias
data N : Set where
  zero : N
  suc : N -> N
suc (suc (suc zero))
problem
- When clusters differ in their distributions, failing to distinguish between them can be problematic
- We have a toy example which highlights this
- We show how this problem can get worse in high dimensions
i.i.d. Problem
Cluster Problem
Generalizing Across Zip codes
Need bandwidth that is sensitive to the presence of clusters
Estimand:
The challenge of Higher dimensions
Extending Balestriero [2021]
i.i.d. Data
Cluster Data
Linear Map
Feature Map
Probability Space
Random Variables
Probability Model
How exactly does the LLN kick in here?
Kernel Methods
Deep Learning
Supervised
Motivation
Domingos [2020]
Riesz Representation Theorem
Kernel Methods
Deep Learning
Supervised
High Level Idea
Domingos [2020]
Cluster
introduction
Context
- 2 million eviction filings each year in the U.S.
- Gap in legal representation (90/10) in favor of landlords
Policy
Right to Counsel: Provides low income households with legal representation in an eviction case
Question
Does the Right to Counsel assist those who are housed at the cost of those who are not housed?
This paper
Estimation
- Deep Learning Estimator that "corrects" for the zip-code level assignment
Key Empirical Result
- Linear Models suggest adverse unintended consequences of the policy
- Preferred Model suggests limited to no negative effects
Setting
- Connecticut's state-wide implementation of policy
Practical concerns
(1) Why go beyond the linear model?
(2) Why allow the influence of the zip code to vary across covariates?
Linear Estimator
Locally smooth across clusters
Sampling distribution of a standardized estimate of a linear difference-in-difference model
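For reference, a minimal sketch of the linear difference-in-difference estimate in the simplest two-group, two-period case. The record layout `(treated, post, outcome)` and all numbers are made up for illustration; they are not the Connecticut data, and the deck's own specification may include additional terms.

```python
from statistics import mean

def did_estimate(rows):
    """2x2 difference-in-differences on (treated, post, outcome) records."""
    def cell_mean(treated, post):
        return mean(y for t, p, y in rows if t == treated and p == post)

    # Change over time in the treated group minus change in the control group.
    treated_change = cell_mean(1, 1) - cell_mean(1, 0)
    control_change = cell_mean(0, 1) - cell_mean(0, 0)
    return treated_change - control_change

# Illustrative (made-up) records: (treated, post, outcome)
rows = [
    (1, 0, 10.0), (1, 0, 12.0),
    (1, 1, 15.0), (1, 1, 17.0),
    (0, 0, 8.0), (0, 0, 10.0),
    (0, 1, 9.0), (0, 1, 11.0),
]
effect = did_estimate(rows)
```

In this 2x2 case the estimate coincides with the interaction coefficient of a saturated linear regression on the treated and post indicators.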
Motivation
Locally smooth across clusters
Whenever treatment is assigned across clusters, you want your estimator to be able to generalize across clusters.
Zip Code Correction
Cross Sectional
Locally smooth across clusters
Estimand
Observe
Challenge
Generalize across the unobserved zip codes
Locally smooth across clusters
Approach
Zip Code Correction
Repeated Cross Section
Locally smooth across clusters
Estimand
Observe
Challenge
Generalize across the treated zip codes
Locally smooth across clusters
Approach
Time Period
Potential Outcome
Supervised Deep Learning
- Models formed by composing parameterized functions
- Parameters updated via some form of gradient descent
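The two bullets above can be sketched in a few lines. This is only an illustration under assumed choices: the two layers (`layer1`, `layer2`), the squared-error loss, the step size, and the synthetic data are all hypothetical and not taken from the paper.

```python
import math

def layer1(a, x):
    # First parameterized function: a nonlinear feature of the input.
    return math.tanh(a * x)

def layer2(b, c, h):
    # Second parameterized function: an affine map of the feature.
    return b * h + c

def model(theta, x):
    # The model is the composition layer2 after layer1.
    a, b, c = theta
    return layer2(b, c, layer1(a, x))

def loss(theta, data):
    return sum((model(theta, x) - y) ** 2 for x, y in data) / len(data)

def grad(theta, data):
    # Gradient of the mean squared error, written out via the chain rule.
    a, b, c = theta
    ga = gb = gc = 0.0
    for x, y in data:
        h = math.tanh(a * x)
        err = 2.0 * (b * h + c - y) / len(data)
        ga += err * b * (1.0 - h * h) * x
        gb += err * h
        gc += err
    return (ga, gb, gc)

def gradient_descent(theta, data, lr=0.05, steps=2000):
    for _ in range(steps):
        g = grad(theta, data)
        theta = tuple(t - lr * gi for t, gi in zip(theta, g))
    return theta

# Synthetic data generated from the same functional form (assumption).
data = [(k / 4.0, 2.0 * math.tanh(1.5 * k / 4.0) + 0.5) for k in range(-8, 9)]
theta0 = (0.5, 1.0, 0.0)
theta_hat = gradient_descent(theta0, data)
```

In practice the gradient would come from automatic differentiation rather than a hand-written chain rule; it is spelled out here only to keep the sketch dependency-free.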
Deep Learning
Regularizing the Forward Pass
- Models formed by composing embellished parameterized functions
inductive bias of algorithm
Linear Regression Diff-in-Diff Fails
Supervised Diff-in-Diff Fails
RFP Diff-in-Diff Fails
Contexts
- State-level adoption of policy
- Well documented staggered roll-out
Motivation
Measurement
- Evictions are often informal (24% of forced moves)
- HUD Rapid Rehousing Data
- Few Barriers to Housing
- Limited Financial Support
- Standard Lease Agreement
- Seron et al. 2014 (NYC)
- Greiner, Pattanayak and Hennessy 2013, 2012 (Boston)
- Collinson et al. 2022 (Cook County & NYC)
- Cassidy and Currie 2022 (NYC)
- Abramson 2022 (San Diego)
Literature Review
- Jacot et al. [2018], Nagarajan and Kolter [2019], Wilson [2020], Belkin [2021], Zhang et al. [2021], Balestriero et al. [2021]
- Griewank and Walther [2008], Frostig et al. [2018]
- Finn et al. [2017], Kelly et al. [2020], Domingos [2020]
- simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice. -Zhang [2017]
- We discuss in further detail below how these observations rule out all of VC-dimension, Rademacher complexity, and uniform stability as possible explanations for the generalization performance of state-of-the-art neural networks.
- In contrast with classical convex empirical risk minimization, where explicit regularization is necessary to rule out trivial solutions, we found that regularization plays a rather different role in deep learning. It appears to be more of a tuning parameter that often helps improve the final test error of a model, but the absence of all regularization does not necessarily imply poor generalization error. As reported by Krizhevsky et al. (2012), ℓ2-regularization (weight decay) sometimes even helps optimization, illustrating its poorly understood nature in deep learning.
Deep Learning Theory
Generalizing Across Zip codes
Outline
Context
Eviction process
Right to Counsel implementation
Policy implementation
- 30% of Evictions & 20% of Renter Population
- Legal Representation: 80% of landlords, 10% of tenants
- Household income less than 80% of median state income (~$79,000 for a family of four)
Policy implementation
- Courts and Landlords inform tenant of the existence of Right to Counsel
MAP
Rapid Rehousing
rapid rehousing
Overview
- Housing Identification Services
- Financial Assistance for housing-related expenses
- Case management services
Features
- No Preconditions to Housing
- Financial assistance typically lasts 6 months
- Typical Lease Agreement
rapid rehousing
Financial
- Start-up/Move-in Costs
- First/Last Month Rent, Security/Utility deposit
- Time limited financial assistance after move-in
Implementation
- "Creativity is encouraged" in the design of the program
Rapid Rehousing timeline
1. Experience Homelessness
2. Enter Shelter
3. Start Rapid Rehousing Program
4. Find Housing
5. Exit Rapid Rehousing
data specifics
Homeless Management Information Systems
- Individual level data
- 3338 Households in 2019-July 2022
- Search length, Gender, Race, Age, Kids, Program Date
Feedforward Neural Net
*Confidence Bands are constructed by randomly sampling the initial weights of the neural network
Regularizing the forward pass
*Confidence Bands are constructed by randomly sampling the initial weights of the neural network
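One way such bands could be built is sketched below: refit the same small network from several random initializations and take pointwise quantiles of the predictions. The network (one hidden tanh layer), the number of seeds, the 5%/95% quantiles, and the data are all assumptions for illustration, not the settings used for the figures.

```python
import math
import random

def predict(params, x):
    u, v = params
    return sum(vj * math.tanh(uj * x) for uj, vj in zip(u, v))

def train(data, seed, hidden=3, lr=0.05, steps=200):
    # Initial weights are drawn at random; the seed is the only source of variation.
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    v = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
    n = len(data)
    for _ in range(steps):
        gu = [0.0] * hidden
        gv = [0.0] * hidden
        for x, y in data:
            h = [math.tanh(uj * x) for uj in u]
            err = 2.0 * (sum(vj * hj for vj, hj in zip(v, h)) - y) / n
            for j in range(hidden):
                gv[j] += err * h[j]
                gu[j] += err * v[j] * (1.0 - h[j] ** 2) * x
        u = [uj - lr * g for uj, g in zip(u, gu)]
        v = [vj - lr * g for vj, g in zip(v, gv)]
    return u, v

def init_band(data, grid, seeds=range(20), lower=0.05, upper=0.95):
    # Pointwise empirical quantiles of the predictions across initializations.
    fits = [train(data, s) for s in seeds]
    band = []
    for x in grid:
        preds = sorted(predict(p, x) for p in fits)
        lo = preds[int(lower * (len(preds) - 1))]
        hi = preds[int(upper * (len(preds) - 1))]
        band.append((x, lo, hi))
    return band

# Made-up data for illustration.
data = [(i / 4.0, math.sin(i / 4.0)) for i in range(-4, 5)]
band = init_band(data, grid=[-1.0, 0.0, 1.0])
```

Bands of this kind reflect variation due to initialization only, which is a different source of uncertainty from the resampling used for the difference-in-difference figures.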
Difference-in-Difference
*Confidence Bands are constructed via stratified sampling without replacement (75% sampling rate)
Difference-in-Difference With Controls
*Confidence Bands are constructed via stratified sampling without replacement (75% sampling rate)
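A minimal sketch of stratified sampling without replacement at a 75% rate, followed by a band taken from the resulting estimates. The choice of strata, the number of draws, the quantile levels, and the toy estimator are assumptions; the slides do not specify them.

```python
import random
from collections import defaultdict
from statistics import mean

def stratified_subsample(rows, stratum_of, rate=0.75, rng=random):
    """Draw `rate` of each stratum without replacement."""
    strata = defaultdict(list)
    for row in rows:
        strata[stratum_of(row)].append(row)
    sample = []
    for members in strata.values():
        k = max(1, round(rate * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

def subsample_band(rows, stratum_of, estimator, draws=200, rate=0.75,
                   lower=0.025, upper=0.975, seed=0):
    """Re-estimate on repeated subsamples and return empirical quantiles."""
    rng = random.Random(seed)
    estimates = sorted(
        estimator(stratified_subsample(rows, stratum_of, rate, rng))
        for _ in range(draws)
    )
    lo = estimates[int(lower * (draws - 1))]
    hi = estimates[int(upper * (draws - 1))]
    return lo, hi

# Made-up records: (stratum label, outcome)
rows = [(g, float(i)) for g in ("a", "b") for i in range(8)]

def stratum_of(row):
    return row[0]

def mean_outcome(sample):
    return mean(y for _, y in sample)

one_draw = stratified_subsample(rows, stratum_of, rng=random.Random(1))
lo, hi = subsample_band(rows, stratum_of, mean_outcome)
```

Sampling within strata keeps every stratum represented in each draw, which matters when some treatment-by-period cells are small.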
methodology
- Easier to Implement
- Better adhere to Potential Outcome Framework
- Assess the effects of the policy at Scale
- Observe subset of clusters
- Covariates can differ across clusters
- Assess the effects of the policy at Scale
Cluster Randomized control trials
Treatment assigned at a level above the unit of interest
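A small sketch of what assignment above the unit of interest looks like in code: treatment is drawn once per cluster, and every unit inherits the status of its cluster. The cluster labels and the 50% treated share are placeholders, not the actual roll-out.

```python
import random

def assign_by_cluster(unit_clusters, treated_share=0.5, seed=0):
    """Randomize treatment over clusters, then map it down to units."""
    rng = random.Random(seed)
    clusters = sorted(set(unit_clusters))
    n_treated = round(treated_share * len(clusters))
    treated = set(rng.sample(clusters, n_treated))
    return [1 if c in treated else 0 for c in unit_clusters]

# Placeholder cluster labels, one entry per household (hypothetical).
unit_clusters = ["zip_A", "zip_A", "zip_B", "zip_B", "zip_B", "zip_C", "zip_D"]
treatment = assign_by_cluster(unit_clusters)
```

Because units within a cluster share the same treatment status, the effective number of independent assignments is the number of clusters, not the number of households.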
Motivation
Tragic Triad
Framework
Prediction
Training
Function
Composition
Regularizing the Forward Pass
estimation framework
ODE
Regularized ODE
Kelly [2020]
Double Machine Learning
- Lasso & Partially Linear Models
Regularization will be problematic
Difficult to fit with partially linear neural network
Double Machine Learning
Same Data Set as Above!
- Every Model has some form of regularization
- Clear Inductive Bias
Extending Domingos [2020]
Implicit Function:
Applying (k) iterations of gradient descent to cluster (c)
Regularization Term:
Ensure updates happen in the right space
Regularized Version of Model Agnostic Meta-Learning
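A rough sketch of the inner step described above: starting from shared parameters, apply k iterations of gradient descent on the data of a single cluster c to obtain a cluster-specific model. The scalar linear model, the step size, and the data are made up. The quadratic `penalty` is only an assumed stand-in for the regularization term, whose exact form is not given on the slide.

```python
def grad_mse(w, data):
    # Gradient of the mean squared error for the scalar model y = w * x.
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)

def adapt(w_shared, cluster_data, k=5, lr=0.1):
    """Apply k gradient-descent steps on one cluster, starting from shared weights."""
    w = w_shared
    for _ in range(k):
        w = w - lr * grad_mse(w, cluster_data)
    return w

def penalty(w_shared, w_cluster, strength=1.0):
    # Assumed stand-in: discourage cluster models from drifting far from the shared one.
    return strength * (w_cluster - w_shared) ** 2

# Made-up clusters with different slopes.
clusters = {
    "c1": [(1.0, 2.0), (2.0, 4.0)],  # slope 2
    "c2": [(1.0, 3.0), (2.0, 6.0)],  # slope 3
}
w_shared = 2.5
w_by_cluster = {c: adapt(w_shared, d) for c, d in clusters.items()}
penalties = {c: penalty(w_shared, w) for c, w in w_by_cluster.items()}
```

The sketch covers only the inner adaptation. A full Model Agnostic Meta-Learning loop would also update the shared weights by differentiating through these k steps.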
Model
Cluster Specific Model
Regularizing the Forward Pass
Key Idea
Compose functions
Compose "Embellished" Functions
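One possible reading of the key idea, sketched under assumptions: a plain layer maps a value to a value, while an "embellished" layer maps a value to a pair (value, penalty), and composition threads the value through while accumulating the penalties. The layer definitions and the activation penalty below are hypothetical and chosen only to make the contrast concrete.

```python
def compose(*fs):
    """Plain composition, applied left to right: compose(f, g)(x) == g(f(x))."""
    def composed(x):
        for f in fs:
            x = f(x)
        return x
    return composed

def compose_embellished(*fs):
    """Each f maps x -> (value, penalty); values are chained, penalties are summed."""
    def composed(x):
        total = 0.0
        for f in fs:
            x, p = f(x)
            total += p
        return x, total
    return composed

def scale(w):
    return lambda x: w * x

def scale_with_penalty(w, strength=0.1):
    # Hypothetical embellishment: a penalty on the layer's output,
    # so the regularization depends on the data flowing through the layer.
    return lambda x: (w * x, strength * (w * x) ** 2)

plain = compose(scale(2.0), scale(3.0))
embellished = compose_embellished(scale_with_penalty(2.0), scale_with_penalty(3.0))

value = plain(1.0)
value_with_penalty, total_penalty = embellished(1.0)
```

The composed embellished function has the same shape as its parts (input to value and penalty), so regularized layers can be stacked without special handling.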
Wrong Space
Overfits in the Tails
Reasonable
motivation
Cross Sectional
Repeated Cross Section
Locally smooth across clusters
Set of Clusters
Treated Clusters
Locally smooth across clusters
Results
Notation
Paper about Scale
| Hardware | Compile Time | Compiled Run Time | Ratio (Compile / Run) |
|---|---|---|---|
| CPU | 1.7547 | 0.5288 | 3.3185 |
| GPU | 2.6512 | 0.0009 | 2806.6818 |
Compute
Composing Models
Policies
Does the Right to Counsel shift the costs of housing to those who are currently without housing?