Probabilistic Voting Model
Lecture 3, Political Economics I
OSIPP, Osaka University
27 October, 2017
Masa Kudamatsu
Did you come up with
a research question
for your term paper?
And is it
original, interesting, and feasible?
For your inspiration #1
Here's the list of policies my undergraduate students have chosen
for their term papers on why these policies aren't adopted
For your inspiration #2
Taro Kono's blog (on his early days as a legislator)
For your inspiration #3
Political science books on why DPJ government failed
For your inspiration #4
Questions raised by the last general elections in Japan
Should the Prime Minister be allowed to dissolve the parliament?
Do majoritarian elections really help opposition parties?
Why doesn't Japan introduce full proportional representation like in continental Europe?
For your inspiration #5
Instead of voting for one candidate / party,
vote by policy issue
His survey of 75,000 Dutch citizens shows
the correlation between their policy preferences
and the policies proposed by the party they vote for
is only 10-20%
Motivations for today
Last week we saw how the citizen-candidate model
relaxes the commitment assumption
The Median Voter Theorem relies on restrictive assumptions
This week we'll see how the probabilistic voting model
relaxes the assumption on voters' preference
Single-peaked / Single-crossing for one-dimensional policy
Intermediate preference for multi-dimensional policy
Example: Distribution of Local Public Goods
Demography
\(N\) citizens live in several districts, each indexed by \(J\)
Population of district \(J\): \(N^J\)
(\(\sum_J N^J = N\))
Endowment
Every citizen earns exogenous income \(y\)
(for simplicity)
Example: Distribution of Local Public Goods
Preference of each citizen in district \(J\)
depends on private consumption and, through \(H(\cdot)\), on local public goods per capita,
where \(H(\cdot)\) is increasing and concave, with \(H(0)=0\)
Government
collects lump sum tax \(\tau\) from every citizen
provides per capita local public goods to each district, \(g^J\)
subject to the government budget constraint
Example: Distribution of Local Public Goods (cont.)
District \(J\) citizens' preference over policies
Their ideal policy is given by
No "median" policy exists: no equilibrium in Downsian model
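For reference, the setup can be written out as below. This is a reconstruction, assuming the slides follow Section 7.4 of Persson and Tabellini (2000); the symbol \(c^J\) for private consumption is introduced here for illustration only.

\[ w^J = c^J + H(g^J), \qquad c^J = y - \tau, \qquad N\tau = \sum_I N^I g^I \]

Substituting the budget constraint gives district \(J\) citizens' preference over policies:

\[ W^J(\mathbf{g}) = y - \sum_I \frac{N^I}{N}\, g^I + H(g^J) \]

Under these assumptions the ideal policy of district \(J\) is

\[ g^I = 0 \ \text{for all } I \neq J, \qquad H'(g^J) = \frac{N^J}{N} \]

Every district wants all spending for itself, so the ideal policies cannot be ordered along a single dimension.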
Example: Distribution of Local Public Goods (cont.)
Probabilistic voting model does have an equilibrium
as long as citizens' policy preference is continuous and concave
District \(J\) citizens' preference over policies
For the purpose of understanding distortion by politics,
derive the first-best policy:
Sum of utilities
Resource constraint
Example: Distribution of Local Public Goods (cont.)
First-best given by:
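A sketch of the first-best problem, assuming utility is private consumption \(c^J\) plus \(H(g^J)\) and the planner weighs every citizen equally (the symbol \(c^J\) is illustrative, not from the slides):

\[ \max_{\{c^J, g^J\}} \ \sum_J N^J \left[ c^J + H(g^J) \right] \quad \text{s.t.} \quad \sum_J N^J \left( c^J + g^J \right) = N y \]

Substituting the resource constraint, the objective becomes \(N y + \sum_J N^J \left[ H(g^J) - g^J \right]\), so the first-order condition for each district is

\[ H'(g^J) = 1 \quad \text{for all } J \]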
Basic Model
We follow Sections 3.4 & 7.4 of Persson and Tabellini (2000)
Originally proposed by Lindbeck and Weibull (1987)
Players
\(N\) citizens, each indexed by \(i\)
Each belongs to a group indexed by \(J\)
Denote the population share of group \(J\) by \(\alpha^J\)
Two candidates, \(A, B\)
Candidates' preference and action
Before the election, each announces the policy vector \(\mathbf{q}_A, \mathbf{q}_B\)
Note: we no longer need to stick to one-dimensional policy
We assume they can commit to this policy
Each candidate's payoff is given by:
\(R\) if elected (ego-rent)
\(0\) otherwise (normalization)
So we assume office-seeking politicians
A way to get round the commitment assumption
Assume there are two political parties, A and B
Each party selects a candidate based on their ideal policy
Remaining issue:
Does the party stick to its candidate after winning the election?
Yes in the U.S.
No in Japan: the ruling party often replaces its leader (i.e. the Prime Minister) between elections
Citizens' preference
Citizens care about (1) policies and (2) who wins
where \(P\) denotes the winning candidate
Note:
All citizens in the same group have the same preference on policies
\(W^J(\cdot)\) is a continuous and concave function
\(W^J(\cdot)\) doesn't have to be single-peaked, single-crossing, or intermediate
Citizens' preference (cont.)
Citizens care about (1) policies and (2) who wins
where \(D_B\) is 1 if B wins and 0 otherwise
\(\sigma^{iJ}\): individual ideological bias (in favour of B)
\(\delta\): population-wide popularity for candidate B
These two parameters are unknown to candidates
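A sketch of the payoff these slides describe, assuming the additive form used in Section 3.4 of Persson and Tabellini (2000). Citizen \(i\) of group \(J\) receives

\[ W^J(\mathbf{q}_P) + \left( \sigma^{iJ} + \delta \right) D_B \]

where \(\mathbf{q}_P\) is the winning candidate's policy. A positive \(\sigma^{iJ} + \delta\) means the citizen likes B winning for reasons unrelated to policy.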
Information
Candidates face uncertainty about citizens' preference
Known
Unknown
But candidates know the distribution of the unknown parameters
Information (cont.)
To allow explicit solutions...
Assume the following distribution for the population-wide shock \(\delta\):
uniformly distributed over a bounded interval
[Figure: density of \(\delta\)]
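Information (cont.)
To allow explicit solutions...
Assume the following distribution for individual bias for B, \(\sigma^{iJ}\):
uniformly distributed over a bounded interval
[Figure: density of \(\sigma^{iJ}\); positive values are biased for B, negative values are biased for A]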
Information (cont.)
[Figure: density of \(\sigma^{iJ}\) for higher \(\phi^J\)]
\(\phi^J\) measures how homogeneous group \(J\) is
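The intervals are not recoverable from the slide text. A common specification, assumed here from Persson and Tabellini (2000), is

\[ \sigma^{iJ} \sim U\!\left[ -\frac{1}{2\phi^J},\ \frac{1}{2\phi^J} \right], \qquad \delta \sim U\!\left[ -\frac{1}{2\psi},\ \frac{1}{2\psi} \right] \]

so that the densities are \(\phi^J\) and \(\psi\). The symbol \(\psi\) does not appear elsewhere in these notes and is an assumption. A higher \(\phi^J\) means a narrower interval, i.e. group \(J\) members are ideologically more alike.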
Information (cont.)
These distributions can be generalized
as long as they are unimodal and symmetric around the mean
e.g. Normal distribution
Timing of Events
1. Candidates A and B simultaneously announce their policy platform
2. Nature picks individual ideological bias \(\sigma^{iJ}\) and aggregate popularity shock \(\delta\) for B
3. Citizens decide which candidate to vote for
4. The winning candidate implements the announced policy
Analysis
Backward induction
1. Candidates A and B simultaneously announce their policy platform
2. Popularity shocks for B, both individual (\(\sigma^{iJ}\)) and aggregate (\(\delta\)), realize
3. Citizens decide which candidate to vote for
4. The winning candidate implements the announced policy
Citizens' optimization
Citizen i of group J votes for A if
or
[Figure: density of \(\sigma^{iJ}\); citizens with bias below the threshold vote for A, those above it vote for B]
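A sketch of the voting rule, assuming citizens receive \(W^J(\mathbf{q}_P) + (\sigma^{iJ} + \delta) D_B\). Citizen \(i\) of group \(J\) votes for A if

\[ W^J(\mathbf{q}_A) > W^J(\mathbf{q}_B) + \sigma^{iJ} + \delta \]

or, equivalently,

\[ \sigma^{iJ} < W^J(\mathbf{q}_A) - W^J(\mathbf{q}_B) - \delta \]

The right-hand side is the threshold that splits each group into A voters and B voters.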
Citizens' optimization (cont.)
Backward induction
Candidate A's optimization
where the winning probability is given by
A's vote share
How does \(\pi_A\) depend on \(\mathbf{q}_A\)?
[Figure: density of \(\sigma^{iJ}\); citizens with bias below the threshold vote for A, those above it vote for B]
Thus A's vote share among group J citizens is given by
Candidate A's vote share among group J citizens
Candidate A's vote share among all citizens
(Remember \(\alpha^J\) denotes group J's population share)
where
Unlike in the Downsian model,
here the vote share is a continuous function of policies
thanks to stochastic \(\sigma^{iJ}\)
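The continuity claim can be checked numerically. A minimal sketch, assuming \(\sigma^{iJ}\) is uniform on \([-1/(2\phi^J), 1/(2\phi^J)]\) and a citizen votes for A when \(\sigma^{iJ} < W^J(\mathbf{q}_A) - W^J(\mathbf{q}_B) - \delta\); function names and parameter values are illustrative, not from the lecture.

```python
import random

def analytic_share(phi, dw, delta):
    """Group vote share for A: 1/2 + phi * (dw - delta), clipped to [0, 1]."""
    return min(1.0, max(0.0, 0.5 + phi * (dw - delta)))

def simulated_share(phi, dw, delta, n=200_000, seed=0):
    """Draw individual biases and count the citizens who prefer A."""
    rng = random.Random(seed)
    half_width = 1.0 / (2.0 * phi)
    votes_for_a = sum(
        rng.uniform(-half_width, half_width) < dw - delta for _ in range(n)
    )
    return votes_for_a / n

phi, delta = 2.0, 0.05          # hypothetical group density and aggregate shock
for dw in (-0.1, 0.0, 0.1):     # utility gap W^J(q_A) - W^J(q_B)
    print(dw, analytic_share(phi, dw, delta),
          round(simulated_share(phi, dw, delta), 3))
```

The simulated share moves smoothly with the utility gap and tracks the linear formula, whereas in the Downsian model a small policy change can flip a whole group's vote at once.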
Candidate A's vote share among all citizens
where
But candidates maximize the winning probability, not the vote share
If \(\delta\) is known, \(\Pr(\pi_A\geq 1/2)\) is not continuous w.r.t. \(\mathbf{q}_A\)
Candidate A's vote share among all citizens
where
Random variable when candidates choose their policy
Make the winning probability a continuous function of policies
Candidate A's winning probability
Rearranging terms...
Now we want to know the probability that \(\pi_A\geq1/2\)
Candidate A's winning probability
Rearranging terms...
That is, the probability that these terms are larger than zero
Candidate A's winning probability
That is, the probability that this inequality holds
Candidate A wins if
or
or
where
(average homogeneity)
Candidate A's winning probability (cont.)
Candidate A's winning probability
Candidate A wins if
Now since \(\delta\)
is uniformly distributed within a bounded interval,
we can obtain the probability that A wins
[Figure: density of \(\delta\), with the region where A wins marked]
Candidate A's winning probability (cont.)
Note: The winning probability is a continuous function of the policy
cf. This is not the case for the Downsian model
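The derivation can be sketched as follows, assuming \(\sigma^{iJ} \sim U[-1/(2\phi^J), 1/(2\phi^J)]\) and \(\delta \sim U[-1/(2\psi), 1/(2\psi)]\) as in Persson and Tabellini (2000); \(\psi\) and \(p_A\) are notation assumed here. A's vote share is

\[ \pi_A = \sum_J \alpha^J \phi^J \left[ W^J(\mathbf{q}_A) - W^J(\mathbf{q}_B) - \delta \right] + \frac{1}{2} \]

so A wins (\(\pi_A \geq 1/2\)) if

\[ \delta \leq \frac{1}{\phi} \sum_J \alpha^J \phi^J \left[ W^J(\mathbf{q}_A) - W^J(\mathbf{q}_B) \right], \qquad \phi \equiv \sum_J \alpha^J \phi^J \]

Since \(\delta\) is uniform with density \(\psi\), and provided the threshold lies inside its support,

\[ p_A = \frac{1}{2} + \frac{\psi}{\phi} \sum_J \alpha^J \phi^J \left[ W^J(\mathbf{q}_A) - W^J(\mathbf{q}_B) \right] \]

Candidate B's winning probability is \(1 - p_A\). Both are linear in the \(W^J\) terms, hence continuous in the policies.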
Candidate A's winning probability
Candidate B's winning probability
Both candidates' objective is
concave in their own action (i.e. policy)
continuous in the other candidate's action
A Nash equilibrium exists
(see e.g. Section 1.3.3 of Fudenberg and Tirole 1991)
Both candidates solve the same maximization problem
In the equilibrium, both candidates propose the same policy
Candidate A's winning probability
Candidate B's winning probability
[Figure: density of \(\sigma^{iJ}\); citizens with bias below the threshold vote for A, those above it vote for B]
In the equilibrium...
Implications
Implications from Probabilistic Voting Model
1. Equilibrium policies maximize the weighted sum of citizens' payoffs
2. Politicians target swing voters, not loyal voters
3. Median voters are not necessarily decisive
Equilibrium policy maximizes a weighted sum of citizen payoffs
More populous or more homogeneous groups
are treated better
Macroeconomists often use this result to endogenize policies
e.g. Song et al. (2012 Econometrica) on government debt
Implication 1
Each candidate maximizes the winning probability given by
Group J's weight: \(\alpha^J \phi^J\)
Implication 2: Swing Voter Hypothesis
Swing voters: those not loyal to any candidate/party (無党派層)
In the model, those groups with high \(\phi^J\)
The model predicts
politicians mostly please those groups with high \(\phi^J\)
We now illustrate the swing voter hypothesis
in the example of distribution of local public goods
that we saw at the beginning of this lecture
See Section 7.4 of Persson and Tabellini (2000) for detail
District \(J\) citizens' preference over policies
Example: Distribution of Local Public Goods (revisited)
First-best policies:
Distribution of Local Public Goods
Candidate's problem
where
That is,
where
Distribution of Local Public Goods (cont.)
Candidate's problem
Distribution of Local Public Goods (cont.)
FOC w.r.t. \(g^J_A\)
Candidate's problem
Or
Or
Distribution of Local Public Goods (cont.)
FOC w.r.t. \(g^J_A\)
Switch to vote for A
For district \(J\)
The gain of voters in district \(J\)
Distribution of Local Public Goods (cont.)
FOC w.r.t. \(g^J_A\)
Switch to vote for B
For districts \(I\neq J\)
The loss of voters in districts \(I\neq J\)
Distribution of Local Public Goods (cont.)
FOC w.r.t. \(g^J_A\)
Or
Or
Distribution of Local Public Goods
FOC w.r.t. \(g^J_A\)
Groups with \(\phi^J\) higher than the average (\(\sum_I\alpha^I\phi^I\)),
i.e. politically more responsive voter groups,
receive more than the socially optimal level
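A numerical illustration, assuming the candidate maximizes \(\sum_J \alpha^J \phi^J W^J\) with \(W^J = y - \sum_I \alpha^I g^I + H(g^J)\), which gives the first-order condition \(H'(g^J) = \sum_I \alpha^I \phi^I / \phi^J\). The functional form \(H(g) = 2\sqrt{g}\) and all parameter values are hypothetical choices made for this sketch.

```python
def equilibrium_g(alpha, phi):
    """Equilibrium g^J when H(g) = 2*sqrt(g), so H'(g) = g**-0.5.

    The first-order condition H'(g^J) = phi_bar / phi^J then gives
    g^J = (phi^J / phi_bar)**2.
    """
    phi_bar = sum(a * p for a, p in zip(alpha, phi))  # average density
    return [(p / phi_bar) ** 2 for p in phi]

alpha = [0.3, 0.4, 0.3]   # hypothetical population shares
phi = [0.5, 1.0, 2.0]     # hypothetical densities; higher = more swing voters
first_best = 1.0          # H'(g) = 1 gives g = 1 for every district

for share, density, g in zip(alpha, phi, equilibrium_g(alpha, phi)):
    label = "above first best" if g > first_best else "at or below first best"
    print(share, density, round(g, 3), label)
```

Only the district whose \(\phi^J\) exceeds the average ends up above the first-best level in this example; the others are under-provided.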
Swing Voter Hypothesis
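More generally speaking, politicians target voter groups like
[Figure: distribution of a voter group's ideological bias, axis marked at 0]
instead of targeting voter groups like
[Figure: distribution of another voter group's ideological bias, axis marked at 0]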
See Proposition 1 of Casey (2015) for the formal argument on this
We'll see evidence for this hypothesis shortly
Implication 3:
Median voters are not necessarily decisive
See Section 3.4 of Persson and Tabellini (2000) for detail
We illustrate this with a policy example (size of government)
from Lecture 1
Size of government
Demography
Three groups of citizens \(J \in \{P, M, R\}\)
Population is normalized to be 1
Each group's population share (\(\alpha^J\)) is less than 1/2
Endowment
Group \(J\) citizens earn an exogenous income \(y^J\) with \(y^P < y^M < y^R\)
Preference of citizens in group \(J\)
Size of government
Government budget constraint
Citizen's preference over policy
Each group's ideal policy is given by
Size of government
Median voter's ideal policy
will be the equilibrium in the Downsian model
Figure 3.2 of Persson and Tabellini (2000)
Size of government
Candidate's problem
where
That is,
where
Candidate's problem (cont.)
FOC w.r.t. \(g_A\)
Size of government
FOC w.r.t. \(g_A\)
This term gets closer to \(y^M\)
if \(\phi^M\) is higher than the average
Otherwise,
the equilibrium policy can be far away from the median voter's ideal
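To see how group weights move the outcome, here is a small sketch. It assumes, following Section 3.4 of Persson and Tabellini (2000), that the relevant term in the first-order condition is the \(\alpha^J\phi^J\)-weighted average income; the function name and all numbers are hypothetical.

```python
def weighted_income(alpha, phi, income):
    """phi-weighted average income: sum(alpha*phi*y) / sum(alpha*phi)."""
    total_weight = sum(a * p for a, p in zip(alpha, phi))
    return sum(a * p * y for a, p, y in zip(alpha, phi, income)) / total_weight

alpha = [0.4, 0.35, 0.25]     # hypothetical shares of groups P, M, R
income = [1.0, 2.0, 4.0]      # hypothetical incomes with y^P < y^M < y^R
y_median = income[1]

equal_phi = weighted_income(alpha, [1.0, 1.0, 1.0], income)
swing_middle = weighted_income(alpha, [1.0, 5.0, 1.0], income)
swing_rich = weighted_income(alpha, [1.0, 1.0, 5.0], income)

for name, value in [("equal phi", equal_phi),
                    ("high phi^M", swing_middle),
                    ("high phi^R", swing_rich)]:
    print(name, round(value, 3), "distance to y^M:", round(abs(value - y_median), 3))
```

With a high \(\phi^M\) the weighted income sits close to the median income; with a high \(\phi^R\) it is pulled towards the rich group instead.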
Implication 3:
Median voters are not necessarily decisive
Evidence
Empirical challenge
How can we measure "swing voters"?
The literature often uses
winning margin in the previous election
But it's endogenous to distributive policies
The incumbent may have won narrowly because voters complained about the lack of distribution
(see Larcinese et al. 2013 for a survey and criticism)
(In Lecture 4, we'll see such a model of politics)
And distributive policies are persistent over time
Evidence from Sierra Leone
Casey (2015) overcomes this difficulty
by exploiting ethnic allegiance to political parties in Sierra Leone
Image source: www.bbc.com/news/world-africa-14094194
Ethnic groups in Sierra Leone
Various ethnic groups live in different parts of the country
Political parties in Sierra Leone
SLPP
APC
Two parties have dominated politics since independence in 1961
Measuring ethnic allegiance to parties
Calculate each ethnic group's bias towards APC by:
(% of those who voted for APC) - (% of those who voted for SLPP)
Based on nation-wide voting data (for 2007 presidential election)
(not the district-level, which is a response to district-targeting policy)
Data on voting
Exit polls on local council election day in 2008
1117 voters in 59 randomly selected jurisdictions
Household surveys in 2008
6300 citizens in 634 census enumeration areas
(a nationally representative sample)
In both surveys, respondents report which party they voted for
in 2007 presidential election
Take the average vote share from both surveys
[Table 1 of Casey (2015): ethnicity, population share (%), and bias to APC, with groups marked as loyal to APC, loyal to SLPP, or swing voters]
Ethnicity predicts which party citizens vote for
Since ethnicity cannot be changed,
Ethnicity can be used as
a measure of voters' ideology
that's NOT a response to policy!
Measuring swing voter districts
Population share of ethnic group e in jurisdiction j (based on 2004 census)
Ethnic group e's bias towards APC
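A minimal sketch of how such a jurisdiction-level measure can be computed, assuming it is the population-weighted sum of group biases and that a small absolute value marks a swing jurisdiction. Group names, shares, and bias values below are made up for illustration and are not Casey's data.

```python
def jurisdiction_bias(shares, group_bias):
    """Population-weighted sum of ethnic-group biases for one jurisdiction."""
    return sum(shares[e] * group_bias[e] for e in shares)

# Hypothetical group-level bias towards APC (APC vote share minus SLPP's).
group_bias = {"group_a": 0.8, "group_b": -0.7, "group_c": 0.1}

# Hypothetical ethnic composition of two jurisdictions (shares sum to 1).
jurisdictions = {
    "homogeneous": {"group_a": 0.9, "group_b": 0.05, "group_c": 0.05},
    "mixed": {"group_a": 0.45, "group_b": 0.5, "group_c": 0.05},
}

for name, shares in jurisdictions.items():
    bias = jurisdiction_bias(shares, group_bias)
    print(name, round(bias, 3), round(abs(bias), 3))  # small |bias| = swing
```

The mixed jurisdiction ends up with a bias near zero because loyalties on both sides cancel out, which is what makes it a swing jurisdiction under this measure.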
Appendix Figure 1 of Casey (2015)
Light-coloured areas: swing districts
Empirical strategy
Estimate the following equation by OLS:
Party i's policy for jurisdiction j in district d
Bias to either of the parties
Vector of jurisdiction j's characteristics
District fixed effect
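The labels above suggest an estimating equation of the following form. This is a sketch: \(|\alpha_j|\) and \(\beta_1\) follow the notation used in the linearity check, while \(y_{ijd}\), \(\beta_0\), \(\boldsymbol{\gamma}\), \(\eta_d\), and \(\varepsilon_{ijd}\) are symbols assumed here.

\[ y_{ijd} = \beta_0 + \beta_1 |\alpha_j| + \mathbf{X}_j' \boldsymbol{\gamma} + \eta_d + \varepsilon_{ijd} \]

Here \(y_{ijd}\) is party \(i\)'s policy for jurisdiction \(j\) in district \(d\), \(|\alpha_j|\) is the bias to either of the parties, \(\mathbf{X}_j\) is the vector of jurisdiction characteristics, and \(\eta_d\) is the district fixed effect. The swing voter hypothesis predicts \(\beta_1 < 0\): more biased jurisdictions receive less.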
Measure of party policy #1
Electoral campaign spending during the national elections
in 2007
Measure of government policy #2
Local Government Development Grants (LGDG) during 2004-2007
Fiscal transfer from central to local governments
Spent on roads, agriculture, etc.
Check linearity assumption
Divide the distribution of \(|\alpha_j|\) into 35 groups with equal frequencies
Let \(D_j^k\) indicate jurisdictions falling into the \(k\)th group
Estimate the following regression by OLS
Then plot the estimated \(\beta_1^k\)'s
Source: Figure 1 of Casey (2015)
Swing districts attract campaign spending
Source: Table 2 of Casey (2015)
Swing districts attract various campaign efforts
Similar results for fiscal transfer
If the bias goes down from maximum to minimum
$19,575
(8,757)
reduction in transfer
Source: page 2430 of Casey (2015)
Source: Table 2 of Casey (2015)
Inference on multiple outcomes
It's time for econometrics...
Inference on multiple outcomes (cont.)
For example, if you have 20 outcomes and no true effect on any of them,
the estimated treatment effect
is still likely to be statistically significant at the 5% level
for at least 1 outcome by chance
Statistical significance for each outcome is misleading
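The size of the problem is easy to compute. A minimal sketch, assuming the tests are independent and every null hypothesis is true (real outcomes are usually correlated, so this is only a benchmark):

```python
def prob_any_false_positive(n_outcomes, level=0.05):
    """P(at least one rejection) under the null with independent tests."""
    return 1.0 - (1.0 - level) ** n_outcomes

for n in (1, 5, 20):
    print(n, round(prob_any_false_positive(n), 3))
```

With 20 outcomes the chance of at least one spurious "significant" result is roughly 64%, far above the nominal 5%.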
Inference on multiple outcomes (cont.)
The current standard practice is what Kling et al. (2007) proposed
1. Transform each outcome into a z-score
2. Take a simple unweighted average across related outcomes
3. Estimate the treatment effect on this average z-score
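The first two steps can be sketched as follows. The choice of reference group used for standardizing (here the first three rows) is an assumption of this example, and the data are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(values, reference):
    """Standardize values using the mean and s.d. of a reference group."""
    mu, sd = mean(reference), pstdev(reference)
    return [(v - mu) / sd for v in values]

def summary_index(outcomes, reference_rows):
    """Unweighted average of z-scores across outcomes, one value per unit."""
    columns = []
    for values in outcomes.values():
        reference = [values[i] for i in reference_rows]
        columns.append(z_scores(values, reference))
    return [mean(unit) for unit in zip(*columns)]

# Hypothetical data: three campaign outcomes for four jurisdictions.
outcomes = {
    "rallies": [1.0, 2.0, 3.0, 6.0],
    "posters": [10.0, 20.0, 30.0, 40.0],
    "visits": [0.0, 1.0, 0.0, 3.0],
}
index = summary_index(outcomes, reference_rows=[0, 1, 2])
print([round(x, 3) for x in index])
```

The third step is then an ordinary regression with `index` as the single dependent variable, so only one hypothesis test is needed for the whole family of outcomes.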
Source: Table 2 of Casey (2015)
Inference on multiple outcomes
Interpretation
Moving from maximum (0) to minimum (1) swing-ness
Campaign efforts go up by 0.89 standard deviations
Applications
Application #1: Incorporate candidate quality
In the basic model, remember the condition under which citizen i of group J votes for A
Casey (2015) interprets \(\delta\) as the relative quality of candidate B
Its variance gets smaller if citizens are more informed of candidates
Then targeting such citizens is more effective to gain votes
Application #2: Voter intimidation
In the local public good distribution example...
Swing voters (i.e. high \(\phi^J\)) are "expensive to buy"
Robinson and Torvik (2009) argue...
In countries like Zimbabwe, where incumbent politicians
consume government revenue and
can intimidate voters from voting,
it may pay for incumbents to intimidate swing voters
so they can consume more
Other applications for empirical research
Electoral campaign across states by US presidential candidates
Candidate selection by parties across districts in Italy
Impact of radio on New Deal policy in the 1930s US
Other applications for theoretical research
Cause of democratization
Dynamic voting in a macroeconomic model
One more application:
Impact of electoral rules
on the composition of government spending
See also Chapter 8 of Persson and Tabellini (2000)
Model
Players
Three groups of voters
Each group has a continuum of voters with unit mass
(So the total population size is 3)
Each lives in its own district
Two political parties
Citizens' preference
Member i of group J
Transfer to district J
Public
good
if candidate A wins
if candidate B wins
Citizens' preference (cont.)
Population-wide popularity shock for B
uniformly distributed between
Individual ideology for member i of group J
uniformly distributed between
with
and
Graphically...
Policies
Transfer to district J (e.g. roads, schools, hospitals)
Public good (e.g. national defence, social protection)
which satisfy the government budget constraint
Exogenous govt revenue
Political party's preference & action
Each party P maximizes
by committing to a policy vector
Timing of Events
1. Parties A and B simultaneously announce their policy platform
2. Nature picks both individual (\(\sigma^{iJ}\)) and aggregate (\(\delta\)) popularity shocks for B
3. Citizens decide which party to vote for
4. Electoral rule determines each party's seat share in legislature
5. The party with the majority of seats forms the government to implement the announced policy
Electoral rules
Single-district election
(proportional representation)
Multi-district election
(majoritarian)
with sufficiently small \(\bar{\sigma}^1\) and sufficiently large \(\bar{\sigma}^3\)
Denote A's vote share among group J voters by
Party A wins the majority of seats if
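A sketch of the two winning conditions, assuming the setup of Chapter 8 of Persson and Tabellini (2000); the symbol \(\pi_A^J\) for A's vote share among group \(J\) voters is notation assumed here.

Single-district election: seats follow the nationwide vote share, so A wins the majority of seats if

\[ \frac{1}{3} \sum_{J=1}^{3} \pi_A^J \geq \frac{1}{2} \]

Multi-district election: A needs two of the three districts. With sufficiently small \(\bar{\sigma}^1\), district 1 is safe for A, and with sufficiently large \(\bar{\sigma}^3\), district 3 is safe for B, so A wins the majority of seats if

\[ \pi_A^2 \geq \frac{1}{2} \]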
This is an example of how we can model political institutions
Focus on some features of political institutions
Model them as the rule of the game / the game structure
In this course, we will see more examples:
Term Limit (Lecture 4)
Presidential vs Parliamentary systems (Lecture 5)
Equilibrium
for Single-district Election
District transfer
Targeting district 2 brings more votes
Public good
Vote gains by \(\Delta g\)
Public good
Vote loss by \(\Delta g\)
Public good in the equilibrium
Equilibrium
for Multiple-district Election
Only district 2 can be swung
District transfer
For districts 1 and 3, the electoral result does not depend on f
Public good
Vote gain by \(\Delta g\)
Public good
Vote loss by \(\Delta g\)
Public good in the equilibrium
Comparison of
the two electoral rules
Public good provision
Multi-district majoritarian elections
Single-district (proportional representation) elections
Equilibrium policy
smaller in multi-district elections
Other theoretical models
(Lizzeri and Persico 2001, Milesi-Ferretti et al. 2002)
derive similar predictions:
Composition of government expenditure is tilted to
group-specific transfers under majoritarian elections
public goods / universal welfare under proportional representation
Impact of electoral rules
on the size of government spending
Austen-Smith (2000) and Milesi-Ferretti et al. (2002) predict
Majoritarian elections lead to a smaller government
Evidence
Causal evidence?
Hard to prove causality running from electoral rules to policies
Electoral rules rarely change
Thus impossible to separate the impact of electoral rules
from that of country characteristics
Causal evidence?
We can only check
if correlation is consistent with the theoretical prediction
See also Persson and Tabellini (2003) and Acemoglu (2005)
Run cross-country regressions of fiscal policies on electoral rule
Per capita GDP
Trade openness
Population
% of those aged 16-54
% of those aged over 65
Years of being a democracy
Quality of democracy (Freedom House Index)
Dummy for presidential system
Dummy for federal states
Dummy for OECD countries
Dummies for continents
Dummies for legal origins
Electoral rules around the world in the 1990s
[Map legend: majoritarian; proportional representation; not democratic (excluded from the sample)]
Unconditional mean comparison
Fiscal policy (as % of GDP) | Central govt expenditure | Social protection |
Majoritarian | 25.6% (8.2) | 4.7% (5.4) |
Proportional representation | 30.8% (11.3) | 10.1% (6.6) |
p-value for two-sample t-test | 0.03 | 0.00 |
Source: Table 1 of Persson and Tabellini (2004)
Note: Standard deviation in parentheses
OLS estimation results
Dep. Var. (as % of GDP) | Central govt expenditure | Central govt revenue | Government deficit | Social protection |
Majoritarian elections | -6.32*** (2.11) | -3.68* (2.15) | -3.15*** (0.87) | -2.25* (1.25) |
# observations | 80 | 76 | 72 | 69 |
Source: Tables 2 and 4 of Persson and Tabellini (2004)
Countries with majoritarian elections
have a smaller size of government
spend less on social protection (pension, unemployment benefits, child allowance, etc.)
Other impacts of electoral rules
Political selection
Elected politicians are more competent
under a single multi-member district
than under multiple single-member districts
Political Economics lecture 3: Probabilistic Voting Model
By Masayuki Kudamatsu