Costly Belief Elicitation
Brandon Williams
Alistair Wilson
NHH: Incorporating Psychology into Economic Models
August 12, 2025
Motivation
- Experimental economists often incentivize belief elicitation. Why?
- We hope that incentives yield better, more accurate beliefs, because respondents must:
- Exert effort to understand what is asked
- Overcome personal motives to distort
- Carry out burdensome calculations
- If belief elicitation is an effortful exercise, how do we best increase the precision of the expressed belief?
- We want to understand what incentives produce honest, deliberative beliefs
[Figure: belief scale from 0 to 100, with 20 and 80 marked]
- We need to understand the psychological costs within a testing environment
- The scale and structure of the marginal costs become important
- Draw a distinction between revealing a true belief and the effort required to generate a deliberative belief
Roadmap
- Create a task that mirrors forming a probabilistic belief, requires effort, and responds to incentives
- Use experiments on Prolific to understand the relationship between cost, effort, and precision
- Vary the cost of precision and calibrate on completion time and willingness to accept
- Understand how hard this task is to guess (zero effort)
- Test different incentive structures
- Apply back to a traditional lab experiment
Task
- Create a task that mirrors forming a probabilistic belief that requires effort

What is the proportion of blue tokens to total tokens in this urn?
81 blue, 63 non-blue
81 / (81 + 63) = 56.25% (true amount)
Calibration
- Vary the cost of precision and calibrate on completion time and willingness to accept

How do we get you to exert effort when formulating your belief?
We start by paying $0.50 if you exactly count:
- Number of blue tokens
- Number of total tokens
Measure accuracy, and time taken as a proxy for effort
Vary the difficulty over 5 tasks
- Small with gaps
- Larger with no gaps
- Larger with gaps
- Largest with no gaps
Calibration: WTP
- Ten rounds: an easy task, or the hard task for base pay plus $X (Oprea, 2020)
- LHS: constant difficulty; always pays $0.50 if correct
- RHS: varying difficulty; pays $X if correct
- Choose $X
Calibration: Results (n=250)
Model over (tokens, gaps)

Initial Guesses
- Understand how hard this problem is to guess (initial guess treatment)
- Participants have 15 or 45 seconds to form and enter a guess on the proportion
- High-powered rewards:
- $2.50 if within 1%
- $1.00 if within 5%
- $0.50 if within 10%
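
The reward tiers above can be written as a small payment function. This is only a sketch: it assumes "within 1%" means an absolute error of at most one percentage point, that the bounds are inclusive, and that only the highest applicable tier is paid. The name `tiered_payment` is hypothetical and does not come from the experiment's software.

```python
def tiered_payment(guess_pct: float, true_pct: float) -> float:
    """Return the bonus in dollars for a guess, with both values in percent.

    Assumptions (not stated on the slides): errors are measured in
    percentage points, bounds are inclusive, and tiers do not stack.
    """
    error = abs(guess_pct - true_pct)
    # Check the tightest tier first so the highest applicable prize is paid.
    if error <= 1.0:
        return 2.50
    if error <= 5.0:
        return 1.00
    if error <= 10.0:
        return 0.50
    return 0.0


if __name__ == "__main__":
    true_pct = 56.25  # the urn example: 81 blue out of 144 tokens
    for guess in (56.0, 60.0, 65.0, 80.0):
        print(guess, tiered_payment(guess, true_pct))
```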
Initial Guesses: Results (N=200)
[Figure: share of guesses that are Exact, Within 1%, Within 5%, and Within 10%, for the 15-second and 45-second conditions; the 15-second panel carries the annotations 4.8% and 95.1%]
Taking Stock
So we have an experimental task where:
- Output responds to effort and incentives, and we can scale the difficulty
- We understand the effort required to succeed and the amount we need to pay people
- We can quantify the output at low effort
Incentives
- Vary the reward structure
- BSR-Desc: $1.50 prize with only a text description of the payoff structure (Vespa & Wilson, 2018)
- BSR-Inf: as above, but with quantitative information
- BSR-NoInf: participants know only that there is a $1.50 prize, with no other information on the incentives
- A "close enough" incentive: $1.50 if within 1%; $0.50 if within 5%

How should effort affect expected reward?
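
One way to make this question concrete is to compare how the expected reward falls as the reported proportion drifts from the truth. The sketch below rests on assumptions the slides do not confirm: the BSR is taken to be a binarized quadratic scoring rule on the event "a randomly drawn token is blue", and "within 1%" is read as within one percentage point. The function names are hypothetical.

```python
PRIZE = 1.50  # dollars, as stated on the slide


def bsr_expected_reward(report_pct: float, truth_pct: float, prize: float = PRIZE) -> float:
    """Expected payoff under an ASSUMED binarized quadratic scoring rule.

    The scored event is "a randomly drawn token is blue" (an assumption).
    """
    q = report_pct / 100.0
    p = truth_pct / 100.0
    # Expected quadratic loss; algebraically equal to p * (1 - p) + (q - p) ** 2.
    expected_loss = p * (1 - q) ** 2 + (1 - p) * q ** 2
    # The prize is won with probability 1 - expected_loss.
    return prize * (1 - expected_loss)


def close_enough_reward(report_pct: float, truth_pct: float) -> float:
    """Payoff under the "close enough" rule: $1.50 within 1 point, $0.50 within 5."""
    error = abs(report_pct - truth_pct)
    if error <= 1.0:
        return 1.50
    if error <= 5.0:
        return 0.50
    return 0.0


if __name__ == "__main__":
    truth = 56.25  # the urn example: 81 blue out of 144 tokens
    for error in (0.0, 1.0, 5.0, 10.0, 20.0):
        report = truth + error
        print(f"error={error:5.1f}  BSR={bsr_expected_reward(report, truth):.4f}  "
              f"close-enough={close_enough_reward(report, truth):.2f}")
```

Under the assumed rule, the expected BSR payoff falls only by the prize times the squared error, so a 5-point error costs well under one cent in expectation. The "close enough" payoff, by contrast, drops from $1.50 to $0.50 as soon as the error passes one point.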
Results
Incentives: Accuracy
Use our calibration to construct instruments for difficulty
[Figure: accuracy results by incentive treatment, with the annotations +56%, +47%, -15%, and +16%]
Incentives: Research costs
- "Close enough" outperformed BSR on both accuracy and time spent
- Also cheaper than BSR in payments to participants (~50%)
- With a fixed budget, how much more effort could be induced?
- Interesting caveat: these gains are dwarfed by the gains from explicitly telling participants what to do
Related task

Consider a task similar to what we commonly do in the lab:
- Among tokens with a dot, what proportion are blue?
The calculation is the same as in Bayesian updating experiments
...except they could count?
Tell them:
- Total number of tokens
- Proportion of blue tokens
- Proportion dots | red
- Proportion dots | blue
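
The four quantities listed above are enough to compute the answer by Bayes' rule. A minimal sketch follows, assuming every token is either blue or red. The function names and the example numbers are hypothetical and chosen only for illustration. Exact fractions are used so the frequency version and the probability version can be compared without rounding.

```python
from fractions import Fraction


def posterior_blue_given_dot(p_blue, p_dot_given_blue, p_dot_given_red):
    """Return P(blue | dot) by Bayes' rule, assuming every token is blue or red."""
    p_red = 1 - p_blue
    blue_and_dot = p_blue * p_dot_given_blue
    red_and_dot = p_red * p_dot_given_red
    any_dot = blue_and_dot + red_and_dot
    if any_dot == 0:
        raise ValueError("no token has a dot, so the posterior is undefined")
    return blue_and_dot / any_dot


def dotted_counts(total_tokens, p_blue, p_dot_given_blue, p_dot_given_red):
    """Frequency version: numbers of dotted blue and dotted red tokens."""
    blue = total_tokens * p_blue
    red = total_tokens - blue
    return blue * p_dot_given_blue, red * p_dot_given_red


if __name__ == "__main__":
    # Hypothetical inputs: 144 tokens, 9/16 blue, dots on 1/3 of blue and 2/3 of red.
    args = (Fraction(9, 16), Fraction(1, 3), Fraction(2, 3))
    print(posterior_blue_given_dot(*args))
    print(dotted_counts(144, *args))
```

The frequency version makes the link to counting explicit: the posterior is the number of dotted blue tokens divided by the number of all dotted tokens.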
Bayesian updating

Can they do the Bayesian task?
Conclusions and future work
- The "close enough" incentive works well for inducing effort but requires binomial/cardinal realizations
- Varying the incentives:
- No substantive differences across BSR presentations
- Offering incentives and letting participants choose effort is dominated by the authority of telling them what to do
- For a Bayesian updating task:
- 50% of Prolific subjects can perform the calculation if given the frequentist version
- They need to be given ~$0.55 to offset costs
Can we apply this calibration to subjective beliefs?
Thank you!
Alistair Wilson
Brandon Williams
alistair@pitt.edu
brandon.williams@pitt.edu
Literature
Some examples of recent papers in belief elicitation:
Testing incentive compatibility:
- Danz, Vesterlund, and Wilson, 2022
- Healy and Kagel, 2023
"Close enough" payments:
- Enke, Graeber, Oprea, and Young, 2024
- Ba, Bohren, and Imas, 2024
- Settele, 2022
QSR or BSR:
- Hoffman and Burks, 2020
- Radzevick and Moore, 2010
- Harrison et al., 2022
Others (exact or quartile):
- Huffman, Raymond, and Shvets, 2022
- Bullock, Gerber, Hill, and Huber, 2015
- Prior, Sood, and Khanna, 2015
- Peterson and Iyengar, 2020
Incentives
- Vary the reward structure
- BSR with only qualitative information: text description of payoff structure (Vespa & Wilson, 2018)
- BSR with quantitative information: full information on the quantitative incentives (Danz et al., 2022)
- A "close enough" incentive: $1.50 if within 1%; $0.50 if within 5%; currently used in several papers (e.g. Ba et al., 2024)