Costly Belief Elicitation

Brandon Williams

Alistair Wilson

NHH: Incorporating Psychology into Economic Models

August 12, 2025

  • Experimental economists often offer incentives when eliciting beliefs. Why?
  • We hope that incentives lead to better, more accurate beliefs, because participants must:
    • Exert effort to understand what is being asked
    • Overcome personal motives to distort
    • Carry out burdensome calculations
  • So, if belief elicitation is an effortful exercise, how do we best increase the precision of the expressed belief?

 

Motivation

  • We want to understand what incentives produce honest, deliberative beliefs

 

    • We need to understand the psychological costs within a testing environment
    • The scale and structure of the marginal costs become important
  • We draw a distinction between revealing a true belief and exerting the effort required to generate a deliberative belief

 

Motivation

  • Create a task that mirrors forming a probabilistic belief that requires effort and responds to incentives
  • Use experiments on Prolific to understand the relationship between cost, effort, and precision
    • Vary the cost of precision and calibrate it using completion time and willingness to accept
    • Understand how hard this task is to guess (zero effort)
    • Test different incentive structures
  • Apply the findings back to a traditional lab experiment

 

Roadmap

  • Create a task that mirrors forming a probabilistic belief that requires effort

 


Task

What proportion of the tokens in this urn are blue?

81 blue, 63 non-blue

True proportion = 81 / (81 + 63) = 56.25%
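As a quick check of the arithmetic on this slide, the true proportion follows directly from the two counts. This is only a minimal sketch; the function name `blue_share` is illustrative and not taken from the study.

```python
def blue_share(blue: int, non_blue: int) -> float:
    """Return the percentage of tokens in the urn that are blue."""
    total = blue + non_blue
    if total <= 0:
        raise ValueError("urn must contain at least one token")
    return 100.0 * blue / total


# Counts from the slide: 81 blue and 63 non-blue tokens (144 in total).
print(blue_share(81, 63))  # 56.25
```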

  • Vary the cost of precision and calibrate it using completion time and willingness to accept

Calibration

How do we get you to exert effort when forming your belief?

We start by paying $0.50 if you count exactly:

  1. Number of blue tokens
  2. Number of total tokens

Measure accuracy and time taken as a proxy for effort

Vary the difficulty over 5 tasks

[Figures: example urns at increasing difficulty]

  • Small with gaps
  • Larger with no gaps
  • Larger with gaps
  • Largest with no gaps

  • Ten rounds choosing between an easy task and the hard task, which pays base pay plus $X (Oprea, 2020)

Calibration: WTP

  • LHS: constant difficulty; always pays $0.50 if correct
  • RHS: varying difficulty; pays $X if correct
  • Choose $X

Calibration: Results (n=250)

Model over (tokens, gaps)


Initial Guesses

  • Understand how hard this problem is to guess (initial guess treatment)
    • Participants have 15 or 45 seconds to form and enter a guess of the proportion
    • High-powered rewards:
      • $2.50 if within 1%
      • $1.00 if within 5%
      • $0.50 if within 10%
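The tiered reward above can be written as a small payment function. This is a minimal sketch under two assumptions not stated on the slide: "within X%" is read as an absolute distance in percentage points, and the boundaries are inclusive. The names `INITIAL_GUESS_TIERS` and `tiered_payment` are illustrative.

```python
# Tiers from the slide: (tolerance in percentage points, payment in dollars).
# Assumption: tolerances are absolute percentage-point distances, inclusive.
INITIAL_GUESS_TIERS = [(1.0, 2.50), (5.0, 1.00), (10.0, 0.50)]


def tiered_payment(guess_pct, true_pct, tiers=INITIAL_GUESS_TIERS):
    """Pay the reward of the tightest tier that the guess falls within."""
    error = abs(guess_pct - true_pct)
    for tolerance, payment in sorted(tiers):
        if error <= tolerance:
            return payment
    return 0.0


# With the example urn (true proportion 56.25%):
print(tiered_payment(56.0, 56.25))  # tightest tier
print(tiered_payment(60.0, 56.25))  # middle tier
print(tiered_payment(70.0, 56.25))  # outside every tier
```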

Initial Guesses: Results (N=200)

[Figures: share of guesses that are within 10%, within 5%, within 1%, and exact, shown for the 15-second and 45-second conditions; callouts of 4.8% and 95.1% appear on the 15-second slides]

Taking Stock

So we have an experimental task where:

  • Output responds to effort and incentives, and we can scale the difficulty
  • We understand the effort required to succeed and the amount we need to pay people
  • We can quantify the output at low effort

 

Incentives

  • Vary the reward structure
    • BSR-Desc: $1.50 prize with only a text description of the payoff structure (Vespa & Wilson, 2018)
    • BSR-Inf: as above, but with quantitative information on the incentives
    • BSR-NoInf: participants know only that there is a $1.50 prize, with no other information on the incentives
    • A "close enough" incentive: $1.50 if within 1%; $0.50 if within 5%

Incentives

How should effort affect expected reward?
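One way to frame this question is to compare how much expected payment a participant loses for a given error under each scheme. The sketch below is illustrative only. It assumes the BSR is a binarized quadratic scoring rule in which the scored event is "a randomly drawn token is blue" and the $1.50 prize is won with probability 1 - (report - outcome)^2; the implementation used in the experiment may differ. It also reads "within X%" as percentage points. All function names are hypothetical.

```python
PRIZE = 1.50  # dollars, from the slides


def bsr_expected_payment(report, truth, prize=PRIZE):
    """Expected payment under an assumed binarized quadratic scoring rule.

    Assumption: the scored event is 'a randomly drawn token is blue', and the
    prize is won with probability 1 - (report - outcome)**2.
    """
    win_if_blue = 1.0 - (report - 1.0) ** 2
    win_if_not_blue = 1.0 - report ** 2
    return prize * (truth * win_if_blue + (1.0 - truth) * win_if_not_blue)


def close_enough_payment(report, truth):
    """$1.50 within 1 percentage point, $0.50 within 5, otherwise nothing."""
    error = abs(report - truth)
    if error <= 0.01:
        return 1.50
    if error <= 0.05:
        return 0.50
    return 0.0


truth = 0.5625  # example urn: 81 blue out of 144 tokens
for error in (0.0, 0.04, 0.10):
    report = truth + error
    print(
        f"error={error:.2f}  "
        f"BSR={bsr_expected_payment(report, truth):.4f}  "
        f"close-enough={close_enough_payment(report, truth):.2f}"
    )
```

Under these assumptions the expected BSR payment falls by the prize times the squared error, so a 10-point error costs about 1.5 cents in expectation, whereas the same error forfeits the whole payment under the "close enough" rule.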

Incentives

Results

Incentives: Accuracy

Use our calibration to construct instruments for difficulty

[Figures: accuracy results; callouts of +56%, +47%, -15%, and +16%]

Incentives: Research costs

  • "Close enough" outperformed BSR on both accuracy and time spent
  • Also cheaper than BSR in payments to participants (~50%)
  • With a fixed budget, how much more effort could be induced?
  • Interesting caveat: the gains here are dwarfed by the gains when we explicitly tell them what to do


Related task

Consider a task similar to what we might commonly do in lab

  • What proportion of the tokens with a dot are blue?

 

 

The calculation is the same as in Bayesian updating experiments

...except they could count?

Tell them:

  • Total number of tokens
  • Proportion of blue tokens
  • Proportion with dots | red
  • Proportion with dots | blue
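Given those four pieces of information, the target answer is the posterior P(blue | dot). The sketch below shows the calculation two ways: directly with Bayes' rule, and by first converting to token counts, which is one reading of a frequentist presentation (an assumption on our part). It assumes every token is either blue or red; the function names and the example numbers are hypothetical and not taken from the experiment.

```python
def posterior_blue_given_dot(p_blue, p_dot_given_blue, p_dot_given_red):
    """P(blue | dot) by Bayes' rule, assuming tokens are either blue or red."""
    blue_and_dot = p_blue * p_dot_given_blue
    red_and_dot = (1.0 - p_blue) * p_dot_given_red
    if blue_and_dot + red_and_dot == 0:
        raise ValueError("no tokens have a dot")
    return blue_and_dot / (blue_and_dot + red_and_dot)


def posterior_from_counts(total, p_blue, p_dot_given_blue, p_dot_given_red):
    """Count-based version: convert to token counts, then take a ratio."""
    blue = total * p_blue
    red = total - blue
    dotted_blue = blue * p_dot_given_blue
    dotted_red = red * p_dot_given_red
    if dotted_blue + dotted_red == 0:
        raise ValueError("no tokens have a dot")
    return dotted_blue / (dotted_blue + dotted_red)


# Hypothetical numbers for illustration only (not from the experiment).
print(posterior_blue_given_dot(0.5, 0.5, 0.25))
print(posterior_from_counts(144, 0.5, 0.5, 0.25))
```

Both functions return the same value; the count-based version simply makes explicit that the answer is "dotted blue tokens over all dotted tokens".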

Bayesian updating


Can they do the Bayesian task?


Conclusions and future work

  • The "close enough" incentive works well for inducing effort but requires binomial/cardinal realizations
  • Varying the incentives:
    • No substantive differences across BSR presentations
    • Offering incentives and letting participants choose their effort is dominated by the authority of telling them what to do
  • For a Bayesian updating task:
    • 50% of Prolific subjects can perform the calculation if given the frequentist version
    • They need to be paid ~$0.55 to offset costs

Can we apply this calibration to subjective beliefs?

Thank you!

Alistair Wilson: alistair@pitt.edu

Brandon Williams: brandon.williams@pitt.edu

  • Some examples of recent papers in belief elicitation:
    • Testing incentive compatibility:
      • Danz, Vesterlund, and Wilson, 2022
      • Healy and Kagel, 2023
    • "Close enough" payments:
      • Enke, Graeber, Oprea, and Young, 2024
      • Ba, Bohren, and Imas, 2024
      • Settele, 2022
    • QSR or BSR:
      • Hoffman and Burks, 2020
      • Radzevick and Moore, 2010
      • Harrison et al., 2022
    • Others (exact or quartile):
      • Huffman, Raymond, and Shvets, 2022
      • Bullock, Gerber, Hill, and Huber, 2015
      • Prior, Sood, and Khanna, 2015
      • Peterson and Iyengar, 2020

Literature

Incentives


  • Vary the reward structure
    • BSR with only qualitative information
      • Text description of payoff structure (Vespa & Wilson, 2018)
    • BSR with quantitative information
      • Full information on the quantitative incentives (Danz et al., 2022)
    • A "close enough" incentive
      • $1.50 if within 1%; $0.50 if within 5%
      • Currently used in several papers (e.g., Ba et al., 2024)

Incentives: accuracy