Shayan Doroudi
December 3, 2018
University of California, Irvine
[Diagram labels: Student Model, Instructional Policy, Activity, Response]
Millions of students
Model-Based Instructional Sequencing
Over 500,000 students/year
25 million active monthly users
~12 million active monthly users
Student
Model
Model-Based Instructional Sequencing in 1960s
Photos from suppes-corpus.stanford.edu
“We assume that a mathematical model of learning will provide an approximate description of the student's learning, and the task for a theory of instruction is then to settle the question of how the instructional sequence of concepts, skills, and facts should be organized to optimize for a given student his rate of learning.”
Suppes (1974)
The Place of Theory in Educational Research
AERA Presidential Address
“It would be my prediction that we will see increasingly sophisticated theories of instruction in the near future.”
Student
Model
Instructional Policy
We haven’t seen “increasingly sophisticated theories of instruction”
George Box, 1979
Corbett and Anderson, 1994
Student A:
Student B:
Student C:
...
Addition
Subtraction
Multiplication
Corbett and Anderson, 1994
Mastery Learning
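Mastery learning in the sense of Corbett and Anderson (1994) rests on the Bayesian Knowledge Tracing (BKT) update. The following is a minimal sketch, assuming the standard four BKT parameters. The function names, the parameter values in the usage lines, and the 0.95 mastery threshold are illustrative assumptions, not values taken from these slides.

```python
def bkt_update(p_learned, correct, p_transit, p_guess, p_slip):
    """One BKT step: condition on the response, then apply the learning transition."""
    if correct:
        numerator = p_learned * (1 - p_slip)
        denominator = numerator + (1 - p_learned) * p_guess
    else:
        numerator = p_learned * p_slip
        denominator = numerator + (1 - p_learned) * (1 - p_guess)
    posterior = numerator / denominator
    # The student may move from Not Learned to Learned after the opportunity.
    return posterior + (1 - posterior) * p_transit


def practice_until_mastery(responses, p_init, p_transit, p_guess, p_slip,
                           threshold=0.95):
    """Return (opportunities used, final P(learned)); count is None if never mastered."""
    p_learned = p_init
    for count, correct in enumerate(responses, start=1):
        p_learned = bkt_update(p_learned, correct, p_transit, p_guess, p_slip)
        if p_learned >= threshold:
            return count, p_learned
    return None, p_learned


# Hypothetical parameters: a student who answers every question correctly.
opportunities, final_belief = practice_until_mastery(
    [True] * 10, p_init=0.2, p_transit=0.2, p_guess=0.2, p_slip=0.1)
```

The policy stops giving practice as soon as the model's belief crosses the threshold, so any mismatch between the model and the real student directly changes how much practice the student receives.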
Rosen et al., 2018
Ritter et al., 2007
Corbett and Anderson, 1994
Cen, 2009
\(\theta\) - Student Ability \(\sim \mathcal{N}(0, 1)\)
\(\beta\) - Item Difficulty
\(\gamma\) - Learning Rate
P(Correct)
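The three parameters above can be combined into a logistic learning curve in the style of the Additive Factors Model (Cen, 2009). This is a sketch for a single skill. The sign convention (subtracting \(\beta\) so that a larger value means a harder item) and the function name are assumptions, not confirmed by the slides.

```python
import math


def afm_p_correct(theta, beta, gamma, opportunities):
    """P(Correct) after a number of prior practice opportunities on one skill.

    theta: student ability, beta: item difficulty (assumed to lower the logit),
    gamma: learning rate per practice opportunity.
    """
    logit = theta - beta + gamma * opportunities
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical student: average ability, a somewhat hard item, a modest learning rate.
curve = [afm_p_correct(theta=0.0, beta=1.0, gamma=0.2, opportunities=t)
         for t in range(20)]
```

Unlike BKT, this curve has no discrete learned state: P(Correct) rises smoothly with practice, which is one way the two model families can disagree about when a student has mastered a skill.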
Baker, Corbett, and Aleven, 2008:
75% of skills in a middle school mathematics tutoring system had P(guess) > 0.5 or P(slip) > 0.5.
Beck and Chang, 2007:
High guess and slip parameters are the result of BKT being unidentifiable.
Doroudi and Brunskill, 2017:
The BKT model actually is identifiable.
High guess and slip parameters could be due to fitting the wrong model.
Doroudi and Brunskill, Educational Data Mining 2017, Best Paper Nominee
[Plots: P(Correct); states Not Learned and Learned]
500 students, 20 practice opportunities: High P(slip)!
100 students, 200 practice opportunities: High P(guess)!
Doroudi and Brunskill, Educational Data Mining 2017, Best Paper Nominee
Researcher
Incorrect inference
(e.g., throw out questions)
Mastery Learning
Average P(Correct) at Mastery: 0.54
[Plot axis: P(Correct); marker: Declare Mastery]
Declare mastery early
[Diagram: Fractions Tutor (around 1000 students), New Model, New Instructional Policy vs. Baseline Policy]
Doroudi, Aleven, and Brunskill, Learning @ Scale 2017
Posttest Scores (out of 16 points)
| | Baseline Policy | Adaptive Policy |
|---|---|---|
| Simulated Results | 5.9 ± 0.9 | 9.1 ± 0.8 |
| Experimental Results | 5.5 ± 2.6 | 4.9 ± 1.8 |
Doroudi, Aleven, and Brunskill, Learning @ Scale 2017
[Diagram: Fitted Model]
Chi et al., 2011
Rowe et al., 2014
[Diagram: Instructional Policy with Fitted Model; Instructional Policy with “True” Model]
Doroudi, Aleven, and Brunskill, Learning @ Scale 2017
Posttest Scores (out of 16 points)
| | Baseline Policy | Adaptive Policy |
|---|---|---|
| New Model | 5.9 ± 0.9 | 9.1 ± 0.8 |
| Bayesian Knowledge Tracing | 6.5 ± 0.8 | 7.0 ± 1.0 |
| Deep Knowledge Tracing | 9.9 ± 1.5 | 8.6 ± 2.1 |
Doroudi, Aleven, and Brunskill, Learning @ Scale 2017
“The [BKT] model overestimates the true learning and performance parameters for below-average students who make many errors. While these students receive more remedial exercises than the above average students, they nevertheless receive less remedial practice than they need and perform worse on the test than expected.”
Corbett and Anderson, 1994
Corbett and Anderson, 1994:
Solution: Individualize BKT parameters for different students.
Even after individualizing BKT parameters, they found that low-performing students do worse on the test.
Doroudi and Brunskill, 2019:
This inequity could be due to fitting the wrong model.
Doroudi and Brunskill, Learning Analytics & Knowledge 2019
Fast Learners: 200 students, 20 practice opportunities
Slow Learners: 200 students, 20 practice opportunities
Mastery Learning
Fast Learners: Average P(Correct) at Mastery: 0.56
Slow Learners: Average P(Correct) at Mastery: 0.45
Doroudi and Brunskill, Learning Analytics & Knowledge 2019
Consider how (1) algorithms, (2) machine learning, (3) technology design, and (4) socio-cultural forces combine to affect equity in learning technologies.
Doroudi, Aleven, and Brunskill, Learning @ Scale 2017
| Student Models | Policy 1 | Policy 2 | Policy 3 |
|---|---|---|---|
| Student Model 1 | \(V_{SM_1,P_1}\) | \(V_{SM_1,P_2}\) | \(V_{SM_1,P_3}\) |
| Student Model 2 | \(V_{SM_2,P_1}\) | \(V_{SM_2,P_2}\) | \(V_{SM_2,P_3}\) |
| Student Model 3 | \(V_{SM_3,P_1}\) | \(V_{SM_3,P_2}\) | \(V_{SM_3,P_3}\) |
Posttest Scores (out of 16 points)
| Student Models | Baseline Policy | Adaptive Policy | Awesome Policy |
|---|---|---|---|
| New Model | 5.9 ± 0.9 | 9.1 ± 0.8 | 16 |
| Bayesian Knowledge Tracing | 6.5 ± 0.8 | 7.0 ± 1.0 | 16 |
| Deep Knowledge Tracing | 9.9 ± 1.5 | 8.6 ± 2.1 | 16 |
Doroudi, Aleven, and Brunskill, Learning @ Scale 2017
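The way such a matrix is read can be sketched in a few lines: a policy is only trusted if it beats the baseline under every student model used to simulate it. The sketch below uses the mean posttest scores from the table above. It ignores the ± intervals, which is a simplification, and the function names are hypothetical.

```python
# Mean simulated posttest scores (out of 16), keyed by student model, then policy.
posttest = {
    "New Model": {"Baseline Policy": 5.9, "Adaptive Policy": 9.1},
    "Bayesian Knowledge Tracing": {"Baseline Policy": 6.5, "Adaptive Policy": 7.0},
    "Deep Knowledge Tracing": {"Baseline Policy": 9.9, "Adaptive Policy": 8.6},
}


def models_where_better(matrix, candidate, baseline):
    """Student models under which the candidate policy has the higher mean score."""
    return [model for model, row in matrix.items()
            if row[candidate] > row[baseline]]


def is_robustly_better(matrix, candidate, baseline):
    """True only if the candidate wins under every student model in the matrix."""
    return len(models_where_better(matrix, candidate, baseline)) == len(matrix)


winning_models = models_where_better(posttest, "Adaptive Policy", "Baseline Policy")
robust = is_robustly_better(posttest, "Adaptive Policy", "Baseline Policy")
```

Here the adaptive policy wins under two of the three simulators but loses under Deep Knowledge Tracing, so the matrix flags it as not robust before any students are involved.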
Doroudi, Aleven, and Brunskill, Learning @ Scale 2017
| Student Models | Policy 1 | Policy 2 | Policy 3 |
|---|---|---|---|
| Demographic 1 | \(V_{SM_1,P_1}\) | \(V_{SM_1,P_2}\) | \(V_{SM_1,P_3}\) |
| Demographic 2 | \(V_{SM_2,P_1}\) | \(V_{SM_2,P_2}\) | \(V_{SM_2,P_3}\) |
| Demographic 3 | \(V_{SM_3,P_1}\) | \(V_{SM_3,P_2}\) | \(V_{SM_3,P_3}\) |
Can tell us which policies are equitable
| Student Models | Mastery Learning BKT | Mastery Learning AFM |
|---|---|---|
| AFM - Fast Learners | 56% | 96% |
| AFM - Slow Learners | 45% | 95% |
| BKT - Fast Learners | 98%* | 99.8%* |
| BKT - Slow Learners | 97.3%* | 99.5%* |
*Percent of students who are in learned state.
Doroudi and Brunskill, Learning Analytics & Knowledge 2019
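One way to read this table for equity is to compute the gap between fast and slow learners for each combination of simulated student model and mastery policy. The sketch below copies the percentages from the table. The dictionary layout and function name are hypothetical, and the AFM rows and the starred BKT rows measure different quantities, so gaps should only be compared within a row pair.

```python
# Percentages from the table, keyed by simulated student model, then mastery policy.
outcomes = {
    "AFM": {
        "Mastery Learning BKT": {"fast": 56.0, "slow": 45.0},
        "Mastery Learning AFM": {"fast": 96.0, "slow": 95.0},
    },
    "BKT": {
        "Mastery Learning BKT": {"fast": 98.0, "slow": 97.3},
        "Mastery Learning AFM": {"fast": 99.8, "slow": 99.5},
    },
}


def equity_gaps(table):
    """Fast-minus-slow gap in percentage points for every (model, policy) pair."""
    return {
        (model, policy): round(groups["fast"] - groups["slow"], 1)
        for model, policies in table.items()
        for policy, groups in policies.items()
    }


gaps = equity_gaps(outcomes)
```

The largest gap appears when students follow the AFM but mastery is declared by BKT, which is the model-mismatch case the slides highlight.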
Student
Model
Student
Model
Doroudi, Aleven, and Brunskill, In Submission
Cognitive
(Information Processing)
Distributed Cognition
Constructivism
Socio-Cultural
Situated Cognition
“It can be argued that there is a trade-off between accounting for the subjective experience of doing mathematics and the precision inherent in expressing models in the syntax of computer formalisms.”
Paul Cobb, 1987
“It is desirable to formulate situative models that are specific enough to implement them as simulation programs”
James Greeno, 1998
Socio-Cultural
Model
Cognitive
Model
| | Policy 1 | Policy 2 | Policy 3 |
|---|---|---|---|
| Cognitive Model | \(V_{SM_1,P_1}\) | \(V_{SM_1,P_2}\) | \(V_{SM_1,P_3}\) |
| Constructivist Model | \(V_{SM_2,P_1}\) | \(V_{SM_2,P_2}\) | \(V_{SM_2,P_3}\) |
| Socio-Cultural Model | \(V_{SM_3,P_1}\) | \(V_{SM_3,P_2}\) | \(V_{SM_3,P_3}\) |
Properties of Models of Learning
Sequencing Instruction
Learner-Generated Content
Doroudi, Aleven, & Brunskill - L@S '17
Doroudi & Brunskill - LAK '19
Doroudi, Aleven, & Brunskill - In Submission
Doroudi et al. - EDM '15
Doroudi et al. - EDM '16
Doroudi, Thomas, and Brunskill - UAI '17
*Best Paper*
Doroudi & Brunskill - EDM '17
*Best Paper Nominee*
Doroudi et al. - CHI '16
Doroudi et al. - ICLS '18
This Talk
Assess the robustness of various student models and instructional policies
Study the equitability of learning technologies, including how algorithms interact with socio-cultural factors
Work with online education providers to study how the consequences in this talk affect actual students
Build student models for settings that we care about by bridging the theory-model gap
The research reported here was supported, in whole or in part, by the Institute of Education Sciences, U.S. Department of Education, through Grants R305A130215 and R305B150008 to Carnegie Mellon University. The opinions expressed are those of the authors and do not represent views of the Institute or the U.S. Dept. of Education.
Some of the work reported here appeared in papers written with co-authors Emma Brunskill and Vincent Aleven. I thank Emma Brunskill, Ken Holstein, and Petr Johanes for discussions that influenced this work.
Box, G. E. (1979). Robustness in the strategy of scientific model building. In Robustness in statistics (pp. 201-236).
Cen, H. (2009). Generalized learning factors analysis: improving cognitive models with machine learning (Doctoral dissertation). Carnegie Mellon University, Pittsburgh, PA.
Chi, M., VanLehn, K., Litman, D., & Jordan, P. (2011). Empirically evaluating the application of reinforcement learning to the induction of effective and adaptive pedagogical strategies. User Modeling and User-Adapted Interaction, 21(1-2), 137-180.
Cobb, P. (1990). A constructivist perspective on information-processing theories of mathematical activity. International Journal of Educational Research, 14(1), 67-92.
Corbett, A. T., & Anderson, J. R. (1994). Knowledge tracing: Modeling the acquisition of procedural knowledge. User Modeling and User-Adapted Interaction, 4(4), 253-278.
Doroudi, S., & Brunskill, E. (2017, June). The misidentified identifiability problem of Bayesian Knowledge Tracing. In Proceedings of the 10th International Conference on Educational Data Mining. International Educational Data Mining Society.
Doroudi, S., & Brunskill, E. (2019, March). Fairer but not fair enough: On the equitability of knowledge tracing. To appear in Proceedings of the 9th International Learning Analytics & Knowledge Conference. ACM.
Doroudi, S., Aleven, V., & Brunskill, E. (2017, April). Robust evaluation matrix: Towards a more principled offline exploration of instructional policies. In Proceedings of the Fourth (2017) ACM Conference on Learning @ Scale (pp. 3-12). ACM.
Doroudi, S., Aleven, V., & Brunskill, E. (2018). Where's the reward? A review of reinforcement learning for instructional sequencing. Manuscript in submission.
Duncan, O. D. (1975). Introduction to structural equation models. Elsevier.
Greeno, J. G. (1998). The situativity of knowing, learning, and research. American Psychologist, 53(1), 5.
Rowe, J. P., Mott, B. W., & Lester, J. C. (2014). Optimizing Player Experience in Interactive Narrative Planning: A Modular Reinforcement Learning Approach. AIIDE, 3, 2.
New Instructional Policy
Student
Model
vs.
Baseline Policy
Doroudi, Aleven, and Brunskill, In Submission
Better understand researchers' beliefs about learning and computational modeling via interviews (ongoing work).
Use agent-based modeling and social simulation to model socio-cultural and situative theories.
Assess robustness of models under different conceptions of learning.
At least 95% of students learn the skill
Mastery Learning
Doroudi, Aleven, and Brunskill, In Submission
aprender
to learn
Paired-Association Tasks
Concept Learning Tasks
Sequencing Activity Types
Sequencing Interdependent Content
Doroudi, Aleven, and Brunskill, In Submission
reading
Paired-Association Tasks
Concept Learning Tasks
Sequencing Activity Types
Sequencing Interdependent Content
Doroudi, Aleven, and Brunskill, In Submission
Worked Example
Solve for \(x\): \(x^2 - 4 = 12\)
\(x^2 = 4 + 12\)
\(x^2 = 16\)
\(x = \pm\sqrt{16} = \pm 4\)
Problem Solving
Solve for \(x\): \(x^2 - 4 = 12\)
Paired-Association Tasks
Concept Learning Tasks
Sequencing Activity Types
Sequencing Interdependent Content
Doroudi, Aleven, and Brunskill, In Submission
| | Data-Driven Policy Outperformed Baseline | Mixed Results | Data-Driven Policy Did Not Outperform Baseline |
|---|---|---|---|
| Paired-Association Tasks | 10 | 0 | 3 |
| Concept Learning Tasks | 2 | 3 | 0 |
| Sequencing Activity Types | 4 | 4 | 0 |
| Sequencing Interdependent Content | 0 | 0 | 6 |
Doroudi, Aleven, and Brunskill, In Submission
Paired-Association Tasks
Concept Learning Tasks
Sequencing Activity Types
Sequencing Interdependent Content
Use Psychologically-Inspired Models
Spacing Effect
Worked-Example Effect
Use Data-Driven Models
We attempt to treat the same problem with several alternative models each with different simplifications but with a common...assumption. Then, if these models, despite their different assumptions, lead to similar results, we have what we can call a robust theorem that is relatively free of the details of the model.
Hence, our truth is the intersection of independent lies.
- Richard Levins, 1966
Importance sampling: an estimator that gives unbiased and consistent estimates for a policy!
Can have very high variance when the policy is different from the prior data.
Example: Worked example or problem-solving?
20 sequential decisions ⇒ need over \(2^{20}\) students!
Importance sampling can prefer the worse of two policies more often than not (Doroudi, Thomas, and Brunskill, 2017).
Doroudi, Thomas, and Brunskill, Uncertainty in Artificial Intelligence 2017, Best Paper
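The variance problem can be seen directly in the importance weights. The sketch below implements a basic (ordinary) importance sampling estimator and applies it to a hypothetical setting: prior data collected by choosing uniformly at random between a worked example and problem solving at each of 20 steps, and a deterministic target policy that always picks problem solving. All names and the two policies are illustrative assumptions.

```python
def importance_sampling_estimate(trajectories, target_prob, behavior_prob):
    """Average of outcome times the product of per-step probability ratios.

    trajectories: list of (actions, outcome) pairs collected under the behavior policy.
    target_prob / behavior_prob: functions (step, action) -> probability of that action.
    """
    total = 0.0
    for actions, outcome in trajectories:
        weight = 1.0
        for step, action in enumerate(actions):
            weight *= target_prob(step, action) / behavior_prob(step, action)
        total += weight * outcome
    return total / len(trajectories)


def uniform_behavior(step, action):
    # Assumed prior data: each of the two activity types chosen with probability 0.5.
    return 0.5


def always_problem_solving(step, action):
    # Hypothetical deterministic policy to be evaluated.
    return 1.0 if action == "problem_solving" else 0.0


matching = (["problem_solving"] * 20, 1.0)
one_mismatch = (["worked_example"] + ["problem_solving"] * 19, 1.0)

weight_if_matching = importance_sampling_estimate(
    [matching], always_problem_solving, uniform_behavior)
weight_if_mismatch = importance_sampling_estimate(
    [one_mismatch], always_problem_solving, uniform_behavior)
```

Only a trajectory that agrees with the target policy on all 20 decisions receives a nonzero weight, and that weight is \(2^{20}\). Under uniform random prior data such a trajectory occurs with probability \(2^{-20}\), which is why on the order of \(2^{20}\) students are needed before the estimate is informative.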