practical learning in A.I.M.
STO: send at the best time on contact level
12 possibilities
best: most opens
IR: send the best incentive on contact level
5-10 possibilities
best: most purchases
STO: each contact has an open probability at every slot
stability over time
IR: each incentive has an effect on buying prob
stability over time
standard A/B test
send at every slot: 0:00, 02:00, ..., 20:00, 22:00
choose best: 02:00
standard A/B test: exploration
choose best: exploitation
bandit algo: combines exploration & exploitation
simple bandit algo
5%: randomly
95%: to the currently best option
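The 5% / 95% rule above is commonly called epsilon-greedy. Below is a minimal sketch of it for a single contact, assuming per-slot open and send counts are available. The function name `choose_slot`, the slot labels, and the counts are hypothetical and for illustration only; this is not the production implementation.

```python
import random

def choose_slot(opens, sends, epsilon=0.05, rng=random):
    """Pick a send slot: explore with probability epsilon, otherwise exploit."""
    slots = list(sends)
    if rng.random() < epsilon:
        return rng.choice(slots)  # exploration: any slot, uniformly at random
    # exploitation: the slot with the highest observed open rate so far
    return max(slots, key=lambda s: opens[s] / sends[s] if sends[s] else 0.0)

# Hypothetical counts for one contact (illustrative, not real data).
opens = {"0:00": 2, "02:00": 6, "20:00": 3, "22:00": 4}
sends = {"0:00": 10, "02:00": 10, "20:00": 10, "22:00": 10}

rng = random.Random(0)
picks = [choose_slot(opens, sends, rng=rng) for _ in range(10_000)]
share_best = picks.count("02:00") / len(picks)
print(f"share of sends at the currently best slot: {share_best:.3f}")
```

Note that the random 5% can also land on the currently best slot, so its overall share ends up slightly above 95%.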
STO: send at the best time on contact level
best: most opens in the long run
IR: send the best incentive on contact level
best: most purchases in the long run
Bayesian bandit algo
good in the long run
the probability of choosing an option is the probability that it is the best
Beta distribution
how probable is each true open probability?
observed: 5/10
what do we think based on your history?
28/30 >> 25/30
25/30 >> 1/1
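The two comparisons above can be checked with a small simulation. This sketch models each history as a Beta distribution and estimates how often the first option has the higher true open probability. The uniform Beta(1, 1) prior and the helper name `prob_first_is_better` are assumptions made for illustration.

```python
import random

def prob_first_is_better(opens_a, sends_a, opens_b, sends_b,
                         draws=100_000, seed=0):
    """Monte Carlo estimate of P(true open rate of A > true open rate of B)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Beta(1 + opens, 1 + non-opens): posterior under a uniform prior (assumption)
        a = rng.betavariate(1 + opens_a, 1 + sends_a - opens_a)
        b = rng.betavariate(1 + opens_b, 1 + sends_b - opens_b)
        wins += a > b
    return wins / draws

p_28_vs_25 = prob_first_is_better(28, 30, 25, 30)
p_25_vs_1 = prob_first_is_better(25, 30, 1, 1)
print(p_28_vs_25, p_25_vs_1)
```

Under this prior, 25/30 beats 1/1 in roughly two thirds of the draws even though 1/1 has the higher raw rate, because a single send carries very little evidence.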
1 learning step
5 opens / 10 sends, then one more send:
not opened: 5 opens / 11 sends
opened: 6 opens / 11 sends
STO: 1 learning step
3 opens / 10 sends, then one more send:
not opened: 3 opens / 11 sends
opened: 4 opens / 11 sends
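One STO learning step as described above only touches two counters. A minimal sketch, assuming the counts are kept per contact and slot; the class name `SlotStats` and the uniform Beta(1, 1) prior in `beta_params` are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SlotStats:
    opens: int
    sends: int

    def learn(self, opened: bool) -> "SlotStats":
        """One learning step: one more send, plus one more open if it was opened."""
        return SlotStats(self.opens + int(opened), self.sends + 1)

    def beta_params(self):
        """Beta(alpha, beta) parameters under a uniform Beta(1, 1) prior (assumption)."""
        return 1 + self.opens, 1 + self.sends - self.opens

before = SlotStats(opens=5, sends=10)
not_opened = before.learn(opened=False)  # 5 opens / 11 sends
opened = before.learn(opened=True)       # 6 opens / 11 sends
print(not_opened, opened)
```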
IR 1 learning step
5% buying prob increase
4% buying prob increase
6% buying prob increase
?
several learning steps
5 opens / 10 sends
25 opens / 50 sends
+ prior knowledge
open probability: estimate 50%
Emarsys open probability: estimate 17%
Lesara 02:00 open probability: estimate 30%
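One common way to combine prior knowledge with observed counts is to treat the prior as a number of pseudo-sends. The sketch below uses the 30% estimate above as the prior mean. The prior strength of 10 pseudo-sends and the function name `posterior_mean` are assumptions for illustration; the slides do not state how strongly the prior is weighted.

```python
def posterior_mean(opens, sends, prior_mean, prior_strength):
    """Mean of the Beta posterior when the prior is worth `prior_strength` pseudo-sends."""
    alpha = prior_mean * prior_strength + opens
    beta = (1 - prior_mean) * prior_strength + (sends - opens)
    return alpha / (alpha + beta)

# Same observed rate (50%), different amounts of evidence.
few = posterior_mean(5, 10, prior_mean=0.30, prior_strength=10)
many = posterior_mean(25, 50, prior_mean=0.30, prior_strength=10)
print(f"5/10  -> {few:.3f}")
print(f"25/50 -> {many:.3f}")
```

With more sends the estimate moves further from the prior and closer to the observed 50%.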
how to recommend?
the probability of choosing an option is the probability that it is the best
sample from each distribution, then choose the option with the highest sample value
fast, simple
sample from Beta?
not trivial --> needs a UDF
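The sampling rule above is usually called Thompson sampling. Where no Beta sampler is available, a draw can be built from two Gamma draws: if X ~ Gamma(a, 1) and Y ~ Gamma(b, 1), then X / (X + Y) ~ Beta(a, b). The following is a sketch of what such a helper might compute, written in Python rather than as a database UDF. The names `sample_beta` and `recommend`, the counts, and the uniform prior are illustrative assumptions.

```python
import random

def sample_beta(alpha, beta, rng):
    """Draw from Beta(alpha, beta) as X / (X + Y) with Gamma-distributed X and Y."""
    x = rng.gammavariate(alpha, 1.0)
    y = rng.gammavariate(beta, 1.0)
    return x / (x + y)

def recommend(stats, rng):
    """One draw per option; return the option with the highest draw."""
    draws = {
        option: sample_beta(1 + opens, 1 + sends - opens, rng)  # uniform prior (assumption)
        for option, (opens, sends) in stats.items()
    }
    return max(draws, key=draws.get)

# Hypothetical (opens, sends) per slot for one contact.
stats = {"0:00": (2, 10), "02:00": (6, 10), "20:00": (3, 10), "22:00": (4, 10)}
rng = random.Random(0)
picks = [recommend(stats, rng) for _ in range(10_000)]
shares = {slot: picks.count(slot) / len(picks) for slot in stats}
print(shares)
```

Each slot's share of recommendations approximates the probability that it is the best one, so weaker slots still get explored occasionally without a fixed random percentage.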
limitations
1. not applicable at all
2. applicable, but time would be an important factor
truly personal ?!
STO: learning on contact level
IR: learning on account level
truly personal: buying prob
STO and IR learning
By Czeller Ildi
How does the algorithm behind STO and IR learn from new information, and how does it use the information it already has?