Common Subexpression Elimination
A primer
Jo Devriendt
Overview
• Motivation
• #403 in CPMpy
• ManyWorlds approach
• Essence approach [GentMiguelRendl]
[GentMiguelRendl] Common Subexpression Elimination in Automated Constraint Modelling, 2008
Motivation
a+x*y=0
b+x*y=t
[Figure: expression trees of the two constraints; each tree contains its own copy of the x*y subtree]
a+z1=0
z1=x*y
b+z2=t
z2=x*y
[Figure: the four flattened constraints as trees; z1 and z2 are separate auxiliary variables, each with its own x*y subtree]
a+z1=0
b+z1=t
z1=x*y
[Figure: the flattened constraints as trees; both sums refer to the same auxiliary variable z1, which is defined once as x*y]
Extra propagation: assume t=6 and b=0.
Standard propagation routines can propagate z1=6 from b+z1=t, and then a=-6 from a+z1=0.
Not possible with the formulations on the two previous slides!
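The propagation argument can be checked with a toy fixpoint propagator. The sketch below is plain Python, not a real solver: the `propagate` helper, the small variable domains and the brute-force support check are all assumptions made for illustration.

```python
from itertools import product

def propagate(domains, constraints):
    """Remove unsupported values, one constraint at a time, until nothing changes."""
    domains = {v: set(d) for v, d in domains.items()}
    changed = True
    while changed:
        changed = False
        for scope, holds in constraints:
            # Collect every value that appears in at least one satisfying tuple.
            supported = {v: set() for v in scope}
            for values in product(*(domains[v] for v in scope)):
                if holds(*values):
                    for v, val in zip(scope, values):
                        supported[v].add(val)
            for v in scope:
                if supported[v] != domains[v]:
                    domains[v] = supported[v]
                    changed = True
    return domains

def sum_is_zero(a, z):
    return a + z == 0

def sum_is_t(b, z, t):
    return b + z == t

def is_product(z, x, y):
    return z == x * y

# Hypothetical small domains; t=6 and b=0 as assumed on the slide.
base = {"a": range(-10, 11), "b": {0}, "t": {6},
        "x": range(0, 7), "y": range(0, 7)}

# With CSE: a+z1=0, b+z1=t, z1=x*y
shared = propagate(
    {**base, "z1": range(0, 37)},
    [(("a", "z1"), sum_is_zero),
     (("b", "z1", "t"), sum_is_t),
     (("z1", "x", "y"), is_product)],
)

# Without CSE: a+z1=0, z1=x*y, b+z2=t, z2=x*y
split = propagate(
    {**base, "z1": range(0, 37), "z2": range(0, 37)},
    [(("a", "z1"), sum_is_zero),
     (("z1", "x", "y"), is_product),
     (("b", "z2", "t"), sum_is_t),
     (("z2", "x", "y"), is_product)],
)

print(sorted(shared["a"]))  # a is fixed when z1 is shared
print(sorted(split["a"]))   # several values remain when z1 and z2 are separate
```

In the shared formulation `a` is reduced to a single value, while in the split formulation the link between `z1` and `z2` is only visible through the domains of `x` and `y`, so `a` keeps several candidates.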
#403 in CPMpy
• When creating a new reification var for expression E, check whether there already exists a var for E.
  • Use a dictionary from reification vars to E's string representation.
  • Always use the original reification var
• Works on previous example :)
• Is the representation unique for non-equivalent expressions?
  • E.g., we have a var called "1", is x >= 1 the version with the var or the val?
• Is the representation canonical?
  • E.g., do x*y and y*x use the same reification var?
• Global dictionary?
• Incorporate non-reification auxiliaries?
  • x xor y xor z linearizes to x+y+z = 2*aux+1, but aux is not x xor y xor z
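A minimal sketch of the string-keyed lookup, to make the uniqueness and canonicity questions concrete. This is not CPMpy's code: the class `StringKeyedCSE`, the `reify` method and the `aux` naming are hypothetical, expressions are represented by their strings only, and the dictionary is keyed by the string (mapping to the variable) because that is the direction needed for the lookup.

```python
class StringKeyedCSE:
    """Reuse one reification variable per expression string (hypothetical helper)."""

    def __init__(self):
        # string representation of E -> name of the reification variable for E
        self._var_of = {}

    def reify(self, expr: str) -> str:
        # Only introduce a fresh variable if this exact string was never seen.
        if expr not in self._var_of:
            self._var_of[expr] = f"aux{len(self._var_of)}"
        return self._var_of[expr]

cse = StringKeyedCSE()
first = cse.reify("x*y")
second = cse.reify("x*y")   # same string, so the original variable is reused
swapped = cse.reify("y*x")  # different string, so a second variable appears

print(first, second, swapped)
```

The repeated `x*y` is shared, but `y*x` is not, which is the canonicity concern. The string `"x >= 1"` would likewise be the same key whether `1` is a value or a variable named "1", which is the uniqueness concern.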
ManyWorlds
• Define equality =, hash, comparison < on all expressions
  • originally based on string representation, but that became quadratic in memory (every expression keeps a string representing its full subtree)
  • now based on subexpressions -> fixes the var "1" problem (uniqueness)
• Expression factory has a global hash set of existing expressions. When creating a new expression, return an equal one if it exists.
  • expression tree is now a DAG
a+x*y=0
b+x*y=t
[Figure: the two constraints drawn first as separate trees with two x*y nodes, then as a DAG in which both sums share a single x*y node]
• After creation, each expression has a unique ID (counter or pointer)
  • exploited during simplifications: an idempotent connective applied to (p & q) and (p & q) becomes p & q by a simple ID check
  • comparison < based on ID (no recursive compares needed)
• All commutative expressions have subterms ordered by ID
  • fixes the x+y vs y+x problem (canonicity)
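The factory idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the ManyWorlds implementation: the names `Expr`, `Factory`, `make`, `var` and `val`, the operator strings, and the choice of which operators count as commutative or idempotent are all hypothetical.

```python
import itertools

COMMUTATIVE = {"+", "*", "&", "|"}  # assumed set of commutative operators
IDEMPOTENT = {"&", "|"}             # assumed set of idempotent operators

class Expr:
    __slots__ = ("op", "args", "uid")

    def __init__(self, op, args, uid):
        self.op, self.args, self.uid = op, args, uid

class Factory:
    def __init__(self):
        self._table = {}               # structural key -> the unique Expr
        self._ids = itertools.count()  # source of unique IDs

    def _intern(self, key, op, args):
        # Return the existing expression for this key, or create it once.
        if key not in self._table:
            self._table[key] = Expr(op, args, next(self._ids))
        return self._table[key]

    def var(self, name):
        return self._intern(("var", name), "var", (name,))

    def val(self, value):
        return self._intern(("val", value), "val", (value,))

    def make(self, op, *args):
        if op in COMMUTATIVE:
            # Canonical argument order: sort subterms by ID.
            args = sorted(args, key=lambda e: e.uid)
        if op in IDEMPOTENT:
            # After sorting, duplicates are adjacent; drop them by ID check.
            args = [a for i, a in enumerate(args)
                    if i == 0 or a.uid != args[i - 1].uid]
            if len(args) == 1:
                return args[0]
        args = tuple(args)
        # Keys use child IDs only, so no recursive comparison is needed.
        return self._intern((op, tuple(a.uid for a in args)), op, args)

f = Factory()
x, y, p, q = f.var("x"), f.var("y"), f.var("p"), f.var("q")
pq = f.make("&", p, q)
print(f.make("*", x, y) is f.make("*", y, x))       # canonicity
print(f.make("|", pq, f.make("&", q, p)) is pq)     # simplification by ID check
print(f.var("1") is f.val(1))                       # uniqueness
```

Because children are interned before their parents, the key of a new node only needs the IDs of its children, which is what keeps hashing and comparison cheap.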

ManyWorlds
Summarized: DAG of canonical expressions
• ManyWorlds unnests as late as possible
• When unnesting, check whether a reification var has already been introduced for this canonical expression
+ simple invariants
+ memory efficient
+ strong simplification
- memory inefficient (keep all expressions in memory at all times); expressions need to be immutable
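A small sketch of unnesting with a lookup for already introduced auxiliary variables. Nested tuples stand in for canonical expression nodes, and the dictionary relies on Python's structural tuple equality rather than on expression IDs; the function name `flatten` and the `z1`, `z2`, ... naming are assumptions for illustration.

```python
def flatten(constraints):
    """Unnest subexpressions of top-level equalities, reusing auxiliaries."""
    aux_of = {}       # flattened subexpression -> name of its auxiliary variable
    definitions = []  # constraints of the form aux = subexpression

    def unnest(expr):
        # Atoms (variable names, constants) need no auxiliary variable.
        if not isinstance(expr, tuple):
            return expr
        flat = unnest_args(expr)
        # Introduce an auxiliary only if this subexpression is new.
        if flat not in aux_of:
            aux_of[flat] = f"z{len(aux_of) + 1}"
            definitions.append(("=", aux_of[flat], flat))
        return aux_of[flat]

    def unnest_args(expr):
        # Keep the outermost operator, replace nested arguments by auxiliaries.
        if not isinstance(expr, tuple):
            return expr
        op, *args = expr
        return (op, *(unnest(a) for a in args))

    flat_constraints = [("=", unnest_args(lhs), unnest_args(rhs))
                        for _, lhs, rhs in constraints]
    return flat_constraints + definitions

model = [
    ("=", ("+", "a", ("*", "x", "y")), 0),    # a+x*y=0
    ("=", ("+", "b", ("*", "x", "y")), "t"),  # b+x*y=t
]
for constraint in flatten(model):
    print(constraint)
```

Both sums end up referring to the same `z1`, and only one definition `z1 = x*y` is emitted.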
Essence
Core idea: keep track of sets of equivalent subexpressions, each with a canonical representative
Syntactic equivalence
• easy to check
• very limited
Semantic equivalence
• very hard to check
• very powerful
Easy semantic equivalences
• simplification (evaluation): 1+x*y-1 is equivalent to x*y
• ordering: x*y is equivalent to y*x
• model equalities: a+b = x*y means a+b and x*y are equivalent
• reifications (from flattening): c+x*y = 0 flattens to c+aux = 0 and aux = x*y
Equivalent set:
{x*y, y*x, 1+x*y-1, a+b, aux}
Essence
Total order on expressions
• order based on type
  • values < variables < equalities < ... < sums < ... < global constraints < ...
• within the same type, order recursively based on arguments
  • "x=3+z" < "x=z+3" < "3+z=x" < "z+3=x"
Smaller expressions are simpler!
Core idea: keep track of sets of equivalent subexpressions, each with a canonical representative
Replace all syntactic occurrences of members of an equivalent set with the smallest element in the set -> the canonical representative
Equivalent set:
{x*y, y*x, 1+x*y-1, a+b, aux}
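The total order can be sketched as a sort key over nested tuples. The ranks for values, variables, equalities and sums follow the order listed above; placing products after sums is an assumption, since the slide elides the types in between, and the tuple encoding and the name `order_key` are hypothetical.

```python
# Rank per expression type; the position of "*" is an assumption.
TYPE_RANK = {"value": 0, "variable": 1, "=": 2, "+": 3, "*": 4}

def order_key(expr):
    """Sort key: first by type, then recursively by arguments."""
    if isinstance(expr, tuple):
        op, *args = expr
        return (TYPE_RANK[op], tuple(order_key(a) for a in args))
    if isinstance(expr, str):  # strings are variable names
        return (TYPE_RANK["variable"], expr)
    return (TYPE_RANK["value"], expr)  # anything else is a constant value

equalities = [
    ("=", ("+", "z", 3), "x"),  # z+3=x
    ("=", "x", ("+", "z", 3)),  # x=z+3
    ("=", ("+", 3, "z"), "x"),  # 3+z=x
    ("=", "x", ("+", 3, "z")),  # x=3+z
]
ordered = sorted(equalities, key=order_key)

equivalent_set = [
    ("*", "x", "y"),
    ("*", "y", "x"),
    ("+", 1, ("*", "x", "y"), -1),
    ("+", "a", "b"),
    "aux",
]
representative = min(equivalent_set, key=order_key)
print(ordered)
print(representative)
```

Under this key the variable `aux` is the smallest element of the example set, so it would serve as the canonical representative.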
Essence
Core idea: keep track of sets of equivalent subexpressions, each with a canonical representative
Replace all syntactic occurrences of members of an equivalent set with the canonical representative.
+ strong reduction of the number of variables/constraints
+ can avoid unnesting!
  • e.g., x*y = a and x*y+z = 0 reduces to a+z=0
  • e.g., x*y = a+b and x*y+z = 0 reduces to a+b+z=0
- more complex
  • e.g., what if a new canonical representative is found?
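A rough sketch of equivalence sets with a canonical representative, and of what happens when a better representative shows up later. It is not the method from [GentMiguelRendl]: the class `EquivalenceClasses`, its methods, and the stand-in order (fewest nodes first, ties broken by `repr`) are assumptions chosen to keep the example short.

```python
def size(expr):
    """Number of nodes in a nested-tuple expression."""
    if isinstance(expr, tuple):
        return 1 + sum(size(a) for a in expr[1:])
    return 1

def order_key(expr):
    # Stand-in for the total order: smaller expressions first.
    return (size(expr), repr(expr))

class EquivalenceClasses:
    def __init__(self):
        self._class_of = {}  # expression -> shared set of equivalent expressions

    def merge(self, e1, e2):
        """Record that e1 and e2 are equivalent."""
        c1 = self._class_of.setdefault(e1, {e1})
        c2 = self._class_of.setdefault(e2, {e2})
        if c1 is not c2:
            c1 |= c2
            for e in c2:
                self._class_of[e] = c1

    def canon(self, expr):
        """Smallest member of the class of expr (expr itself if unknown)."""
        return min(self._class_of.get(expr, {expr}), key=order_key)

    def rewrite(self, expr):
        """Replace subexpressions bottom-up by their representatives."""
        if isinstance(expr, tuple):
            op, *args = expr
            expr = (op, *(self.rewrite(a) for a in args))
        return self.canon(expr)

ec = EquivalenceClasses()
xy = ("*", "x", "y")
ec.merge(xy, ("+", "c", "d"))          # model equality x*y = c+d
before = ec.canon(("+", "c", "d"))     # representative so far
ec.merge(xy, "a")                      # model equality x*y = a
after = ec.canon(("+", "c", "d"))      # representative has changed
reduced = ec.rewrite(("=", ("+", xy, "z"), 0))
print(before, after, reduced)
```

The change from `before` to `after` is the complication noted above: anything rewritten with the old representative has to be revisited once a smaller one is found.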
Can an expert modeler avoid duplicate subexpressions?
My interpretation from [GentMiguelRendl]:
Yes. E.g., most duplicates are from quantifier unrolling.
[model not recovered] will introduce duplicate unrolled instantiations of [expression not recovered],
but the following equivalent model will not: [model not recovered]
My opinion: No, unless the expert generates the low-level constraints.
E.g., CPMpy's uniform reification transforms
p <-> alldiff(x,y,z)
to the following, with common subexpressions,
p -> alldiff(x,y,z)
~p -> ~alldiff(x,y,z)
Expecting expert modelers to anticipate and fix this is the same as expecting them to do the transformation to low-level constraints.