Common Subexpression Elimination
A primer
Jo Devriendt
Overview
- Motivation
- #403 in CPMpy
- ManyWorlds approach
- Essence approach [Gent-Miguel-Rendl]
[Gent-Miguel-Rendl] Common Subexpression Elimination in Automated Constraint Modelling, 2008
Motivation
[Figure: expression trees of the two constraints below; each tree contains its own copy of the subtree x*y]
a+x*y=0
b+x*y=t
[Figure: the same two constraints after flattening, with a separate auxiliary variable (z1 and z2) for each copy of x*y]
a+z1=0
z1=x*y
b+z2=t
z2=x*y
[Figure: the same two constraints after flattening with a single shared auxiliary variable z1 for x*y]
a+z1=0
b+z1=t
z1=x*y
Extra propagation: assume t=6 and b=0.
Standard propagation routines can propagate z1=6 (from b+z1=t) and then a=-6 (from a+z1=0).
Not possible in the two previous slides!
#403 in CPMpy
- When creating a new reification var for expression E, check whether a var already exists for E.
  - Use a dictionary that links E's string representation to its reification var.
  - Always use the original reification var.
- Works on the previous example :)
- Is the representation unique for non-equivalent expressions?
  - E.g., if we have a var called "1", is x >= 1 the version with the var or the val?
- Is the representation canonical?
  - E.g., do x*y and y*x use the same reification var?
- Global dictionary?
- Incorporate non-reification auxiliaries?
  - x xor y xor z linearizes to x+y+z = 2*aux+1, but aux is not x xor y xor z
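The dictionary lookup can be sketched in a few lines. This is only an illustration under assumptions: `reif_var`, `_reif_vars` and the `z1`, `z2` naming are hypothetical and are not CPMpy's actual API.

```python
import itertools

# Hypothetical helper names; this is NOT CPMpy's actual API.
_reif_vars = {}               # string representation -> auxiliary variable name
_fresh = itertools.count(1)   # counter for fresh auxiliary names


def reif_var(expr_str):
    """Return the auxiliary variable for expr_str, creating it only once."""
    if expr_str not in _reif_vars:
        _reif_vars[expr_str] = f"z{next(_fresh)}"
    return _reif_vars[expr_str]


# Flatten a+x*y=0 and b+x*y=t: both constraints reuse the variable for x*y.
z_first = reif_var("x*y")
z_second = reif_var("x*y")
flat = [f"a+{z_first}=0", f"b+{z_second}=t", f"{z_first}=x*y"]
print(flat)

# A plain string key is not canonical: y*x gets a different variable.
z_swapped = reif_var("y*x")
print(z_first, z_swapped)
```

The last two lines show why the canonicity question matters: with a purely string-based key, x*y and y*x are treated as unrelated expressions.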
ManyWorlds
- Define equality =, hash, and comparison < on all expressions
  - originally based on the string representation, but this became quadratic in memory (every expression keeps a string representing its full subtree)
  - now based on subexpressions, which fixes the var "1" problem (uniqueness)
- The expression factory has a global hash set of existing expressions. When creating a new expression, it returns an equal existing one if there is one.
  - the expression tree is now a DAG
[Figure: expression trees of the two constraints below, first with a separate x*y subtree per constraint, then as a DAG in which both constraints share a single x*y node]
a+x*y=0
b+x*y=t
ManyWorlds
- After creation, each expression has a unique ID (counter or pointer)
  - exploited during simplifications: (p & q) | (p & q) becomes p & q by a simple ID check
  - comparison < is based on ID (no recursive compares needed)
- All commutative expressions have subterms ordered by ID
  - fixes the x+y vs y+x problem (canonicity)
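A minimal sketch of such an expression factory, assuming a dictionary as the global pool and a counter as the ID. The names `Expr`, `make`, `COMMUTATIVE` and `IDEMPOTENT` are illustrative and are not taken from the ManyWorlds code base.

```python
import itertools


class Expr:
    """Immutable expression node; create instances only through make()."""
    __slots__ = ("op", "args", "uid")

    def __init__(self, op, args, uid):
        self.op, self.args, self.uid = op, args, uid

    def __lt__(self, other):
        return self.uid < other.uid   # ID comparison, no recursion into subtrees


COMMUTATIVE = {"+", "*", "&", "|"}
IDEMPOTENT = {"&", "|"}
_pool = {}                      # (op, args) -> the unique node with that structure
_next_uid = itertools.count()


def make(op, *args):
    """Return the existing node with this structure, or create and store it."""
    if op in COMMUTATIVE:
        args = tuple(sorted(args))           # canonical order: by ID
    if op in IDEMPOTENT:
        args = tuple(dict.fromkeys(args))    # drop repeated subterms (same node)
        if len(args) == 1:
            return args[0]                   # (e | e) is just e
    # Children are unique nodes hashed by identity, so the key stays small:
    # no string of the full subtree is needed.
    key = (op, args)
    if key not in _pool:
        _pool[key] = Expr(op, args, next(_next_uid))
    return _pool[key]


x, y, p, q = (make("var", name) for name in "xypq")

print(make("+", x, y) is make("+", y, x))                             # canonicity
print(make("|", make("&", p, q), make("&", q, p)) is make("&", p, q))  # simplification
print(make("var", "1") is make("val", 1))                             # var "1" vs value 1
```

Because every structure exists only once, identity (`is`) doubles as equality, which is what makes the simplification a simple ID check.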
ManyWorlds
Summarized: DAG of canonical expressions
- ManyWorlds unnests as late as possible
- When unnesting, check whether a reification var has already been introduced for this canonical expression
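The unnesting check can be illustrated as follows. This is a sketch under assumptions, not the ManyWorlds implementation: expressions are nested tuples, names are plain strings, the left-hand side of each constraint is assumed to be a compound expression, and commutative arguments are ordered by `repr` rather than by ID. The functions `canon` and `unnest` are hypothetical.

```python
import itertools

COMMUTATIVE = {"+", "*"}


def canon(expr):
    """Canonical form of a nested-tuple expression: commutative args are sorted."""
    if isinstance(expr, str):        # variable or constant name
        return expr
    op, *args = expr
    args = [canon(a) for a in args]
    if op in COMMUTATIVE:
        args.sort(key=repr)          # any fixed total order works for this sketch
    return (op, *args)


def unnest(constraints):
    """Flatten (lhs, rhs) constraints, introducing one aux var per canonical subexpression."""
    aux = {}                         # canonical subexpression -> auxiliary variable
    fresh = itertools.count(1)
    defs = []                        # definitions such as ("z1", "*", "x", "y")

    def name(expr):
        if isinstance(expr, str):
            return expr
        if expr not in aux:          # only introduce a var the first time
            op, *args = expr
            args = [name(a) for a in args]
            aux[expr] = f"z{next(fresh)}"
            defs.append((aux[expr], op, *args))
        return aux[expr]

    tops = []
    for lhs, rhs in constraints:
        op, *args = canon(lhs)
        tops.append(((op, *[name(a) for a in args]), rhs))
    return tops, defs


# a+x*y=0 and b+y*x=t: the second product is written in the other order.
tops, defs = unnest([
    (("+", "a", ("*", "x", "y")), "0"),
    (("+", "b", ("*", "y", "x")), "t"),
])
print(tops)
print(defs)
```

Both constraints end up referring to the same auxiliary variable, and only one definition for x*y is emitted.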
+ simple invariants
+ memory efficient
+ strong simplification
- memory inefficient (keeps all expressions in memory at all times)
- expressions need to be immutable
Essence
Core idea: keep track of sets of equivalent subexpressions, each with a canonical representative
Syntactical equivalence
- easy to check
- very limited
Semantical equivalence
- very hard to check
- very powerful
Easy semantical equivalences
- simplification (evaluation): 1+x*y-1 is equivalent to x*y
- ordering: x*y is equivalent to y*x
- model equalities: a+b = x*y means a+b and x*y are equivalent
- reifications (from flattening): c+x*y = 0 flattens to c+aux = 0 and aux = x*y
equivalent set:
{x*y, y*x, 1+x*y-1, a+b, aux}
Essence
Total order on expressions
- order based on type
  - values < variables < equalities < ... < sums < ... < global constraints < ...
- within the same type, order recursively based on arguments
  - "x=3+z" < "x=z+3" < "3+z=x" < "z+3=x"
Smaller expressions are simpler!
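The ordering can be sketched as a recursive sort key. The ranks below are an assumption: they keep only the types named on the slide and skip everything hidden behind the "..." entries. The tuple encoding and the name `order_key` are illustrative, not Essence's implementation.

```python
# Assumed ranks: values < variables < equalities < sums (elided types left out).
RANK = {"val": 0, "var": 1, "eq": 2, "sum": 3}


def order_key(expr):
    """Sort key: first the type rank, then the keys of the arguments, recursively."""
    kind, *args = expr
    if kind in ("val", "var"):
        return (RANK[kind], args[0])
    return (RANK[kind], *(order_key(a) for a in args))


x, z, three = ("var", "x"), ("var", "z"), ("val", 3)
exprs = {
    "z+3=x": ("eq", ("sum", z, three), x),
    "x=z+3": ("eq", x, ("sum", z, three)),
    "3+z=x": ("eq", ("sum", three, z), x),
    "x=3+z": ("eq", x, ("sum", three, z)),
}

ranked = sorted(exprs, key=lambda name: order_key(exprs[name]))
print(ranked)  # ['x=3+z', 'x=z+3', '3+z=x', 'z+3=x']
```

The result reproduces the chain from the slide: a variable on the left beats a sum on the left, and within a sum a value first beats a variable first.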
Replace all syntactical occurrences of members of an equivalent set with the smallest element of the set: the canonical representative.
equivalent set:
{x*y, y*x, 1+x*y-1, a+b, aux}
Essence
Replace all syntactical occurrences of an equivalent set with its canonical representative.
+ strong reduction of the number of variables/constraints
+ can avoid unnesting!
  e.g., x*y = a and x*y+z = 0 reduces to a+z=0
  e.g., x*y = a+b and x*y+z = 0 reduces to a+b+z=0
- more complex
  e.g., what if a new canonical representative is found?
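The two reductions above can be reproduced with a small sketch. It is an illustration under assumptions, not the Essence implementation: equivalence classes are plain Python sets of nested tuples, the type ranks (including the position of products) are assumed, and the representative is recomputed on every lookup, which sidesteps rather than solves the question of a representative that changes later.

```python
# Assumed ranks; the position of "prod" is a guess made for this sketch.
RANK = {"val": 0, "var": 1, "eq": 2, "sum": 3, "prod": 4}


def order_key(expr):
    """Sort key: first the type rank, then the keys of the arguments, recursively."""
    kind, *args = expr
    if kind in ("val", "var"):
        return (RANK[kind], args[0])
    return (RANK[kind], *(order_key(a) for a in args))


def canonicalize(expr, classes):
    """Rewrite bottom-up, replacing each subexpression by the smallest member of its class."""
    kind, *args = expr
    if kind not in ("val", "var"):
        expr = (kind, *(canonicalize(a, classes) for a in args))
    for cls in classes:
        if expr in cls:
            return min(cls, key=order_key)
    return expr


a, b, x, y, z = (("var", n) for n in "abxyz")
zero = ("val", 0)
xy = ("prod", x, y)
constraint = ("eq", ("sum", xy, z), zero)        # x*y+z = 0

# x*y = a puts x*y and a in one class; a is the smaller element.
reduced_1 = canonicalize(constraint, [{xy, a}])
print(reduced_1)                                 # encodes a+z = 0

# x*y = a+b puts x*y and a+b in one class; a+b is the smaller element.
reduced_2 = canonicalize(constraint, [{xy, ("sum", a, b)}])
print(reduced_2)                                 # encodes (a+b)+z = 0
```

No auxiliary variable is introduced in either case, which is the sense in which this approach can avoid unnesting.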
Can an expert modeler avoid duplicate subexpressions?
My interpretation from [Gent-Miguel-Rendl]:
Yes. E.g., most duplicates come from quantifier unrolling.
[model formula not preserved] will introduce duplicate unrolled instantiations of [subexpression not preserved],
but the following equivalent model will not: [model formula not preserved]
My opinion: No, unless the expert generates the low-level constraints.
Expecting expert modelers to anticipate and fix this is the same as expecting them to do the transformation to low-level constraints.
E.g., CPMpy's uniform reification transforms
p <-> alldiff(x,y,z)
to the following, with common subexpressions:
p -> alldiff(x,y,z)
~p -> ~alldiff(x,y,z)