Common Subexpression Elimination

A primer

Jo Devriendt

 [Gent-Miguel-Rendl] Common Subexpression Elimination in Automated Constraint Modelling, 2008

Overview

Motivation

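[Figure: expression trees for the constraints a+x*y=0 and b+x*y=t]

[Figure: the same constraints with a separate auxiliary variable for each occurrence of x*y: a+z1=0, z1=x*y, b+z2=t, z2=x*y]

[Figure: the same constraints with one shared auxiliary variable for x*y: a+z1=0, b+z1=t, z1=x*y]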

Extra propagation: assume t=6 and b=0

Standard propagation routines can propagate z1=6 (from b+z1=t) and then a=-6 (from a+z1=0)

Not possible in two previous slides!
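
The propagation argument can be replayed with a minimal sketch. It assumes a toy propagator that only handles sum constraints u + v = w over fixed values and ignores the product constraints z = x*y altogether; the names `propagate`, `without_cse`, `with_cse` and the pseudo-variable `zero` are illustrative, not taken from any solver.

```python
def propagate(sum_constraints, fixed):
    """Apply u + v = w until fixpoint: once two values are known, derive the third."""
    fixed = dict(fixed)
    changed = True
    while changed:
        changed = False
        for u, v, w in sum_constraints:
            unknown = [n for n in (u, v, w) if n not in fixed]
            if len(unknown) != 1:
                continue  # nothing to derive yet, or everything already known
            if unknown[0] == w:
                fixed[w] = fixed[u] + fixed[v]
            elif unknown[0] == u:
                fixed[u] = fixed[w] - fixed[v]
            else:
                fixed[v] = fixed[w] - fixed[u]
            changed = True
    return fixed


# "zero" stands for the constant 0 on the right-hand side of a + z = 0.
known = {"zero": 0, "t": 6, "b": 0}

# Separate auxiliaries: a+z1=0 and b+z2=t share no variable.
without_cse = propagate([("a", "z1", "zero"), ("b", "z2", "t")], known)

# Shared auxiliary: a+z1=0 and b+z1=t are linked through z1.
with_cse = propagate([("a", "z1", "zero"), ("b", "z1", "t")], known)

print("a" in without_cse)  # a stays unknown
print(with_cse["a"])       # a is derived via z1
```

With separate auxiliaries the value of z2 never reaches the constraint on a; with the shared auxiliary, z1 carries it over.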


#403 in CPMpy

  • Works on previous example :)
  • Is the representation unique for non-equivalent expressions?
    • E.g., if a variable is named "1", does x >= 1 refer to the variable or to the value?
  • Is the representation canonical?
    • E.g., do x*y and y*x use the same reification var?
  • Global dictionary?
  • Incorporate non-reification auxiliaries? E.g., x xor y xor z linearizes to x+y+z = 2*aux+1, but aux is not a reification of x xor y xor z.
  • When creating a new reification var for expression E, check whether there already exists a var for E.
  • Use a dictionary from reification vars to E's string representation.
  • Always use original reification var
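
The lookup described in the last three bullets could look roughly like the sketch below. This is not CPMpy's code: the class `ReificationTable`, the method `get_var` and the `aux` naming scheme are hypothetical, and the table is keyed on the string representation (the lookup direction is an assumption).

```python
class ReificationTable:
    """Hypothetical table remembering which auxiliary variable reifies which expression."""

    def __init__(self):
        # string representation of E -> name of its reification variable
        self._var_for = {}

    def get_var(self, expr_str):
        # Reuse the original reification variable if E was seen before.
        if expr_str not in self._var_for:
            self._var_for[expr_str] = f"aux{len(self._var_for)}"
        return self._var_for[expr_str]


table = ReificationTable()
first = table.get_var("x*y")
second = table.get_var("x*y")   # same string, same variable
swapped = table.get_var("y*x")  # different string, new variable

print(first, second, swapped)
```

The last call illustrates the canonicity question raised above: a purely string-based key treats x*y and y*x as different expressions.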

ManyWorlds

[Figure: the expression trees of a+x*y=0 and b+x*y=t, first with a separate x*y subtree per constraint, then as a DAG in which both constraints share a single x*y node]

  • Define equality =, hash, comparison < on all expressions
    • originally based on string representation, but then became quadratic in memory (every expression keeps a string representing its full subtree)
    • based on subexpressions - fixes var "1" problem (uniqueness)
  • Expression factory has a global hash set of existing expressions. When creating a new expression, return equal one if exists.
    • expression tree is now a DAG
  • After creation, each expression has a unique ID (counter or ptr)
    • exploited during simplifications: (p & q) | (p & q) becomes p & q by a simple ID check

    • comparison < based on ID (no recursive compares needed)

  • All commutative expressions have subterms ordered by ID

    • fixes x+y vs y+x problem (canonicity)
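
A minimal sketch of such an expression factory follows. It is not the ManyWorlds implementation: the names `Expr`, `ExprFactory`, `apply`, and the operator sets are assumptions made for illustration.

```python
from dataclasses import dataclass

COMMUTATIVE = {"+", "*", "&", "|"}
IDEMPOTENT = {"&", "|"}


@dataclass(frozen=True, eq=False)  # immutable; equality and hash are by identity
class Expr:
    uid: int         # unique ID assigned at creation
    op: str          # "var", "val", or an operator symbol
    payload: object  # variable name or value; None for operators
    args: tuple      # subexpressions (shared, so the structure is a DAG)

    def __lt__(self, other):
        return self.uid < other.uid  # comparison by ID, no recursion


class ExprFactory:
    def __init__(self):
        self._table = {}  # structural key -> the one existing Expr

    def _intern(self, op, payload, args):
        key = (op, payload, tuple(a.uid for a in args))
        found = self._table.get(key)
        if found is None:
            found = Expr(len(self._table), op, payload, args)
            self._table[key] = found
        return found

    def var(self, name):
        return self._intern("var", name, ())

    def val(self, number):
        return self._intern("val", number, ())

    def apply(self, op, *args):
        if op in COMMUTATIVE:
            args = sorted(args)  # canonical argument order: by ID
        if op in IDEMPOTENT:
            # after sorting, identical subterms are adjacent: drop repeats by ID check
            args = [a for i, a in enumerate(args) if i == 0 or a is not args[i - 1]]
            if len(args) == 1:
                return args[0]
        return self._intern(op, None, tuple(args))


f = ExprFactory()
x, y, p, q = f.var("x"), f.var("y"), f.var("p"), f.var("q")
same_product = f.apply("*", x, y) is f.apply("*", y, x)
var_vs_val = f.var("1") is f.val(1)
pq = f.apply("&", p, q)
simplified = f.apply("|", pq, f.apply("&", q, p))
```

Here `same_product` is true (canonicity), `var_vs_val` is false (the variable named "1" and the value 1 have different structural keys), and `simplified` is the very same object as `pq`.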

ManyWorlds

Summarized: DAG of canonical expressions

  • ManyWorlds unnests as late as possible
  • When unnesting, check whether reification var is already introduced for this canonical expression

+

  • simple invariants
  • memory efficient
  • strong simplification

- 

  • memory inefficient
    (keep all expressions in memory at all times)
  • expressions need to be immutable

Essence

Core idea: keep track of sets of equivalent subexpressions, each with a canonical representative

Syntactical equivalence

  • easy to check
  • very limited

Semantical equivalence

  • very hard to check
  • very powerful

Easy semantical equivalences

  • simplification (evaluation): 1+x*y-1 is equivalent to x*y
  • ordering: x*y is equivalent to y*x
  • model equalities: a+b = x*y means a+b and x*y are equivalent
  • reifications (from flattening): c+x*y = 0 flattens to c+aux = 0 and aux = x*y

equivalent set:

{x*y, y*x, 1+x*y-1, a+b, aux}

Essence

Total order on expressions

  • order based on type
    • values < variables < equalities < ... < sums < ... < global constraints < ...
  • within same type, order recursively based on arguments
    • "x=3+z" < "x=z+3" < "3+z=x" < "z+3=x"

Smaller expressions are simpler!

Core idea: keep track of sets of equivalent subexpressions, each with a canonical representative

Replace all syntactical occurrences of members of an equivalent set with the smallest element in the set -> canonical representative

equivalent set:

{x*y, y*x, 1+x*y-1, a+b, aux}
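
The total order and the choice of representative can be sketched as a sort key. The encoding is an assumption: integers are values, strings are variables, tuples `(op, arg1, arg2)` are compound expressions. The slides only state values < variables < equalities < ... < sums; the ranks given to `-` and `*` below are made up for the example.

```python
# Type ranks: 0 = value, 1 = variable, then operators. Ranks of "-" and "*" are assumed.
OP_RANK = {"=": 2, "+": 3, "-": 4, "*": 5}


def order_key(e):
    """Sort key: first by expression type, then recursively by arguments."""
    if isinstance(e, int):
        return (0, e)
    if isinstance(e, str):
        return (1, e)
    op, *args = e
    return (OP_RANK[op], tuple(order_key(a) for a in args))


# The four equalities from the slide, deliberately shuffled.
equalities = [
    ("=", ("+", "z", 3), "x"),  # z+3=x
    ("=", "x", ("+", "z", 3)),  # x=z+3
    ("=", ("+", 3, "z"), "x"),  # 3+z=x
    ("=", "x", ("+", 3, "z")),  # x=3+z
]
ordered = sorted(equalities, key=order_key)

# The equivalent set {x*y, y*x, 1+x*y-1, a+b, aux}.
xy = ("*", "x", "y")
equivalent_set = [xy, ("*", "y", "x"), ("-", ("+", 1, xy), 1), ("+", "a", "b"), "aux"]
representative = min(equivalent_set, key=order_key)
```

Under these assumptions `ordered` reproduces x=3+z < x=z+3 < 3+z=x < z+3=x, and `representative` is the variable `aux`, since variables rank below every compound expression.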

Essence

Core idea: keep track of sets of equivalent subexpressions, each with a canonical representative

+ strong reduction in the number of variables and constraints
can avoid unnesting!

  • e.g., x*y = a and x*y+z = 0 reduces to a+z=0
  • e.g., x*y = a+b and x*y+z = 0 reduces to a+b+z=0

Replace all syntactical occurrences of members of an equivalent set with the canonical representative.

- more complex

  • e.g., what if a new canonical representative is found?
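
A rough sketch of the bookkeeping, and of why a new representative complicates it. Everything here is illustrative: the class `EquivalenceClasses`, the function `rewrite`, and the stand-in order `simple_key` (atoms before compound expressions, ties broken by `repr`) are assumptions, not the ordering or data structures of an actual system.

```python
def simple_key(e):
    # Stand-in order (assumption): atoms before compound expressions, then by repr.
    return (isinstance(e, tuple), repr(e))


class EquivalenceClasses:
    def __init__(self, key):
        self.key = key
        self.members = {}  # expression -> shared set of equivalent expressions

    def merge(self, e1, e2):
        s1 = self.members.setdefault(e1, {e1})
        s2 = self.members.setdefault(e2, {e2})
        if s1 is not s2:
            s1 |= s2
            for e in s2:
                self.members[e] = s1

    def representative(self, e):
        return min(self.members.get(e, {e}), key=self.key)


def rewrite(expr, classes):
    """Bottom-up: replace every subexpression by its canonical representative."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(rewrite(a, classes) for a in expr[1:])
    return classes.representative(expr)


classes = EquivalenceClasses(simple_key)
xy = ("*", "x", "y")
constraint = ("=", ("+", xy, "z"), 0)  # x*y + z = 0

classes.merge(xy, "aux")               # reification: aux = x*y
first = rewrite(constraint, classes)   # aux + z = 0

classes.merge(xy, "a")                 # model equality found later: x*y = a
second = rewrite(first, classes)       # the earlier result has to be rewritten again
```

After the second merge the representative of the class changes from `aux` to `a`, so the already rewritten constraint `first` is stale; this is the extra complexity the last bullet refers to.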

Can an expert modeler avoid duplicate subexpressions?

My interpretation from [Gent-Miguel-Rendl]:

Yes. E.g., most duplicates come from quantifier unrolling.

$$\exists x, y\colon \varphi(x) \wedge \psi(x,y)$$

will introduce duplicate unrolled instantiations of $\varphi(x)$, but the following equivalent model will not:

$$\exists x\colon \varphi(x) \wedge \exists y\colon \psi(x,y)$$
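
The effect of unrolling can be counted with a small sketch, assuming finite domains for x and y and representing instantiated subformulas as strings. The helper names `unroll_joint` and `unroll_nested` and the domains are hypothetical.

```python
from collections import Counter
from itertools import product


def unroll_joint(xs, ys):
    # exists x, y: phi(x) & psi(x, y)  ->  one conjunction per pair (x, y)
    return [(f"phi({x})", f"psi({x},{y})") for x, y in product(xs, ys)]


def unroll_nested(xs, ys):
    # exists x: phi(x) & exists y: psi(x, y)  ->  phi(x) is instantiated once per x
    return [(f"phi({x})", [f"psi({x},{y})" for y in ys]) for x in xs]


xs, ys = [1, 2], [1, 2, 3]
joint_counts = Counter(phi for phi, _ in unroll_joint(xs, ys))
nested_counts = Counter(phi for phi, _ in unroll_nested(xs, ys))

print(joint_counts)   # each phi(x) occurs once per value of y
print(nested_counts)  # each phi(x) occurs exactly once
```

In the joint form every instantiation of phi(x) is repeated once per value of y; in the nested form it appears only once.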


My opinion: No, unless the expert generates the low-level constraints themselves.

Expecting expert modelers to anticipate and fix this is the same as expecting them to do the transformation to low-level constraints.

E.g., CPMpy's uniform reification transforms

p <-> alldiff(x,y,z)

to the following, with common subexpressions,

p  -> alldiff(x,y,z)
~p -> ~alldiff(x,y,z)