Compiling Constraints: formalization brainstorm

Jo Devriendt

• Problem statement
• Theory: Term Rewrite Systems
• Practice: Tailor, ManyWorlds, CPMpy

Disclaimer: ManyWorlds is a project in development with Nonfiction Software. This presentation cannot be used by KU Leuven to claim copyright on the ManyWorlds project.

Problem statement

• IN: set of complex constraints
• described in some formal syntax
• OUT: set of simple constraints
• described by some solver input

Compilers!
[Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: principles, techniques, and tools. 1986.]

Theory: Term Rewrite Systems

Vocabulary for propositional terms

$$\text{constants } a,b,c,\ldots,t \\ \text{unary functor } \neg \\ \text{binary functors } \Rightarrow, \Leftrightarrow, \Leftarrow \\ \text{n-ary functors } \wedge, \vee$$

Example terms

$$a\\ a \Rightarrow b\\ (a \Rightarrow b) \wedge (a \vee \neg b)\\ (a \Rightarrow b) \wedge (a \vee \neg b) \wedge c \wedge d$$

Term Algebra

• Vocabulary: set of functors $$f$$ with a given arity $$k \in \mathbb{N}$$
• in CPMpy: variables, values, operators, constraints
• A term is recursively defined as an application of $$f$$ to $$k$$ terms
• so the base cases are functors with arity $$0$$, i.e., constants

A term is a recursive structure (a tree):

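$$(a \Rightarrow b) \wedge (a \vee \neg b)$$

[Figure: the term drawn as a tree. The root $$\wedge$$ has two children: $$\Rightarrow$$ with leaves $$a$$ and $$b$$, and $$\vee$$ with leaf $$a$$ and child $$\neg$$ over leaf $$b$$.]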
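A minimal sketch of a term as a recursive data structure, to make the tree view concrete. The class name `Term`, the functor names (`and`, `or`, `implies`, `not`) and the helper `depth` are illustrative assumptions, not part of CPMpy or any other system mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A functor applied to k subterms; k = 0 gives a constant."""
    functor: str
    args: tuple = ()

    def __str__(self):
        if not self.args:
            return self.functor
        return f"{self.functor}({', '.join(map(str, self.args))})"


def depth(t):
    """Height of the term tree; a constant has depth 1."""
    return 1 + max((depth(a) for a in t.args), default=0)


a, b = Term("a"), Term("b")
# (a => b) /\ (a \/ ~b)
example = Term("and", (Term("implies", (a, b)),
                       Term("or", (a, Term("not", (b,))))))
print(example)         # and(implies(a, b), or(a, not(b)))
print(depth(example))  # 4
```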

Theory: Term Rewrite Systems

\begin{aligned} x \Leftrightarrow y &\rightarrow (x \Rightarrow y) \wedge (x \Leftarrow y)\\ x \Rightarrow y &\rightarrow \neg x \vee y\\ x \vee (y \vee z) &\rightarrow x \vee y \vee z\\ \neg \neg x &\rightarrow x\\ \ldots\\ x \vee \ldots \vee (y \wedge \ldots \wedge z) &\rightarrow (x \vee \ldots \vee y) \wedge \ldots \wedge (x \vee \ldots \vee z)\\ \end{aligned}

Rewrite Rules

• A rewrite rule is a pair of terms $$t_1 \rightarrow t_2$$
• Terms in $$t_1, t_2$$ can contain placeholder variables

Theory: Term Rewrite Systems

Term: $$(a \wedge b) \Rightarrow (c\wedge d)$$
Rule: $$\dfrac{x \Rightarrow y}{\neg x \vee y}$$
Substitution: $$x \mapsto (a \wedge b),\ y \mapsto (c \wedge d)$$
Result: $$\neg(a \wedge b) \vee (c\wedge d)$$
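The rule application above can be sketched in a few lines of Python. This is an illustration under assumptions: terms are nested tuples `(functor, arg1, ...)`, constants are plain strings, placeholder variables are strings starting with `?`, and the names `match`, `substitute` and `apply_rule` are hypothetical helpers. Only the root of the term is matched here.

```python
def match(pattern, term, subst=None):
    """Return a substitution that makes pattern equal to term, or None."""
    subst = dict(subst or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        # Placeholder: bind it, or check consistency with an earlier binding.
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(pattern, subst):
    """Replace every placeholder in pattern by its bound subterm."""
    if isinstance(pattern, str):
        return subst.get(pattern, pattern)
    return (pattern[0],) + tuple(substitute(p, subst) for p in pattern[1:])


def apply_rule(rule, term):
    """Apply rule = (lhs, rhs) at the root; return term unchanged if no match."""
    lhs, rhs = rule
    subst = match(lhs, term)
    return term if subst is None else substitute(rhs, subst)


IMPLIES = (("implies", "?x", "?y"), ("or", ("not", "?x"), "?y"))
term = ("implies", ("and", "a", "b"), ("and", "c", "d"))
result = apply_rule(IMPLIES, term)
print(result)  # ('or', ('not', ('and', 'a', 'b')), ('and', 'c', 'd'))
```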

• Applying rewrite rules transforms the term into another one
• requires substitution of the placeholder variables with subterms

• The term is in normal form if no rewrite rule applies
• Termination is guaranteed by defining a reduction order and showing that rules strictly simplify
• terms without $$\Leftrightarrow$$ are simpler
• terms without $$\Rightarrow,\Leftarrow$$ are simpler
• terms with pushed negations are simpler
• CNF terms are simplest
• requires rule applying distributivity of $$\vee$$ over $$\wedge$$
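As a hedged illustration of rewriting to normal form, the sketch below hard-codes three of the rules ($$\Leftrightarrow$$ elimination, $$\Rightarrow$$ elimination, double negation) and applies them until none matches. The tuple representation and the names `step`, `normalize` and `is_normal` are assumptions; $$x \Leftarrow y$$ is written as `implies(y, x)` to keep the rule set small, and negation pushing and distributivity are left out.

```python
def step(t):
    """Apply one rule at the root if possible, else return t unchanged."""
    if isinstance(t, str):
        return t
    op, *args = t
    if op == "iff":
        x, y = args
        return ("and", ("implies", x, y), ("implies", y, x))
    if op == "implies":
        x, y = args
        return ("or", ("not", x), y)
    if op == "not" and not isinstance(args[0], str) and args[0][0] == "not":
        return args[0][1]  # double negation
    return t


def normalize(t):
    """Rewrite the subterms first, then the root, until no rule applies."""
    if not isinstance(t, str):
        t = (t[0],) + tuple(normalize(a) for a in t[1:])
    new = step(t)
    return t if new == t else normalize(new)


def is_normal(t):
    """A term is in normal form if rewriting leaves it unchanged."""
    return normalize(t) == t


print(normalize(("iff", "a", "b")))
# ('and', ('or', ('not', 'a'), 'b'), ('or', ('not', 'b'), 'a'))
```

Each rule here removes an operator that its output never reintroduces ($$\Leftrightarrow$$, then $$\Rightarrow$$, then a double negation), which is the informal reason this small rule set terminates.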

Theory: Term Rewrite Systems

CPMpy can probably be elegantly modeled as a term rewrite system

• Caveat 1: some rules will introduce auxiliary terms
• separate term trees which will need to be rewritten later
• as long as these are strictly simpler than the input term, this should be ok
E.g., to avoid distributivity blowup, use Tseitinization: replace the conjunction by a fresh variable $$\color{red}{s}$$ and define $$\color{red}{s} \Leftrightarrow (d \wedge e \wedge f)$$ with auxiliary clauses.

$$a \vee b \vee c \vee (d\wedge e \wedge f) \rightarrow a \vee b \vee c \vee \color{red}{s}$$

$$\neg \color{red}{s} \vee d \\ \neg \color{red}{s} \vee e \\ \neg \color{red}{s} \vee f \\ \color{red}{s} \vee \neg d \vee \neg e \vee \neg f$$
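A small sketch of the auxiliary clauses that a Tseitin step produces for a conjunction, assuming literals are strings with a leading `-` for negation. The function names `neg` and `tseitin_conjunction` are hypothetical; the clauses encode the full equivalence $$s \Leftrightarrow (d \wedge e \wedge f)$$.

```python
def neg(lit):
    """Negate a literal written as 'x' or '-x'."""
    return lit[1:] if lit.startswith("-") else "-" + lit


def tseitin_conjunction(conj, fresh):
    """Clauses defining fresh <=> (l1 /\\ ... /\\ ln)."""
    clauses = [[neg(fresh), lit] for lit in conj]         # fresh => l_i
    clauses.append([fresh] + [neg(lit) for lit in conj])  # all l_i => fresh
    return clauses


# a \/ b \/ c \/ (d /\ e /\ f)  becomes  a \/ b \/ c \/ s  plus definitions of s
main_clause = ["a", "b", "c", "s"]
definitions = tseitin_conjunction(["d", "e", "f"], "s")
print(definitions)
# [['-s', 'd'], ['-s', 'e'], ['-s', 'f'], ['s', '-d', '-e', '-f']]
```

The number of clauses grows linearly with the size of the conjunction, whereas distributing $$\vee$$ over $$\wedge$$ multiplies clause counts.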

• Caveat 2: different solvers have different built-in constraints, which need to fall through unchanged
• e.g., min/max/abs for linearize
• conditional rules?
• different sets of rules for different solvers?

I don't know of any constraint compilation systems (SMT, ASP, CP, IDP, ...) that formally describe their compilation step using term rewriting terminology.

I'm not the only one.

They may still exist though.

Theory: Term Rewrite Systems

Edit: yes, MiniZinc's predecessor Cadmium!

Practice: Tailor

3.3.1 Preprocessing of syntax tree

• Normalisation: reduces equivalent representations to one unique representation - fewer cases for later transformations
• e.g., implies
• comprises expression evaluation (3+5+x -> 8+x), pushing of negations, flipping of comparisons
• Type adaptation
• "The types used in a problem instance must be adapted to the solver's repertory." <- ?
• comprises conversion of sparse domains to bound domains
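The expression-evaluation part of normalisation (3+5+x -> 8+x) can be sketched as constant folding over an n-ary sum. This is an illustration only, not Tailor's implementation; the function name `fold_sum` and the tuple representation `("sum", ...)` are assumptions.

```python
def fold_sum(args):
    """Merge the integer constants of an n-ary sum into one leading constant."""
    const = sum(a for a in args if isinstance(a, int))
    rest = [a for a in args if not isinstance(a, int)]
    if not rest:
        return const                  # the whole sum was constant
    if const == 0:
        # Drop a zero constant; a single remaining argument needs no sum.
        return rest[0] if len(rest) == 1 else ("sum", *rest)
    return ("sum", const, *rest)


print(fold_sum([3, 5, "x"]))  # ('sum', 8, 'x')
```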

Practice: Tailor

3.3.2 Flattening

"prior to flattening, every expression tree has been preprocessed such that its tree structure conforms to the propagators provided by solver S. [...] for every node N in E, there exists a propagator in solver S that corresponds to operation N [...] this preprocessing procedure can be embedded into flattening"

3.3.3 Solver Profiles

"a solver profile [...] captures important features of a particular solver. [...] an expression is only flattened, if the target solver does not support it."

• First translates, then flattens?
• CPMpy first flattens, then translates to solver

Practice: ManyWorlds

1. Normalize
2. (Instantiate quantifications)
3. Simplify known terms
4. Push unary (negation, minus)
5. Merge n-ary terms
6. Flatten to tree of linear inequalities over int
7. Push minus again
8. Merge again
9. Linearize via big M.
• Also flattens as late as possible
• Any expression generated by any step can be compiled with later steps only <- strict reduction order
• Input language has 3 strictly separated primitive types
• bool, int, string
• Contains variables, values, operators, functors
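To illustrate the final "linearize via big M" step, here is a hedged sketch for one reified constraint shape, $$b \Rightarrow \sum_i c_i x_i \geq k$$ with bounded integer variables. It is not ManyWorlds code; the function name `big_m_geq` and the choice of the smallest valid $$M$$ are assumptions.

```python
def big_m_geq(coefs, bounds, k):
    """Linearize  b => sum(c_i * x_i) >= k  as  sum(c_i * x_i) - M*b >= k - M.

    coefs:  list of integer coefficients c_i
    bounds: list of (lo, hi) pairs, one per variable x_i
    Returns the coefficients (the last one belongs to b) and the right-hand side.
    """
    # Smallest value the sum can take, given the variable bounds.
    lowest = sum(c * (lo if c >= 0 else hi) for c, (lo, hi) in zip(coefs, bounds))
    M = max(0, k - lowest)
    return coefs + [-M], k - M


lin_coefs, rhs = big_m_geq([2, -1], [(0, 3), (0, 4)], 3)
print(lin_coefs, rhs)  # [2, -1, -7] -4
```

With $$b = 1$$ the $$M$$ terms cancel and the original inequality remains; with $$b = 0$$ the right-hand side drops to the lowest value the sum can take, so the constraint is trivially satisfied.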

Practice: CPMpy

Transformation pipelines (diagram), one for Exact and one for SAT, built from:

• flatten_constraint
• reify_rewrite
• only_bv_implies
• linearize_constraint
• only_numexpr_equality
• only_positive_bv
• only_const_rhs
• only_var_lhs
• to_cnf
• flat2cnf

Possible formalization approach:

• represent each transformation as a set of rewrite rules
• characterize terms by complexity
• show that each rewrite rule outputs terms of lesser complexity
• apply different rules for different solvers (e.g., do not apply transformation of min/max/abs when a solver supports it)
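One way to picture solver-dependent rule sets is a table of decompositions that is consulted only for operators the target solver lacks. The sketch below is hypothetical: `DECOMPOSITIONS`, `compile_term`, and the functors `ite` and `neg` are illustrative names, and this is not how CPMpy selects its transformations.

```python
def decompose_abs(x):
    # abs(x) = max(x, -x)
    return ("max", x, ("neg", x))


def decompose_max(x, y):
    # max(x, y) = if x >= y then x else y
    return ("ite", (">=", x, y), x, y)


DECOMPOSITIONS = {"abs": decompose_abs, "max": decompose_max}


def compile_term(t, supported):
    """Decompose only those operators that the solver does not support."""
    if isinstance(t, str):
        return t
    op, *args = t
    args = [compile_term(a, supported) for a in args]
    if op in DECOMPOSITIONS and op not in supported:
        return compile_term(DECOMPOSITIONS[op](*args), supported)
    return (op, *args)


print(compile_term(("abs", "x"), {"abs"}))  # ('abs', 'x')
print(compile_term(("abs", "x"), {"max"}))  # ('max', 'x', ('neg', 'x'))
```

Termination of this sketch follows from the order abs > max > ite: each decomposition only produces operators that are lower in that order.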

Disadvantage:
Rewrite systems do not formalize when to match a rule - i.e., what input a transformation expects. They just check whether a rule matches.

Advantages:

• clear description of transformations
• modular (!)

Practice: CPMpy

only_bv_implies:
$$\dfrac{\sum_i y_i \geq z \Rightarrow x^{bool} }{ \neg x^{bool} \Rightarrow \sum_i y_i < z}$$

linearize_constraint:
$$\dfrac{\text{alldiff}(x,y,z)}{ x\neq y \wedge y\neq z \wedge x \neq z }$$
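A pairwise decomposition of alldiff needs one disequality per unordered pair of variables, which is easy to get wrong by hand. A minimal sketch, assuming tuples `("!=", x, y)` for disequalities and a hypothetical function name `decompose_alldiff`; CPMpy's own linearization may use a different encoding.

```python
from itertools import combinations


def decompose_alldiff(variables):
    """One disequality per unordered pair of variables."""
    return [("!=", x, y) for x, y in combinations(variables, 2)]


print(decompose_alldiff(["x", "y", "z"]))
# [('!=', 'x', 'y'), ('!=', 'x', 'z'), ('!=', 'y', 'z')]
```

For $$n$$ variables this yields $$n(n-1)/2$$ disequalities.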