# Compiling Constraints: formalization brainstorm

Jo Devriendt

• Problem statement
• Theory: Term Rewrite Systems
• Practice: Tailor, ManyWorlds, CPMpy

Disclaimer: ManyWorlds is a project in development with Nonfiction Software. This presentation cannot be used by KU Leuven to claim copyright on the ManyWorlds project.

## Problem statement

• IN: set of complex constraints
• described in some formal syntax
• OUT: set of simple constraints
• described by some solver input

Compilers!
[Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: principles, techniques, and tools. 1986.]

## Theory: Term Rewrite Systems

Vocabulary for propositional terms

$$\begin{array}{l} \text{constants } a,b,c,\ldots,t \\ \text{unary functor } \neg \\ \text{binary functors } \Rightarrow, \Leftrightarrow, \Leftarrow \\ \text{n-ary functors } \wedge, \vee \end{array}$$

Example terms

$$\begin{array}{l} a\\ a \Rightarrow b\\ (a \Rightarrow b) \wedge (a \vee \neg b)\\ (a \Rightarrow b) \wedge (a \vee \neg b) \wedge c \wedge d \end{array}$$

Term Algebra

• Vocabulary: set of functors $$f$$ with a given arity $$k \in \mathbb{N}$$
• in CPMpy: variables, values, operators, constraints
• A term is recursively defined as an application of $$f$$ to $$k$$ terms
• so the base cases are functors with arity $$0$$, i.e., constants

## Theory: Term Rewrite Systems

A term is a recursive structure (a tree):

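$$(a \Rightarrow b) \wedge (a \vee \neg b)$$

[Figure: term tree with root $$\wedge$$; its children are $$\Rightarrow$$ (with children $$a$$ and $$b$$) and $$\vee$$ (with children $$a$$ and $$\neg$$, whose child is $$b$$).]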
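
The tree view of a term can be sketched in a few lines of Python. This is only an illustration: the `Term` class and the functor names (`"and"`, `"implies"`, ...) are made up for this example and are not CPMpy's expression classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A functor applied to len(args) subterms; constants have no arguments."""
    functor: str
    args: tuple = ()

    def show(self, indent=0):
        """Return the tree as a list of indented lines, one per node."""
        lines = ["  " * indent + self.functor]
        for arg in self.args:
            lines.extend(arg.show(indent + 1))
        return lines


a, b = Term("a"), Term("b")
# (a => b) /\ (a \/ ~b)
term = Term("and", (Term("implies", (a, b)), Term("or", (a, Term("not", (b,))))))
print("\n".join(term.show()))
```

Printing the term gives one node per line, with children indented below their parent.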

## Theory: Term Rewrite Systems

$$\begin{aligned} x \Leftrightarrow y &\rightarrow (x \Rightarrow y) \wedge (x \Leftarrow y)\\ x \Rightarrow y &\rightarrow \neg x \vee y\\ x \vee (y \vee z) &\rightarrow x \vee y \vee z\\ \neg \neg x &\rightarrow x\\ \ldots\\ x \vee \ldots \vee (y \wedge \ldots \wedge z) &\rightarrow (x \vee \ldots \vee y) \wedge \ldots \wedge (x \vee \ldots \vee z) \end{aligned}$$

Rewrite Rules

• A rewrite rule is a pair of terms $$t_1 \rightarrow t_2$$
• Terms in $$t_1, t_2$$ can contain placeholder variables

## Theory: Term Rewrite Systems

Term: $$(a \wedge b) \Rightarrow (c\wedge d)$$

Rule: $$\dfrac{x \Rightarrow y}{\neg x \vee y}$$

Substitution: $$x \mapsto (a \wedge b),\ y \mapsto (c \wedge d)$$

Result: $$\neg(a \wedge b) \vee (c\wedge d)$$

• Applying rewrite rules transforms the term into another one
• requires substitution of the placeholder variables with subterms
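
Applying a rule amounts to matching its left-hand side against a term and then substituting into its right-hand side. The sketch below assumes terms are nested tuples `(functor, arg, ...)`, constants are plain strings, and placeholder variables are strings starting with `?`; these conventions are chosen for the example only.

```python
def is_placeholder(t):
    return isinstance(t, str) and t.startswith("?")


def match(pattern, term, subst):
    """Extend subst so that pattern instantiated with subst equals term.

    Returns the extended substitution, or None if there is no match.
    """
    if is_placeholder(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(term, subst):
    """Replace every placeholder in term by the subterm it is bound to."""
    if is_placeholder(term):
        return subst[term]
    if isinstance(term, str):
        return term
    return (term[0],) + tuple(substitute(arg, subst) for arg in term[1:])


# Rule: x => y  ->  ~x \/ y
lhs, rhs = ("implies", "?x", "?y"), ("or", ("not", "?x"), "?y")
term = ("implies", ("and", "a", "b"), ("and", "c", "d"))

subst = match(lhs, term, {})
result = substitute(rhs, subst)
print(subst)
print(result)
```

Here `subst` binds `?x` to `a ∧ b` and `?y` to `c ∧ d`, and `result` is the term `¬(a ∧ b) ∨ (c ∧ d)`.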

## Theory: Term Rewrite Systems

• The term is in normal form if no rewrite rule applies
• Termination is guaranteed by defining a reduction order and showing that rules strictly simplify
• terms without $$\Leftrightarrow$$ are simpler
• terms without $$\Rightarrow,\Leftarrow$$ are simpler
• terms with pushed negations are simpler
• CNF terms are simplest
• requires rule applying distributivity of $$\vee$$ over $$\wedge$$
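
One way to make "strictly simpler" concrete is a weight function that every rule application decreases. The sketch below is an illustration under several assumptions: terms are nested tuples, placeholders start with `?`, $$\wedge$$ and $$\vee$$ are binary, $$x \Leftarrow y$$ is written as `implies(y, x)`, and the distributivity rule is left out. The weights are one possible choice, not a canonical one.

```python
def is_placeholder(t):
    return isinstance(t, str) and t.startswith("?")


def match(pattern, term, subst):
    """Return subst extended so that pattern matches term, or None."""
    if is_placeholder(pattern):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(term, subst):
    if is_placeholder(term):
        return subst[term]
    if isinstance(term, str):
        return term
    return (term[0],) + tuple(substitute(arg, subst) for arg in term[1:])


RULES = [
    (("iff", "?x", "?y"), ("and", ("implies", "?x", "?y"), ("implies", "?y", "?x"))),
    (("implies", "?x", "?y"), ("or", ("not", "?x"), "?y")),
    (("not", ("not", "?x")), "?x"),
    (("not", ("and", "?x", "?y")), ("or", ("not", "?x"), ("not", "?y"))),
    (("not", ("or", "?x", "?y")), ("and", ("not", "?x"), ("not", "?y"))),
]


def rewrite_once(term):
    """Apply one rule at the root or, failing that, in the leftmost argument
    that can be rewritten. Returns None if the term is in normal form."""
    for lhs, rhs in RULES:
        subst = match(lhs, term, {})
        if subst is not None:
            return substitute(rhs, subst)
    if isinstance(term, str):
        return None
    for i in range(1, len(term)):
        new_arg = rewrite_once(term[i])
        if new_arg is not None:
            return term[:i] + (new_arg,) + term[i + 1:]
    return None


def weight(term):
    """Complexity measure that each rule in RULES strictly decreases."""
    if isinstance(term, str):
        return 1
    w = [weight(arg) for arg in term[1:]]
    if term[0] == "not":
        return 2 * w[0]
    if term[0] in ("and", "or"):
        return w[0] + w[1] + 1
    if term[0] == "implies":
        return 2 * w[0] + w[1] + 2
    if term[0] == "iff":
        return 3 * w[0] + 3 * w[1] + 6
    raise ValueError(f"unknown functor {term[0]!r}")


def normalize(term):
    """Rewrite until no rule applies, checking the weight drops at each step."""
    while True:
        new = rewrite_once(term)
        if new is None:
            return term
        assert weight(new) < weight(term)
        term = new


normal_form = normalize(("iff", "a", ("not", ("not", "b"))))
print(normal_form)
```

Because the weight is a positive integer that decreases with every step, the loop in `normalize` cannot run forever on these rules.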

## Theory: Term Rewrite Systems

CPMpy can probably be elegantly modeled as a term rewrite system

• Caveat 1: some rules will introduce auxiliary terms
• separate term trees which will need to be rewritten later
• as long as these are strictly simpler than the input term, this should be ok
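
E.g., to avoid distributivity blowup, use Tseitinization:

$$a \vee b \vee c \vee (d\wedge e \wedge f)$$

becomes

$$a \vee b \vee c \vee \color{red}{s}$$

together with the auxiliary clauses that define $$\color{red}{s} \color{black} \Leftrightarrow (d \wedge e \wedge f)$$:

$$\begin{array}{l} \neg \color{red}s \color{black} \vee d \\ \neg \color{red}s \color{black} \vee e \\ \neg \color{red}s \color{black} \vee f \\ \color{red}s \color{black} \vee \neg d \vee \neg e \vee \neg f \end{array}$$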
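
A minimal sketch of this Tseitin step, assuming literals are strings with a leading `-` for negation and nested conjunctions are tuples `("and", lit, ...)`. The helper names and the `s1, s2, ...` naming of auxiliary variables are invented for the example.

```python
from itertools import count

_fresh = count(1)  # source of fresh auxiliary variable names s1, s2, ...


def neg(lit):
    """Negate a literal written as 'x' or '-x'."""
    return lit[1:] if lit.startswith("-") else "-" + lit


def tseitin_and(conjuncts):
    """Return (s, clauses) where the clauses encode s <-> (c1 /\ ... /\ cn)."""
    s = f"s{next(_fresh)}"
    clauses = [[neg(s), c] for c in conjuncts]          # s -> c_i
    clauses.append([s] + [neg(c) for c in conjuncts])   # (c1 /\ ... /\ cn) -> s
    return s, clauses


def replace_conjunctions(disjuncts):
    """Turn a disjunction with nested conjunctions into a list of clauses."""
    top, auxiliary = [], []
    for d in disjuncts:
        if isinstance(d, tuple):  # ("and", lit, lit, ...)
            s, clauses = tseitin_and(list(d[1:]))
            top.append(s)
            auxiliary.extend(clauses)
        else:
            top.append(d)
    return [top] + auxiliary


cnf = replace_conjunctions(["a", "b", "c", ("and", "d", "e", "f")])
for clause in cnf:
    print(clause)
```

The result has five clauses instead of the three ternary-plus clauses that distributivity would produce here, and the gap widens quickly as more conjunctions are nested.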

## Theory: Term Rewrite Systems

• Caveat 2: different solvers have different built-in constraints, which need to fall through
• e.g., min/max/abs for linearize
• conditional rules?
• different sets of rules for different solvers?
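
One possible reading of "conditional rules" is a rule that only fires when the target solver lacks the functor. The sketch below is hypothetical throughout: the `supported` sets, the `DECOMPOSITIONS` table and the functor names are invented for illustration and do not reflect CPMpy's solver interfaces.

```python
def decompose_abs(term):
    """abs(x) -> max(x, -x)"""
    (x,) = term[1:]
    return ("max", x, ("neg", x))


# Hypothetical table: functor -> rule that eliminates it.
DECOMPOSITIONS = {"abs": decompose_abs}


def compile_for(term, supported):
    """Rewrite bottom-up, letting functors the solver supports fall through."""
    if isinstance(term, str):
        return term
    term = (term[0],) + tuple(compile_for(arg, supported) for arg in term[1:])
    rule = DECOMPOSITIONS.get(term[0])
    if rule is not None and term[0] not in supported:
        return compile_for(rule(term), supported)
    return term


constraint = ("le", ("abs", "x"), "y")
with_abs = compile_for(constraint, supported={"le", "abs", "max", "neg"})
without_abs = compile_for(constraint, supported={"le", "max", "neg"})
print(with_abs)
print(without_abs)
```

With this design the rule set stays the same for every solver, and only the guard differs.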

I don't know of any constraint compilation systems (SMT, ASP, CP, IDP, ...) that formally describe their compilation step using term rewriting terminology.

I'm not the only one.

They may still exist though.

## Practice: Tailor

3.3.1 Preprocessing of syntax tree

• Normalisation: reduces equivalent representations to one unique representation - fewer cases for later transformations
• e.g., implies
• comprises expression evaluation (3+5+x -> 8+x), pushing of negations, flipping of comparisons
• "The types used in a problem instance must be adapted to the solver’s repertory." <- ?
• comprises conversion of sparse domains to bound domains
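
Two of these normalisation steps are easy to sketch. The code below is not Tailor's implementation: the list-of-summands representation and the choice to flip `>` and `>=` into `<` and `<=` are assumptions made for the example.

```python
def fold_sum(args):
    """Evaluate the constant part of an n-ary sum, e.g. [3, 5, 'x'] -> [8, 'x']."""
    const = sum(a for a in args if isinstance(a, int))
    rest = [a for a in args if not isinstance(a, int)]
    # Drop a zero constant unless it is the only summand left.
    return ([const] if const != 0 or not rest else []) + rest


# Assumed canonical direction: keep < and <=, flip > and >=.
FLIP = {">": "<", ">=": "<="}


def flip_comparison(op, lhs, rhs):
    """Rewrite 'lhs > rhs' as 'rhs < lhs' so later steps see fewer cases."""
    if op in FLIP:
        return (FLIP[op], rhs, lhs)
    return (op, lhs, rhs)


print(fold_sum([3, 5, "x"]))
print(flip_comparison(">=", "x", "y"))
```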

## Practice: Tailor

3.3.2 Flattening

"prior to flattening, every expression tree has been preprocessed such that its tree structure conforms to the propagators provided by solver S. [...] for every node N in E, there exists a propagator in solver S that corresponds to operation N [...] this preprocessing procedure can be embedded into flattening"

3.3.3 Solver Profiles

"a solver profile [...] captures important features of a particular solver. [...] an expression is only flattened, if the target solver does not support it."

• First translates, then flattens?
• CPMpy first flattens, then translates to solver

## Practice: ManyWorlds

1. Normalize
2. (Instantiate quantifications)
3. Simplify known terms
4. Push unary (negation, minus)
5. Merge n-ary terms
6. Flatten to tree of linear inequalities over int
7. Push minus again
8. Merge again
9. Linearize via big M.
• Also flattens as late as possible
• Any expression generated by any step can be compiled with later steps only <- strict reduction order
• Input language has 3 strictly separated primitive types
• bool, int, string
• Contains variables, values, operators, functors
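
For the last step, a textbook big-M linearization of an implication $$b \Rightarrow \sum_i c_i x_i \geq r$$ with a 0/1 variable $$b$$ could look as follows. This is a generic sketch, not ManyWorlds' actual code; the function name and argument layout are assumptions.

```python
def big_m_linearize(coefs, bounds, rhs):
    """Linearize  b -> sum(c_i * x_i) >= rhs  for a 0/1 variable b.

    bounds[i] is the (lo, hi) domain of x_i. Returns (coefs, m, new_rhs),
    to be read as the single inequality
        sum(c_i * x_i) - m * b >= new_rhs
    """
    # Smallest value the left-hand side can take over the variable domains.
    min_lhs = sum(c * (lo if c >= 0 else hi) for c, (lo, hi) in zip(coefs, bounds))
    # Smallest M that makes the inequality trivially true when b = 0.
    m = max(0, rhs - min_lhs)
    return coefs, m, rhs - m


# b -> 2x - y >= 4 with x, y in 0..3
coefs, m, new_rhs = big_m_linearize([2, -1], [(0, 3), (0, 3)], 4)
print(coefs, m, new_rhs)
```

Choosing M from the variable bounds rather than as a large fixed constant keeps the linear relaxation as tight as this encoding allows.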

## Practice: CPMpy

[Figure: CPMpy transformation pipelines for Exact and for SAT, composed of the transformations flatten_constraint, reify_rewrite, only_bv_implies, linearize_constraint, only_numexpr_equality, only_positive_bv, only_const_rhs, only_var_lhs, to_cnf, and flat2cnf.]

Possible formalization approach:

• represent each transformation as a set of rewrite rules
• characterize terms by complexity
• show that each rewrite rule outputs terms of lesser complexity
• apply different rules for different solvers (e.g., do not apply transformation of min/max/abs when a solver supports it)

## Practice: CPMpy

Rewrite systems do not formalize when to match a rule - i.e., what input a transformation expects. They just check whether a rule matches.

• clear description of transformations
• modular (!)

## Practice: CPMpy

only_bv_implies:

$$\dfrac{\sum_i y_i \geq z \Rightarrow x^{bool} }{ \neg x^{bool} \Rightarrow \sum_i y_i < z}$$

linearize_constraint:

$$\dfrac{\text{alldiff}(x,y,z)}{ x\neq y \wedge y\neq z \wedge x \neq z }$$
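
The alldiff rule generalizes to any number of arguments as one disequality per unordered pair. The sketch below shows only this plain pairwise decomposition, with constraints written as tuples; it is not the code CPMpy uses.

```python
from itertools import combinations


def decompose_alldiff(variables):
    """alldiff(v1, ..., vn) -> conjunction of vi != vj for all i < j."""
    return [(x, "!=", y) for x, y in combinations(variables, 2)]


pairs = decompose_alldiff(["x", "y", "z"])
print(pairs)
```

For n arguments this yields n(n-1)/2 disequalities, so the rule only pays off when the solver has no native alldiff.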
