Data-Driven Equivalence Checking

The first equivalence checker for x86 assembly that can handle loops

Authors: Stanford University

Presenter: Xingyu Xie

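# Contents

1. Worked Example: an informal walk through the key points of the algorithm
2. Algorithm: the details of the algorithm
3. Implementation & Experiment: implementation details and experimental results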

# Worked Example

# Proof Goal

Verification goal:

If

T and R start from the same initial state
T terminates

then

R also terminates
R and T return in the same state

# Cutpoint

Cutpoint

a pair of program points
chosen to divide the loops into loop-free segments
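a, b, and c in the figure on the left
T and R move together from one cutpoint to the next
an invariant holds at every cutpoint
- at point a, T and R have the same initial state
- at point c, T and R have the same return value (eax)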

# Proof Obligation

Proof obligation

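If T and R start from the same initial state and go directly to c, both terminate and return the same value.
If T and R start from the same initial state and go to b, they satisfy an invariant I.
If T and R start at b with I satisfied and return to b, then I still holds.
If T and R start at b with I satisfied and reach c, both terminate and return the same value.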
not enough...

# Code Paths Correspondence

Code Paths Correspondence

code path: a sequence of instructions from one cutpoint to another that does not pass through any other cutpoint
A code path Pa of T corresponds to a code path Pb of R if both start and end at the same cutpoints and, whenever T executes Pa, R executes Pb

# 2 Crucial Questions

2 Crucial Questions:

Correspondences between code paths
Invariant inference

Solution: data-driven!

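Correspondences: analyze the execution traces of both programs on the same test cases
Invariant: a conjunction of equalities over the live registers of the two programs, inferred from the execution snapshots collected on the test set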

# Invariant Inference

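A matrix recording the values of the live registers:

each row corresponds to a test case
each column corresponds to a register

Equalities that can be discovered from it: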

eax = eax'
5 * esi = ecx'
edi = edx'
ecx = esi'
...

Spurious invariants may be discovered, so every candidate is sent to the SMT solver to be checked.

# Algorithm

1. Generate cutpoints & corresponding paths
2. Generate Invariants
3. Checking Proof Obligations

# Cutpoint

Generate Cutpoints

cutpoint: a pair of program points

generation: data-driven

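Requirements:
- the numbers of times the two program points are visited are linearly related
- across the test cases, the heap differs in only a constant number of locations
step 1: choose cutpoints in decreasing order of the number of values on which the two memory states agree, until both programs are loop-free.
step 2: if a cutpoint can be removed while both programs stay loop-free, remove that redundant cutpoint.
repeat: if there are several minimal cutpoint sets to choose from, we try each of them.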
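The two selection steps can be sketched in a few lines. This is only an illustration under simplifying assumptions: it works on a single control-flow graph instead of a pair of programs, the `score` table stands in for "number of memory values on which the two states agree", and the names `is_loop_free` and `choose_cutpoints` are made up for this sketch. The final "try every minimal set" step is not shown.

```python
from typing import Dict, List, Set

Graph = Dict[str, List[str]]  # node -> successors


def is_loop_free(cfg: Graph, cut: Set[str]) -> bool:
    """True if every cycle of cfg passes through at least one node in cut."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color: Dict[str, int] = {}

    def dfs(node: str) -> bool:
        color[node] = GRAY
        for succ in cfg.get(node, []):
            if succ in cut:
                continue  # a cutpoint ends the current segment
            state = color.get(succ, WHITE)
            if state == GRAY:
                return False  # back edge: a cycle avoiding all cutpoints
            if state == WHITE and not dfs(succ):
                return False
        color[node] = BLACK
        return True

    return all(
        dfs(n) for n in cfg if n not in cut and color.get(n, WHITE) == WHITE
    )


def choose_cutpoints(cfg: Graph, score: Dict[str, int]) -> Set[str]:
    """Greedy selection (step 1) followed by pruning (step 2)."""
    cut: Set[str] = set()
    # Step 1: add candidates, best score first, until no loop is left.
    for node in sorted(score, key=score.get, reverse=True):
        if is_loop_free(cfg, cut):
            break
        cut.add(node)
    if not is_loop_free(cfg, cut):
        raise ValueError("candidates do not break every loop")
    # Step 2: drop any cutpoint whose removal keeps the graph loop-free.
    for node in sorted(cut, key=score.get):
        if is_loop_free(cfg, cut - {node}):
            cut.remove(node)
    return cut


cfg = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}
print(choose_cutpoints(cfg, {"entry": 10, "head": 5}))
```

In this toy graph `entry` is picked first because of its higher score, but it does not break the loop, so step 2 removes it again and only `head` remains.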

# Correspondence

Generate Correspondence between code paths

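Data-driven: run the test cases and find the corresponding paths in the actual execution traces
Two proof obligations are added for the SMT solver to verify:
- For a path in T between two cutpoints that no test case has executed, we arbitrarily pick a path in R between the same two cutpoints.
- If T executes path p, then R can only execute one of the paths corresponding to p.
If these proof obligations fail, the counterexample can be used to discover new corresponding paths.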
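As a rough illustration of the data-driven step, the sketch below splits each trace at the cutpoints and pairs up the segments of T and R in order. The trace format (a list of program-point labels), the maps `cut_t` / `cut_r` from program points to cutpoint names, and both function names are assumptions made for this example, not DDEC's actual data structures.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Path = Tuple[str, ...]
Segment = Tuple[str, str, Path]  # (start cutpoint, end cutpoint, path)


def split_at_cutpoints(trace: List[str], cut: Dict[str, str]) -> List[Segment]:
    """Split a trace into cutpoint-to-cutpoint segments.

    `cut` maps a program point to the name of the cutpoint it belongs to.
    """
    segments: List[Segment] = []
    start = None
    path: List[str] = []
    for point in trace:
        if point in cut:
            if start is not None:
                segments.append((start, cut[point], tuple(path)))
            start, path = cut[point], []
        elif start is not None:
            path.append(point)
    return segments


def infer_correspondence(
    runs: List[Tuple[List[str], List[str]]],
    cut_t: Dict[str, str],
    cut_r: Dict[str, str],
) -> Dict[Segment, Set[Path]]:
    """Map each observed code path of T to the paths R took alongside it."""
    corr: Dict[Segment, Set[Path]] = defaultdict(set)
    for trace_t, trace_r in runs:
        segs_t = split_at_cutpoints(trace_t, cut_t)
        segs_r = split_at_cutpoints(trace_r, cut_r)
        for (s_t, e_t, p_t), (s_r, e_r, p_r) in zip(segs_t, segs_r):
            # Only pair segments that connect the same two cutpoints.
            if (s_t, e_t) == (s_r, e_r):
                corr[(s_t, e_t, p_t)].add(p_r)
    return corr


cut_t = {"t0": "a", "t1": "b", "t3": "c"}
cut_r = {"r0": "a", "r1": "b", "r5": "c"}
run = (
    ["t0", "t1", "t2", "t1", "t2", "t1", "t3"],
    ["r0", "r1", "r2", "r4", "r1", "r2", "r4", "r1", "r5"],
)
corr = infer_correspondence([run], cut_t, cut_r)
print(corr[("b", "b", ("t2",))])
```

Paths that never show up in any trace are not covered by this table, which is why the approach needs the additional proof obligations.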

# Invariant

Generate Invariants

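Variables involved: registers, stack locations (assumed to be bounded), and finitely many heap locations
Form of the invariant: a conjunction of equalities
Take the matrix formed by the snapshots of one cutpoint over the test set and compute a basis of its nullspace; each basis vector corresponds to one equality.
- The nullspace of A is the set of all vectors x that satisfy Ax = 0.
For example, for the matrix shown on the slide, one basis is [1, -1, 0] and [0, 1, -1], which corresponds to the equalities
- eax * (1) + ebx * (-1) + eax' * 0 = 0, i.e. eax = ebx,
- and eax * 0 + ebx * (1) + eax' * (-1) = 0, i.e. ebx = eax'
The advantage of this approach: no valid equality is missed.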
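A small nullspace computation makes the idea concrete. The sketch below uses exact rational arithmetic from the standard library and a made-up snapshot matrix with columns eax, ebx, eax'; both choices are assumptions for illustration and need not match how DDEC does its linear algebra. The basis it returns differs from the one on the slide but spans the same space.

```python
from fractions import Fraction
from typing import List


def nullspace(rows: List[List[int]]) -> List[List[Fraction]]:
    """Return a basis of {x : A x = 0} via Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0])
    pivots: List[int] = []  # pivots[i] is the pivot column of row i
    r = 0
    for c in range(ncols):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free column
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [v / pivot for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis: List[List[Fraction]] = []
    for free in range(ncols):
        if free in pivots:
            continue
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for row_idx, pivot_col in enumerate(pivots):
            vec[pivot_col] = -m[row_idx][free]
        basis.append(vec)
    return basis


# Hypothetical snapshots: one row per test case, columns eax, ebx, eax'.
snapshots = [[1, 1, 1], [2, 2, 2], [5, 5, 5]]
names = ["eax", "ebx", "eax'"]
for vec in nullspace(snapshots):
    terms = [f"{coef} * {name}" for coef, name in zip(vec, names) if coef != 0]
    print(" + ".join(terms), "= 0")
```

Each printed line is a candidate equality that holds on every snapshot; whether it holds on all executions still has to be checked by the solver.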

# Checking Proof Obligations

Checking Proof Obligations

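Encode the programs as SMT formulas and pass them to the SMT solver.
Theory: quantifier-free theory of bit-vectors
There are three basic kinds of proof obligation:
- {E} <t, r> {Q}, where E states that the two states are equal. Here the corresponding paths run from the start of T and R to some cutpoint, and Q is the invariant at that cutpoint.
- {P} <t, r> {Q}, where t and r are a pair of corresponding paths that start at cutpoint n1 and end at cutpoint n2. P is the invariant at n1 and Q is the invariant at n2.
- {P} <t, r> {F}, where t and r are a pair of paths that start at a cutpoint with invariant P and end at a return instruction, and F states that the return values and memory states are equal.
If verification fails, we add the counterexample to the corresponding matrix and try to infer the invariant again.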
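To see what a check of the form {P} <t, r> {Q} amounts to, the sketch below replaces the SMT query with exhaustive enumeration over 8-bit states. That substitution, the single-register states, and the two toy paths (`esi += 1` in T, `ecx' += 5` in R) are all assumptions made so the example runs with the standard library only; a real checker would hand the same implication to a bit-vector solver.

```python
from itertools import product
from typing import Callable, Optional, Tuple

MASK = 0xFF  # 8-bit registers keep the search space tiny

Pred = Callable[[int, int], bool]
Step = Callable[[int], int]


def check_obligation(pre: Pred, t: Step, r: Step, post: Pred) -> Optional[Tuple[int, int]]:
    """Check {pre} <t, r> {post}; return a counterexample state pair or None."""
    for s_t, s_r in product(range(MASK + 1), repeat=2):
        if pre(s_t, s_r) and not post(t(s_t), r(s_r)):
            return (s_t, s_r)
    return None


def t_path(esi: int) -> int:
    return (esi + 1) & MASK  # loop body of T: esi += 1


def r_path(ecx: int) -> int:
    return (ecx + 5) & MASK  # loop body of R: ecx' += 5


def inv_good(esi: int, ecx: int) -> bool:
    return (5 * esi) & MASK == ecx  # 5 * esi = ecx'


def inv_spurious(esi: int, ecx: int) -> bool:
    return esi == ecx  # may hold on a few snapshots, but is not inductive


print(check_obligation(inv_good, t_path, r_path, inv_good))
print(check_obligation(inv_spurious, t_path, r_path, inv_spurious))
```

The first call finds no counterexample, while the second returns a state pair that violates the spurious candidate; such a pair is what would be fed back into the snapshot matrix.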

# Implementation: DDEC

Liveness Computation
Testcase Generation
Tracing
Invariant Generation
VC Generation

# Liveness Computation

Liveness Computation

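Because x86 has register aliasing, the standard dataflow algorithm has to be modified.

When a register is read, we treat all of its sub-registers as read.
When a register is written, we treat all of its super-registers as written.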
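The two rules can be written down as helper functions that expand a register into the set a dataflow pass would treat as read or written. The sketch covers only the rax family, leaves out `ah`, and uses invented names (`FAMILY`, `read_set`, `write_set`); how these sets are wired into the liveness equations is not shown here.

```python
from typing import List, Set

# One aliasing chain, ordered from widest to narrowest register.
FAMILY: List[str] = ["rax", "eax", "ax", "al"]


def read_set(reg: str) -> Set[str]:
    """A read of reg counts as a read of reg and all of its sub-registers."""
    i = FAMILY.index(reg)
    return set(FAMILY[i:])


def write_set(reg: str) -> Set[str]:
    """A write to reg counts as a write to reg and all of its super-registers."""
    i = FAMILY.index(reg)
    return set(FAMILY[: i + 1])


print(sorted(read_set("eax")))
print(sorted(write_set("ax")))
```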

# Testcase Generation

Testcase Generation

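For programs without memory reads or writes, test cases can be generated automatically: it is enough to assign a random value to every live-in register.
For programs that do read or write memory, the user has to supply the test cases.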
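For the memory-free case, generation is essentially a one-liner. The sketch below assumes 64-bit registers and a fixed seed for reproducibility; the function name and the register list are placeholders, not part of DDEC.

```python
import random
from typing import Dict, List


def generate_testcases(
    live_in: List[str], count: int, seed: int = 0, width: int = 64
) -> List[Dict[str, int]]:
    """Assign a random width-bit value to every live-in register."""
    rng = random.Random(seed)
    return [
        {reg: rng.getrandbits(width) for reg in live_in} for _ in range(count)
    ]


for case in generate_testcases(["rdi", "rsi"], count=3):
    print({reg: hex(value) for reg, value in case.items()})
```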

# Tracing

Tracing

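The program state is recorded before and after every instruction (record)
T is assumed not to crash on the test set, but R may crash (sandbox)
We use the information from T to simplify running R, for example by limiting the stack size, the lower and upper bounds of the heap (hmin/hmax), and the number of loop iterations (sandbox_jump).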

# Invariant Generation

Invariant Generation

feature: a register, a stack location, or one of a finite set of heap locations
Stack locations: treated as temporaries
If the widest live feature is x bits, every x-bit feature gets one column. Every feature narrower than x bits is extended to x bits; since it can be either zero-extended or sign-extended, it gets two columns.
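The column construction can be illustrated as follows. The helper names and the `_zx` / `_sx` column suffixes are invented for this sketch; only the rule itself (one column for full-width features, two for narrower ones) comes from the slide.

```python
from typing import Dict


def zero_extend(value: int, width: int) -> int:
    """Keep the low `width` bits; the upper bits become zero."""
    return value & ((1 << width) - 1)


def sign_extend(value: int, width: int, to_width: int) -> int:
    """Copy the sign bit of a width-bit value into the upper bits."""
    value &= (1 << width) - 1
    if value >> (width - 1):
        value |= ((1 << to_width) - 1) & ~((1 << width) - 1)
    return value


def feature_columns(name: str, value: int, width: int, max_width: int) -> Dict[str, int]:
    """One column for a full-width feature, two for a narrower one."""
    if width == max_width:
        return {name: value}
    return {
        name + "_zx": zero_extend(value, width),
        name + "_sx": sign_extend(value, width, max_width),
    }


print({k: hex(v) for k, v in feature_columns("al", 0xFF, 8, 32).items()})
print({k: hex(v) for k, v in feature_columns("eax", 0x1234, 32, 32).items()})
```

Keeping both extensions lets the nullspace computation pick up equalities that hold under either interpretation of the narrow value.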

# VC Generation

VC Generation

SMT solver: Z3
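Floating point is not supported: IEEE 754 floating-point numbers differ from the theory of reals in SMT; for instance, IEEE 754 addition is not associative.
Because x86 lacks a formal semantics, the x86 instructions have to be encoded as Z3 formulas by hand, and the correctness of this encoding can only be supported by testing.
- For example, for popcnt Intel only gives an informal semantics described with a loop.
- For some instructions the hardware implementation differs from the specification; in that case we encode an over-approximation of the observed hardware behavior and the behavior described in the specification.
Memory is modeled as two vectors: a 64-bit vector and an 8-bit vector
Some expensive operations (such as multiplication and division) are treated as uninterpreted functions
One difficulty: spill slots, which are treated as special temporaries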
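The point about validating a hand-written encoding by testing can be illustrated with popcnt. Below, `popcnt_spec` follows the loop-style description, while `popcnt_encoded` is a loop-free bit-parallel formula of the kind one might prefer to give a solver. This particular formula and the random-testing harness are assumptions for illustration and are not taken from DDEC.

```python
import random
from typing import Optional

M64 = (1 << 64) - 1


def popcnt_spec(x: int) -> int:
    """Loop-style reference: count the set bits of a 64-bit value."""
    count = 0
    for i in range(64):
        count += (x >> i) & 1
    return count


def popcnt_encoded(x: int) -> int:
    """Loop-free bit-parallel population count for a 64-bit value."""
    x = x - ((x >> 1) & 0x5555555555555555)
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F
    return ((x * 0x0101010101010101) & M64) >> 56


def find_mismatch(trials: int = 1000, seed: int = 0) -> Optional[int]:
    """Return an input on which the two versions disagree, or None."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(64)
        if popcnt_spec(x) != popcnt_encoded(x):
            return x
    return None


print(find_mismatch())
```

Passing such a test raises confidence in the encoding but, as the slide notes, it is not a proof of correctness.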

# Experiments

DDEC v.s. equality saturation
OpenSSL
CompCert and gcc
STOKE

# Equality Saturation

DDEC v.s. equality saturation

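Equality saturation relies on expert-written rules, and for a CISC instruction set like x86, working out these rules is a daunting task, so in practice equality saturation can only be applied to high-level or intermediate languages.
Equality saturation cannot handle programs optimized with transformations such as loop unrolling (Figure 3).
For semantically different programs, equality saturation is not guaranteed to terminate, whereas DDEC always terminates with a counterexample; however, the counterexamples DDEC finds are hard to map back to the source level (Figure 4).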

# CompCert and gcc

CompCert and gcc

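We verify that the assembly produced by CompCert and the assembly produced by gcc are equivalent. In effect this lets us speed up CompCert or, put differently, establish the correctness of gcc's output.
We found that some programs really do need inequalities in their invariants; for example, the program shown on the slide needs i=i' /\ n=n' /\ i<=n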

# STOKE

STOKE

The performance term in the cost function has to be updated (formula shown on the slide), where nd() is the loop nesting depth and w is a constant (we set it to 20).
