From 91977ff499ca246e10bf54ce553a6269a85109a5 Mon Sep 17 00:00:00 2001 From: Etienne Kneuss <ekneuss@gmail.com> Date: Wed, 29 Apr 2015 19:23:03 +0200 Subject: [PATCH] Some synthesis stuff --- doc/synthesis.rst | 183 +++++++++++++++++++++++++++++++++++++--------- 1 file changed, 150 insertions(+), 33 deletions(-) diff --git a/doc/synthesis.rst b/doc/synthesis.rst index 40499d6c3..b70298a9b 100644 --- a/doc/synthesis.rst +++ b/doc/synthesis.rst @@ -11,57 +11,77 @@ consists of translating these specifications (or expectations) to executable programs which realises them. As we have seen previously, relatively precise specifications of complex -operations can be written consicely. Synthesis can thus reduce development time +operations can be written concisely. Synthesis can thus reduce development time and allows the user to focus on high-level aspects. Deductive Synthesis Framework ----------------------------- -Closing Rules -------------- +The synthesizer takes a synthesis problem that it extracts from ``choose`` or +holes (``???``) found in the program. Formally, we define a **synthesis problem** as -Ground -^^^^^^ +.. math:: -This rule applies when the synthesis problem has no input variables. If the -specification is satisfiable, its model corresponds to a valid solution. We -rely on SMT solvers to check satisfiability of the formulas. For instance: + [[ ~~ \bar{a} ~~ \langle ~~ \Pi \rhd \phi ~~ \rangle ~~ \bar{x} ~~ ]] + + +which carries the following information: + + * a set of input variables :math:`\bar{a}`, initially all variables in scope of the ``choose``, + * a set of output variables :math:`\bar{x}`, corresponding to the values to synthesize, + * a path-condition :math:`\Pi`, constraining :math:`\bar{a}`, + * the specification :math:`\phi` relating input variables to output variables. + +The job of the synthesizer is to convert this problem into a solution :math:`[ ~ P ~ | ~ T ~ ]`, which consists of + + * a executable program term :math:`T` (an expression that may contain input variables), + * a precondition :math:`P` under which this program is valid + +To illustrate, we consider the following program: .. code-block:: scala - choose ((x: Int, y: Int) => x > 42 && y > x) + def foo(a: BigInt, b: BigInt): Int = { + if (a > b) { + choose ( (x: BigInt) => x > a) + } else { + 0 + } + } -can trivially be synthesized by ``(1000, 1001)``. -If the specification turns out to be UNSAT, the synthesis problem is impossible -and we synthesize it as an ``error`` with a ``false`` precondition. +From this program, we will extract the following initial synthesis problem: +.. math:: -Optimistic Ground -^^^^^^^^^^^^^^^^^ + [[ ~~ a, b ~~ \langle ~~ a > b ~ \rhd ~ x > a ~~ \rangle ~~ x ~~ ]] -This rule acts like `Ground`, but without the requirement on the absence of input -variables. The reasonning is that even though the problem has input variables, -the solutions might still be a constant expression. +A possible solution to this problem would be: -`Optimistic Ground` also tries to satisfy the specification, but it also needs -to validate the resulting model. That is, given a valuation of output -variables, it checks whether it exists a valuation for input variables such that -the specification is violated. The model is discarded if such counter-example -is found. If no counter-example exist, we solve the synthesis problem with the -corresponding values. +.. math:: -The rule tries at most three times to discover a valid value. + [ ~ \top ~ | ~ a + 1 ~ ] -CEGIS -^^^^^ +To solve this problem, the synthesizer will apply a series of rules which will +try to either + + 1. Decompose this problem into sub problems, which may be simpler to solve + 2. Immediately find a solution + + +This corresponds to an and-or graph of rule applications, which Leon will +explore. -TEGIS -^^^^^ Decomposing Rules ----------------- +Leon defines several rules that decompose a synthesis problem (a ``choose``) +into sub-problems that may be simpler to solve. Such rules also define a way to +generate a solution for the original problem, when provided with solutions for +all of the sub-problems. These rules thus both decompose the problems and +recompose the solutions. Leon defines several of such decomposing rules: + Equality Split ^^^^^^^^^^^^^^ @@ -84,7 +104,7 @@ this rule generates two sub-chooses, and combines them as follows: } -Inequality Split +Inequalite Split ^^^^^^^^^^^^^^^^ Given two input variables `a` and `b` of numeric type, this rule @@ -139,7 +159,7 @@ substituted by the appropriate case, and combines them as follows: Int Induction ^^^^^^^^^^^^^ -Given an integer (or bigint) variable `a`, the rules performs induction on `a`: +Given an Int (or BigInt) variable `a`, the rules performs induction on `a`: .. code-block:: scala @@ -238,14 +258,14 @@ redundancies. We consider two kinds of equivalences: 1) Simple equivalences: the specification contains :math:`a = b` at the top level. - 2) ADT equialences: the specification contains :math:`l.isInstanceOf[Cons] + 2) ADT equivalence the specification contains :math:`l.isInstanceOf[Cons] \land h = l.head \land t = l.tail` which entails :math:`l = Cons(h, t)` and thus allows us to substitute :math:`l` by :math:`Cons(h, t)`. -Eliminating equivalences prevents explosion of redundant rule instanciations. +Eliminating equivalences prevents explosion of redundant rule instantiations. For instance, if you have four integer variables where three of them are equivalent, Leon has 6 ways of applying `Inequality Split`. After -eliminating equivalences, only one application remains posible. +eliminating equivalences, only one application remains possible. Unused Input ^^^^^^^^^^^^ @@ -287,3 +307,100 @@ define it. DetupleOutput, DetupleInput, InnerCaseSplit + +Closing Rules +------------- + +While decomposing rules split problems in sub-problems, Leon also defines rules +that are able to directly solve certain synthesis problems. These rules are +crucial for the synthesis search to terminate efficiently. We define several +closing rules that apply in different scenarios: + +Ground +^^^^^^ + +This rule applies when the synthesis problem has no input variables. If the +specification is satisfiable, its model corresponds to a valid solution. We +rely on SMT solvers to check satisfiability of the formulas. For instance: + +.. code-block:: scala + + choose ((x: Int, y: Int) => x > 42 && y > x) + +can trivially be synthesized by ``(1000, 1001)``. + +If the specification turns out to be UNSAT, the synthesis problem is impossible +and we synthesize it as an ``error`` with a ``false`` precondition. + + +Optimistic Ground +^^^^^^^^^^^^^^^^^ + +This rule acts like `Ground`, but without the requirement on the absence of input +variables. The reasoning is that even though the problem has input variables, +the solutions might still be a constant expression. + +`Optimistic Ground` also tries to satisfy the specification, but it also needs +to validate the resulting model. That is, given a valuation of output +variables, it checks whether it exists a valuation for input variables such that +the specification is violated. The model is discarded if such counter-example +is found. If no counter-example exist, we solve the synthesis problem with the +corresponding values. + +The rule tries at most three times to discover a valid value. + +CEGIS +^^^^^ + +`CEGIS` stands for Counter-Example-Guided Inductive Synthesis, it explores the +space of small expressions to find valid solutions. Here we represent the space +of programs by a tree, where branches are determined by free boolean variables. +For instance, a tree for small integer operations could be: + +.. code-block:: scala + + def res(b, a1, a2) = if (b1) 0 + else if (b2) 1 + else if (b3) a1 + else if (b4) a2 + else if (b5) c1(b, a1, a2) + c2(b, a1, a2) + else c1(b, a1, a2) * c2(b, a1, a2) + + def c1(b, a1, a2) = if (b7) 0 + else if (b8) 1 + else if (b9) a1 + else a2 + + def c2(b, a1, a2) = if (b10) 0 + else if (b11) 1 + else if (b12) a1 + else a2 + +At a high-level, it consists of the following loop: + + 1. Find one expression and inputs that satisfies the specification: + :math:`\exists \bar{b}, a1, a2. spec(a1, a2, res(\bar{b}, a1, a2))`. + If this fails, we know that the solution is not in the search space. + If this succeeds, we: + + 2. Validate the expression represented by :math:`M_\bar{b}` for all inputs by + searching for a counter-example: :math:`\exists a1, a2. \lnot spec(a1, a2, res(M_\bar{b}, a1, a2))`. + If such counter-example exists, start over with (1) with this program + excluded. If no counter-example exists we found a valid expression. + +The space of expressions our CEGIS rule considers is small expressions of +bounded depth (3), which contain for each type: a few literals, functions and +operations returning that type that do not transitively call the function under +synthesis (to prevent infinite loops), and recursive calls where one argument is +decreasing. + + +TEGIS +^^^^^ + +This rule uses the same search space as `CEGIS` but relies only on tests +(specified by the user or generated) to validate expressions. It is thus a +generally faster way of discovering candidate expressions. However, these +expressions are not guaranteed to be valid since they have only been validated +by tests. Leon's synthesis framework supports untrusted solutions which trigger +an end-to-end validation step that relies on verification. -- GitLab