Making Deduction More Effective in SAT Solvers
IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 29, NO. 8, AUGUST 2010
Abstract: Satisfiability (SAT) solvers often benefit from transformations of the formula to be decided that allow them to do more through deduction and decrease their reliance on enumeration. For formulae in conjunctive normal form, subsumed clauses may be removed or partial resolution may be applied. The objectives of simplifying the formula and speeding up the solver are sometimes competing. We characterize existing transformations in terms of their impact on the deductive power of the formula and their effects on the sizes of the implication graphs. For example, we show that variable elimination works by improving implication graphs. We also present two new techniques that try to increase deductive power. The first is a check performed during the computation of resolvents. The second is a new preprocessing algorithm based on distillation that combines simplification and increase of deductive power. Most current SAT solvers apply resolution at various stages to derive new clauses or simplify existing ones. The former happens during conflict analysis, while the latter is usually done during preprocessing. We show how subsumption of the operands by the resolvent can be inexpensively detected during resolution; we then discuss how this detection is used to improve three stages of the SAT solver: variable elimination, clause distillation, and conflict analysis. The “on-the-fly” subsumption check is easily integrated in a SAT solver. In particular, it is compatible with strong conflict analysis and the generation of unsatisfiability proofs. Experiments show the effectiveness of the new techniques.
Index Terms: CNF, distillation, DPLL, preprocessing, SAT.
I. Introduction
The last two decades have seen great advances in the performance of satisfiability (SAT) solvers for propositional logic, in particular those based on the Davis–Putnam–Logemann–Loveland (DPLL) procedure [1]–[5]. These solvers have evolved in a symbiotic relationship with many EDA applications, including model checking, logic synthesis, testing, and timing analysis. Progress has been made both in the pruning of the search space [3] and in the efficient implementation of the basic operations, like deductions [4]. Here we are concerned with techniques that transform a conjunctive normal form (CNF) formula, either as a preprocessing step [6]–[8] or during the DPLL procedure. These transformations should be relatively inexpensive and produce formulae on which the DPLL procedure runs faster than on the original ones.
Reducing the size of the formula is a common objective of transformations. For instance, a set of clauses is redundant if a proper subset represents the same function. A subsumed clause (i.e., a clause implied by another) is redundant, and the cost of many SAT solver operations decreases with a smaller formula. Hence, removing subsumed clauses is usually beneficial. However, not all redundant clauses can be removed without a negative effect on the speed of the solver.
We introduce two notions that help in the design and evaluation of formula transformations. The first is the deductive power of a CNF formula. The higher this power, the more consequences the DPLL procedure can deduce from each of its decisions; hence, the more effective the pruning of the search space. The second notion is proof conciseness. It reflects the fact that the DPLL procedure progresses through the search space by proving that parts of that space contain no satisfying assignment and recording such findings in the form of new clauses and their derivations. More concise proofs are faster to build and usually more effective at pruning further search.
To see how deductive power may help in the analysis of SAT solvers, consider clause recording, which adds conflict-learned clauses or, simply, conflict clauses to the original SAT instance. Each conflicting assignment is analyzed to identify a subset that is sufficient to cause the current conflict. The disjunction of the literals in the subset becomes a new clause added to the original SAT instance. The conflict clauses learned by SAT solvers are by definition redundant, but they always improve the deductive power of a CNF formula.
Clauses that are subsumed by other clauses slow down the implication process, but do not help the solver in pruning the search space. We show that they never improve deductive power. Therefore, preprocessing often removes them to accelerate implications. On the other hand, removing literals from clauses may increase the deductive power of a formula. We study in detail several approaches to such elimination, both as preprocessing and during DPLL.
Literal removal procedures are often based on resolution. In addition, resolution may be applied to eliminate variables from the formula. Since the elimination of variables may increase the number of clauses, it is usually applied with restraint [6], [9]. Deductive power is not guaranteed to improve either. Instead, the main benefit of variable elimination is the decrease in the average number of decisions and implications required to produce a conflicting assignment. Not only do conflicts occur sooner, but their analysis is faster, and the learned clauses tend to prune larger portions of the search space.
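To make the idea of variable elimination "with restraint" concrete, the following is a minimal sketch, not the procedure of [6] or [9]. It assumes clauses are frozensets of nonzero integers (DIMACS-style literals, with -v denoting the negation of v) and that no input clause is tautological. The function names and the simple clause-count bound are illustrative assumptions; actual preprocessors use their own criteria.

```python
from itertools import product


def resolve(c1, c2, var):
    """Resolve c1 (containing var) with c2 (containing -var).

    Returns the resolvent, or None if the resolvent is tautological.
    """
    res = (c1 - {var}) | (c2 - {-var})
    if any(-lit in res for lit in res):
        return None
    return frozenset(res)


def eliminate_variable(clauses, var):
    """Eliminate var by resolution unless the clause count would grow.

    clauses: a set of frozensets of integer literals (assumed encoding).
    Returns the new clause set, or the original set if elimination is
    rejected by the (hypothetical) clause-count bound.
    """
    pos = [c for c in clauses if var in c]
    neg = [c for c in clauses if -var in c]
    rest = {c for c in clauses if var not in c and -var not in c}

    # All non-tautological resolvents on var replace the clauses mentioning it.
    resolvents = set()
    for c1, c2 in product(pos, neg):
        r = resolve(c1, c2, var)
        if r is not None:
            resolvents.add(r)

    new_clauses = rest | resolvents
    if len(new_clauses) > len(clauses):
        return clauses  # restraint: reject eliminations that grow the formula
    return new_clauses
```

With one positive and two negative occurrences of a variable, elimination yields two resolvents in place of three clauses and is accepted. With three occurrences of each polarity and disjoint side literals, nine resolvents would replace six clauses, so the sketch leaves the formula unchanged.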
In this paper, we analyze existing techniques that increase deductive power or generate more concise implication graphs, and we propose two new ones. We show how to detect subsumptions during resolution, in both preprocessing and conflict analysis, with minimal overhead. Our on-the-fly subsumption check can be applied to both regular and strong [10] conflict analysis. We show how this inexpensive check is used to improve deductive power at three stages of the SAT solver: variable elimination, clause distillation, and conflict analysis.
We then describe a distillation algorithm that asserts the negations of clauses to remove redundant literals and possibly derive new clauses. Unlike previous approaches, this distillation procedure may replace a clause with the resolvent of two or more existing clauses without explicitly deriving any such resolvents in advance. We show that distillation increases deductive power and shortens implication graphs. Experiments show that the presented techniques speed up our SAT solver. Variable elimination works primarily by shortening the implication graphs, while other transformations mainly improve deductive power.
This paper combines and extends [11] and [12]. It is organized as follows. Section II discusses related work. Section III covers background. In Section IV we introduce and characterize the notion of deductive power of a CNF formula. In Section V we describe our on-the-fly simplification based on self-subsumption during conflict analysis and present the details of the algorithm. Section VI describes our distillation-based approach. Section VII reports results from a prototype implementation. We draw conclusions and outline future work in Section VIII.
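As an illustration of why subsumption of an operand by the resolvent can be detected cheaply, consider the sketch below. It assumes clauses are frozensets of integer literals and that the two operands clash only on the pivot. The function name is hypothetical and the code is not taken from the authors' implementation. The observation it relies on is that the resolvent always contains every literal of an operand except the pivot, so the resolvent subsumes that operand exactly when it is one literal shorter.

```python
def resolve_with_subsumption_check(c1, c2, pivot):
    """Resolve c1 (containing pivot) with c2 (containing -pivot).

    Returns (resolvent, subsumes_c1, subsumes_c2). The resolvent has at
    least len(c) - 1 literals for each operand c, with equality exactly
    when it is a subset of c, so comparing sizes is enough.
    """
    assert pivot in c1 and -pivot in c2
    resolvent = (c1 - {pivot}) | (c2 - {-pivot})
    subsumes_c1 = len(resolvent) == len(c1) - 1
    subsumes_c2 = len(resolvent) == len(c2) - 1
    return resolvent, subsumes_c1, subsumes_c2


# Usage: resolving (x1 v x2 v x3) with (-x1 v x2) on x1 gives (x2 v x3),
# which subsumes the first operand; that operand can then be strengthened.
r, s1, s2 = resolve_with_subsumption_check(
    frozenset({1, 2, 3}), frozenset({-1, 2}), 1
)
```

When the check succeeds, the subsumed operand can be replaced by the resolvent instead of adding the resolvent as a separate clause.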
II. Related Work
A problem related to preprocessing of a CNF formula is the preprocessing of conflict clauses in an incremental SAT solver. An incremental solver is given a sequence of SAT instances and tries to use clauses learned in earlier instances to expedite the solution of later instances. If each instance is obtained from the previous one by the addition of new clauses, all clauses learned by the solver can be forwarded to the new instance. However, in the general case, clauses must be validated before they can be forwarded. In [13], a process called distillation was proposed, which forwards a clause derived from a previously learned clause γ only if asserting the negation of γ causes a conflict in the new instance. In [11] and in this paper, we apply distillation to preprocessing the original clauses of a CNF formula and we characterize this approach from the point of view of deductive power.
Assignment shrinking [14] can also be seen as on-the-fly distillation of selected conflict clauses. At the end of conflict analysis, the algorithm of [14] backtracks to a level preceding the backtracking level to undo some assignments in the conflict clause. It then applies those assignments again in a different order until a new conflict occurs. This may produce a new, smaller conflict clause. Since this is a potentially expensive technique, its invocation is controlled by a heuristic.
Previous work besides [14] has addressed the quality of conflict clauses [5], [7], [10], [15], [16]. In particular, the clause minimization algorithm of [7] and [16] traverses the implication graph beyond the first unique implication point (1-UIP) to remove literals in the conflict clause that are implied by other literals. The strong conflict analysis proposed in [10] generates a second conflict clause that is often more effective than a regular conflict clause of [15] in escaping regions of the search space where the solver would otherwise linger for a long time.
A common thread of most work on the subject is the search for a balance between a technique’s cost and its ability to detect implications earlier. Unlike the on-the-fly subsumption to be discussed in Section V, these earlier techniques focus on simplification of the conflict-learned clauses, instead of looking at all clauses appearing in the resolution graph. An existing clause may be subsumed by a conflict clause newly found by any of the conflict analysis algorithms. Hence, one may try to simplify the newly redundant clauses. The on-the-fly simplification algorithm used in [17] can detect the subsumed clause with a one-watched-literal scheme when a new clause is generated by conflict analysis. While the one-watched-literal scheme is efficient, the removal of subsumed clauses does not improve deductive power and does not produce more concise proofs. The practical ability of this technique to speed up SAT solvers was not the focus of [17] and remains to be established.
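The distillation idea discussed in this section, asserting the negation of a clause and watching for a conflict, can be sketched as follows. This is a simplified illustration rather than the algorithm of [13] or the Alembic procedure. It assumes clauses are lists of integer literals, uses a naive unit propagation loop, and omits conflict analysis. The names `unit_propagate` and `distill` are hypothetical.

```python
def unit_propagate(clauses, assignment):
    """Extend assignment (a set of true literals) by unit propagation.

    Returns False if some clause has all its literals false (a conflict),
    True otherwise. The assignment set is modified in place.
    """
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(lit in assignment for lit in clause):
                continue  # clause already satisfied
            unassigned = [lit for lit in clause if -lit not in assignment]
            if not unassigned:
                return False  # every literal is false: conflict
            if len(unassigned) == 1:
                assignment.add(unassigned[0])  # unit clause: imply literal
                changed = True
    return True


def distill(clause, others):
    """Try to shorten clause using the remaining clauses (others).

    Negations of the literals are asserted one at a time. A conflict, or a
    literal of the clause implied true, shows that a prefix of the clause
    already follows from others; a literal implied false is redundant.
    """
    assignment = set()
    kept = []
    if not unit_propagate(others, assignment):
        return kept  # others are contradictory: the empty clause follows
    for lit in clause:
        if lit in assignment:
            return kept + [lit]  # lit implied true by the negated prefix
        if -lit in assignment:
            continue  # lit implied false: drop it from the clause
        kept.append(lit)
        assignment.add(-lit)
        if not unit_propagate(others, assignment):
            return kept  # conflict: the prefix alone is a valid clause
    return kept
```

For instance, distilling the clause (x1 v x2 v x3) against the single clause (x1 v x2 v -x3) drops x3, because asserting -x1 and -x2 implies -x3 by unit propagation.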
III. Preliminaries
IV. Deductive Power of a CNF Formula
V. On-The-Fly Self-Subsumption
VII. Experimental Results
We have presented techniques that aim at increasing the deductive power of a CNF formula and promoting more concise implication graphs. In order to evaluate them, we have implemented a preprocessor on top of the CNF SAT solver CirCUs 2.0 [23], [24], which applies variable elimination, the distillation procedure of Section VI, named Alembic, and simplification based on subsumption and self-subsumption as in [6]. We have also implemented the three applications of on-the-fly clause simplification discussed in this paper, namely, to variable elimination and conflict analysis in Alembic as well as to conflict analysis in CirCUs. In variable elimination, an increase in the average length of the clauses is detrimental to deductive power. Hence, in our implementation, only variables whose elimination does not cause such an increase are eliminated.
Since SAT solvers often need to provide either a satisfying assignment or a proof of unsatisfiability, clauses that are either removed or simplified are set aside just as the derivations of conflict clauses [25], [26]. The SAT solver CirCUs only needs these clauses to recover a complete solution (for a satisfiable instance), or to produce a proof of unsatisfiability in terms of the original clauses. This scheme requires extra memory, but its effect on speed is negligible.
The benchmark suite is composed of all the CNF instances (with no duplicates) from the industrial category of the SAT Races of 2006 and 2008, and the SAT Competitions of 2007 and 2009 [27]. We conducted the experiments on a 2.4 GHz Intel Core2 Quad processor with 4 GB of memory. We used a timeout of 10,000 s and a memory bound of 2 GB. We tested MiniSat 2.0 [19] and PrecoSAT 236 [28] along with CirCUs 2.0 to provide reference points. The plot of Fig. 19 shows how many instances are solved by selected solvers within a given time bound.
Our variable elimination algorithm is named EV; Alembic is abbreviated AL, EVAL stands for EV + AL, and OCI denotes the on-the-fly clause improvement described in Section V. Fig. 19 shows the CPU time taken by CirCUs (with various subsets of the proposed approaches), MiniSat, and PrecoSAT. Both MiniSat and PrecoSAT use their own preprocessors [6]. Fig. 19 confirms that CirCUs is comparable to state-of-the-art SAT solvers, and that its performance is significantly improved by applying all the proposed approaches (i.e., EVAL + OCI).
The scatterplots of Fig. 20 examine the effects of the proposed techniques on deductive power and size of implication graphs, by showing the changes in CPU time, numbers of decisions, average numbers of resolution steps per conflict analysis, and average length of conflict-learned clauses. For each of these quantities the geometric mean of the new/old ratios is reported (excluding cases in which one of the values is 0). Single-sample t-tests were performed to confirm the statistical significance of the data. The null hypothesis was that the mean of the logarithms of the ratios is 0. The alternative hypothesis was two-sided. Since the data that are compared span several orders of magnitude, differences and ratios may paint very different pictures of the experiments. Analyzing the ratios puts equal emphasis on short and long-running instances. This is partly compensated by the scatterplots and the views in Fig. 19, which highlight the ability of the improved procedure to complete more instances in the allotted time.
A marked decrease in the numbers of decisions confirms that the proposed techniques allow the SAT solver to rely more on deduction and less on search. The reduction in resolution steps confirms that the implication graphs are, on average, significantly smaller. As a result, shorter clauses are learned. For lack of space, we omit scatterplots illustrating the effects of individual techniques.
They would show that variable elimination is the main cause for the smaller implication graphs, and that it also tends to reduce the number of decisions and shorten the learned clauses. Distillation alone decreases the numbers of decisions (as one would expect of a technique addressing deductive power) and shortens learned clauses, but has limited effect on the sizes of the implication graphs. Its effect on memory consumption proves negligible.
Variable elimination interacts in an interesting way with OCI. This is shown in Fig. 21, where the numbers of on-the-fly subsumptions per resolution step during DPLL are seen to increase significantly when EV is applied. The following example sheds light on this phenomenon.
Example 14: Consider the following clauses:
Fig. 23 shows that the conflict clause subsumes c2. (It also subsumes c6, but this is not detected by the algorithm.) This time there are fewer resolution steps, and this “abridgment” of the process allows the subsumed clause to enter the analysis right before the subsuming resolvent is computed instead of several steps before. We now report statistics on the performance of the preprocessors. Fig. 24 compares the speed of various versions of EVAL to SatELite. (In these plots, SatELite is run on all CNF formulae, while, in Fig. 19, the solver may disable SatELite depending on the size of CNF formula.) OCI contributes to the improved preprocessor speed. This is clear in the case of EVAL versus EVAL + OCI. It is true also without distillation, because EV + OCI removes significantly more clauses and literals than plain EV in about the same time. It is also interesting to compare the reductions achieved by different preprocessors. In Fig. 25, we report the fractions of instances that achieve certain reductions in terms of variables, clauses, and literals. About 10% of the instances achieve close to 100% reduction. This means that preprocessing reduces the CNF formulae to either the empty clause or the empty set of clauses. CirCUs’s variable elimination is less aggressive than SatELite’s: it eliminates fewer clauses, but almost never increases the number of literals. Adding Alembic yields the least number of clauses without compromising the good performance in terms of literals. While conflict analysis during distillation may produce additional conflict clauses, the number of added clauses is on average 0.1% of the total. Alembic often achieves more simplifications thanks to the on-the-fly subsumption check. The mean number of clauses simplified per conflict is 0.7. 
Moreover, on average, in 51% of the conflicts the 1-UIP clause subsumes one of the clauses used to resolve it; in those cases, rather than the 1-UIP clause being added to the database, the operand is simplified.
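The ratio statistics used in this section (geometric mean of new/old ratios, with a single-sample t-test on their logarithms) can be reproduced along the following lines. This is a minimal sketch under stated assumptions: the function name and the sample numbers are invented for illustration, and only the t statistic is computed. Obtaining a p-value would additionally require the Student t distribution with n - 1 degrees of freedom.

```python
import math
import statistics


def ratio_statistics(old, new):
    """Geometric mean of new/old ratios and t statistic of their logs.

    Pairs in which either value is 0 are excluded, as in the text. The
    null hypothesis is that the mean of the log ratios is 0. At least two
    usable pairs with differing ratios are needed (nonzero deviation).
    """
    logs = [math.log(n / o) for o, n in zip(old, new) if o > 0 and n > 0]
    mean_log = statistics.mean(logs)
    geo_mean = math.exp(mean_log)
    std_err = statistics.stdev(logs) / math.sqrt(len(logs))
    t_stat = mean_log / std_err
    return geo_mean, t_stat


# Hypothetical measurements (e.g., numbers of decisions); the last pair is
# dropped because one of its values is 0.
old_values = [10, 100, 1000, 0]
new_values = [5, 50, 250, 7]
geo_mean, t_stat = ratio_statistics(old_values, new_values)
```

A geometric mean below 1 indicates that the new configuration reduces the measured quantity on average, and working with logarithms gives short and long-running instances equal weight.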
VIII. Conclusion
We have presented efficient transformations of a CNF formula that aim at either improving its deductive power or shortening implication graphs. We have shown that the transformations help a DPLL-based SAT solver to run faster by deducing more literals from its decisions and by reducing the depth of the implication graphs used in conflict analysis. On-the-fly simplification based on self-subsumption can be applied to any stage that uses resolution, e.g., conflict analysis and variable elimination, with minimal overhead. Its application is compatible with advanced conflict analysis techniques and with the generation of unsatisfiability proofs. Another benefit is the reduction of the number of added conflict clauses without detriment to the deductive power. The distillation procedure applied to preprocessing of the CNF formula also considerably speeds up the SAT solver by increasing deductive power. In contrast, we have shown that variable elimination works mainly by reducing the number of resolution steps required in conflict analysis. This results in earlier conflicts, cheaper analyses, and better conflict clauses. The proposed techniques have several other extensions that we plan to investigate: generation of small unsatisfiable cores, application to restarts and solution enumeration, application to non-clausal reasoning, and logic synthesis and representation of sets by characteristic functions in CNF [29].
References
[1] M. Davis and H. Putnam, “A computing procedure for quantification theory,” J. Assoc. Comput. Machinery, vol. 7, no. 3, pp. 201–215, Jul. 1960.
[2] M. Davis, G. Logemann, and D. Loveland, “A machine program for theorem proving,” Commun. ACM, vol. 5, no. 7, pp. 394–397, 1962.
[3] J. P. Marques-Silva and K. A. Sakallah, “GRASP: A search algorithm for propositional satisfiability,” IEEE Trans. Comput., vol. 48, no. 5, pp. 506–521, May 1999.
[4] M. Moskewicz, C. F. Madigan, Y. Zhao, L. Zhang, and S. Malik, “Chaff: Engineering an efficient SAT solver,” in Proc. Design Automat. Conf., Jun. 2001, pp. 530–535.
[5] N. Eén and N. Sörensson, “An extensible SAT-solver,” in Proc. 6th Int. Conf. SAT, LNCS 2919. May 2003, pp. 502–518.
[6] N. Eén and A. Biere, “Effective preprocessing in SAT through variable and clause elimination,” in Proc. 8th Int. Conf. SAT, LNCS 3569. Jun. 2005, pp. 61–75.
[7] N. Sörensson and N. Eén, “MiniSat v1.13: A SAT solver with conflict-clause minimization,” in Proc. SAT Competition: Solver Description, Jun. 2005.
[8] Q. Zhu, N. Kitchen, A. Kuehlmann, and A. Sangiovanni-Vincentelli, “SAT sweeping with local observability don’t cares,” in Proc. Design Automat. Conf., Jul. 2006, pp. 229–234.
[9] S. Subbarayan and D. K. Pradhan, “NiVER: Non increasing variable elimination resolution for preprocessing SAT instances,” in Proc. 7th Int. Conf. SAT, LNCS 3542. May 2004, pp. 276–291.
[10] H. Jin and F. Somenzi, “Strong conflict analysis for propositional satisfiability,” in Proc. DATE, Mar. 2006, pp. 818–823.
[11] H. Han and F. Somenzi, “Alembic: An efficient algorithm for CNF preprocessing,” in Proc. Design Automat. Conf., Jun. 2007, pp. 582–587.
[12] H. Han and F. Somenzi, “On-the-fly clause improvement,” in Proc. 12th Int. Conf. SAT, LNCS 5584. Jun. 2009, pp. 209–222.
[13] H. Jin and F. Somenzi, “An incremental algorithm to check satisfiability for bounded model checking,” in Proc. 2nd Int. Workshop Bounded Model Checking, Electronic Notes in Theoretical Computer Science, vol. 119, no. 2, 2004 [Online]. Available: http://www.elsevier.nl/locate/entcs/
[14] A. Nadel, “Understanding and improving a modern SAT solver,” Ph.D. dissertation, School Comput. Sci., Tel Aviv Univ., Tel Aviv, Israel, 2009.
[15] L. Zhang, C. Madigan, M. Moskewicz, and S. Malik, “Efficient conflict driven learning in Boolean satisfiability solver,” in Proc. Int. Conf. Comput.-Aided Design, Nov. 2001, pp. 279–285.
[16] N. Sörensson and A. Biere, “Minimizing learned clauses,” in Proc. 12th Int. Conf. SAT, LNCS 5584. Jun. 2009, pp. 237–243.
[17] L. Zhang, “On subsumption removal and on-the-fly CNF simplification,” in Proc. 8th Int. Conf. SAT, LNCS 3569. Jun. 2005, pp. 482–489.
[18] J. P. Marques-Silva and K. A. Sakallah, “Grasp: A new search algorithm for satisfiability,” in Proc. Int. Conf. Comput.-Aided Design, Nov. 1996, pp. 220–227.
[19] The MiniSat Page [Online]. Available: http://minisat.se/MiniSat.html
[20] V. C. Vimjam and M. S. Hsiao, “Increasing the deducibility in CNF instances for efficient SAT-based bounded model checking,” in Proc. High-Level Design, Validat. Test Workshop, 2005, pp. 184–191.
[21] A. Biere, “PicoSAT essentials,” J. Satisfiabil., Boolean Model. Comput., vol. 4, nos. 2–4, pp. 75–97, 2008.
[22] A. V. Aho, J. E. Hopcroft, and J. D. Ullman, Data Structures and Algorithms. Reading, MA: Addison-Wesley, 1983.
[23] H. Han, H. Jin, H. Kim, and F. Somenzi, “CirCUs 2.0: SAT competition,” in Proc. SAT Competition: Solver Description, Jun. 2009.
[24] VIS: A system for Verification and Synthesis [Online]. Available: http://vlsi.colorado.edu/~vis
[25] E. Goldberg and Y. Novikov, “Verification of proofs of unsatisfiability for CNF formulas,” in Proc. DATE, Mar. 2003, pp. 886–891.
[26] L. Zhang and S. Malik, “Validating SAT solvers using an independent resolution-based checker: Practical implementations and other applications,” in Proc. DATE, Mar. 2003, pp. 880–885.
[27] The International SAT Competitions [Online]. Available: http://www.satcompetition.org
[28] PrecoSAT [Online]. Available: http://fmv.jku.at/precosat
[29] K. L. McMillan, “Applying SAT methods in unbounded symbolic model checking,” in Proc. 14th Conf. CAV, LNCS 2404. Jul. 2002, pp. 250–264.