The proof/plan correspondence in databases
We consider answering queries over datasources (for logicians: finite relational structures) that are connected to one another by integrity constraints (logical relationships). A relation in a datasource may have multiple means of access with different required arguments, and the means of access within and across relations may differ radically in cost. We present a method for generating the lowest cost query plan, where a plan is a reformulation of the query referencing only the allowable access methods, equivalent to the original query on all datasources satisfying the integrity constraints. A solution to this problem can be applied in many database contexts, including answering queries using views, optimizing queries using constraints, and querying on top of web services or web forms. It could also be applied to give a more logic-based foundation for traditional database query planning.
Proof-driven query reformulation generates plans by applying interpolation procedures to a “proof that the query can be answered,” an approach that goes back at least to Craig’s proof of the Beth Definability theorem via interpolation. A few additional ingredients are needed to apply interpolation to database problems, including:
— variations of Beth Definability that allow us to target plans (explicit definitions) of specific forms
— an extension of interpolation that considers the “access patterns” within quantification
— specialized proof procedures for classes of integrity constraints of interest to databases
— cost-based search procedures for concurrently exploring the collection of proofs and plans
The resulting query planning system blends automated deduction, query optimization, and heuristic search.
This is joint work with Balder ten Cate, Julien Leblay and Efi Tsamoura.