Brief Tutorial on Probabilistic Databases
Calvin Lab Auditorium
Probabilistic Databases (PDBs) extend traditional relational databases by annotating each record with a weight, or a probability. Although PDBs define a very simple probability space, by adding constraints one can represent rich models such as Markov Logic Networks. While in traditional databases query evaluation corresponds to model checking and is always tractable, in probabilistic databases it becomes model counting, a computationally hard problem. Research on probabilistic databases has lead to new approaches to weighted model counting that exploit the structure of the query expression, similar to lifted inference in statistical relational models. For restricted fragments of First Order Logic, a dichotomy theorem holds: for any query in that fragment, weighted model counting is either in polynomial time, or is provably #P-hard, and the complexity can be determined by analyzing the query. I will also describe absolute lower bounds on the runtime of DPLL-based model counting algorithms, which rely on compiling the query into a Decision Diagram, then proving exponential size lower bounds on the size of the Decision Diagram.
Attachment | Size |
---|---|
Brief Tutorial on Probabilistic Databases | 6.17 MB |