Spring 2008 Database Seminar

Thursday Mar. 26th, 2008
CSE Room 305
12:00 - 1:00pm


Monte Carlo Methods for Evaluating Probabilistic Relational Predicates
Subramanian Arumugam


In this talk, I will describe statistical issues that arise when a user is allowed to specify database uncertainty via arbitrary pseudorandom functions that generate the data populating various "possible worlds". Allowing the user to specify such functions is the most general way to model probabilistic uncertainty, but it creates significant challenges. Specifically, evaluating a query such as "Find all vehicles that are in close proximity to one another with probability 'p'" requires the principled use of Monte Carlo statistical methods to determine whether the query predicate holds. I will describe very general algorithms for estimating the probability that a relational predicate evaluates to true over one or more probabilistic attributes, where the attributes are supplied only in the form of a pseudorandom value generator. Furthermore, I will describe techniques to speed up query evaluation using an index structure that stores precomputed samples from the pseudorandom attribute value generator.
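To make the estimation problem concrete, here is a minimal sketch of a plain Monte Carlo estimator for the probability that a predicate holds over attributes available only through value generators. It is not the speaker's algorithm: the function names, the Gaussian position model, the proximity radius, and the sample count are all assumptions chosen for illustration.

```python
import math
import random


def estimate_predicate_probability(generators, predicate, num_samples=10_000, seed=0):
    """Estimate P(predicate is true) by sampling each attribute generator.

    generators: list of callables taking a random.Random and returning one
                sampled attribute value (hypothetical interface).
    predicate:  callable taking one sampled value per generator.
    Returns (estimate, standard error of the estimate).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        # Draw one value per uncertain attribute, i.e. one "possible world".
        values = [gen(rng) for gen in generators]
        if predicate(*values):
            hits += 1
    p_hat = hits / num_samples
    # Standard error of a binomial proportion.
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / num_samples)
    return p_hat, std_err


def make_position_generator(cx, cy, sigma):
    """Assumed uncertainty model: a 2-D Gaussian around a reported position."""
    def gen(rng):
        return (rng.gauss(cx, sigma), rng.gauss(cy, sigma))
    return gen


def close_to(a, b, radius=1.0):
    """Predicate: two sampled positions lie within `radius` of each other."""
    return math.dist(a, b) <= radius


vehicle_a = make_position_generator(0.0, 0.0, 0.5)
vehicle_b = make_position_generator(0.5, 0.0, 0.5)
p_hat, std_err = estimate_predicate_probability([vehicle_a, vehicle_b], close_to)
print(f"estimated probability: {p_hat:.3f} +/- {std_err:.3f}")
```

A query with threshold p could then compare the estimate against p, using the standard error to judge whether more samples are needed before deciding. Drawing fresh samples for every pair of vehicles is the expensive step, which is the cost that an index over precomputed samples would aim to reduce.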
For upcoming talks, visit http://www.cise.ufl.edu/dbcenter/seminar.shtml.