Graph-based Algorithms for Pareto Preference Query Evaluation

Timotheus Preisinger

Dissertation, University of Augsburg.
1st Examiner: Professor Dr. W. Kießling
2nd Examiner: Professor Dr. B. Möller
Published 07/2009


Searching a database is one of the most common procedures in everyday life. Usually, the results of such a search match the query parameters perfectly. If no perfect match is found, however, the user is left to work out alone how to change the search parameters in order to get any results.

To overcome this problem, Kießling has introduced a model of preferences in databases. This model is based on simple strict partial orders as given in expressions like “red is better than blue”. For every query, the best-matching objects are returned, whether these are perfect matches or not. A best match is a tuple that is no worse with respect to the preference than any other tuple or, as we say, that is not dominated by any other tuple. The specific problem we address is finding best matches for Pareto preferences, which combine several preferences that are all equally important. This problem is closely related to skyline queries.
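To make the notion of dominance concrete, the following is a minimal Python sketch of a naive best-match computation based on tuple-to-tuple comparisons. It is an illustration only, not code from the dissertation: the names `dominates` and `best_matches`, the representation of each preference as a function `better(a, b)`, and the example data are all assumptions made here.

```python
def dominates(t, u, prefs):
    """Return True if tuple t Pareto-dominates tuple u.

    prefs holds one function per attribute; better(a, b) is True when
    value a is strictly preferred to value b (a strict partial order).
    t dominates u if it is equal or better in every attribute and
    strictly better in at least one.
    """
    strictly_better = False
    for a, b, better in zip(t, u, prefs):
        if better(a, b):
            strictly_better = True
        elif a != b:
            return False  # t is worse or incomparable in this attribute
    return strictly_better


def best_matches(tuples, prefs):
    """Naive quadratic scan: keep every tuple that no other tuple dominates."""
    return [t for t in tuples
            if not any(dominates(u, t, prefs) for u in tuples)]


# Hypothetical example: "red is better than blue", and a lower price is better.
def color_better(a, b):
    return (a, b) == ("red", "blue")


def price_better(a, b):
    return a < b


offers = [("red", 300), ("blue", 200), ("blue", 250), ("red", 350)]
result = best_matches(offers, [color_better, price_better])
```

In this example, `("blue", 250)` is dominated by `("blue", 200)` and `("red", 350)` by `("red", 300)`, while the two remaining offers are incomparable and both count as best matches.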

Based on the better-than graph, a visualization of the strict partial orders constructed by Pareto preferences, we have found a novel type of optimization called pruning that can be applied to all existing generic algorithms. While common generic algorithms rely on tuple-to-tuple comparisons to identify dominated tuples, our optimization technique uses the structure of the better-than graph to identify elements in the order that are definitely dominated by some given tuple. This enables us to omit many comparisons.

By further analysis of the better-than graph, we were able to find a new kind of algorithm. This generic algorithm, Hexagon, is capable of finding the best matches in some previously unknown set of tuples in linear time with respect to the size of the better-than graph. In addition to the standard algorithm, we present a number of optimizations that reduce its memory requirements. Hexagon is not limited to standard preference queries, however. We also address top-k queries with a variant of Hexagon. These queries return the best k tuples of an input relation with respect to some rating function.
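The following Python sketch illustrates how a single pass over the nodes of a better-than graph can yield the best matches in time linear in the size of that graph. It is written in the spirit of the description above and is not necessarily the procedure used in the dissertation. It assumes that every attribute value can be mapped to an integer level (0 being best), so that each node of the graph is one combination of levels; the names `lattice_best_matches`, `levels`, and `max_levels` are hypothetical.

```python
def lattice_best_matches(tuples, levels, max_levels):
    """Find best matches by one pass over all level combinations.

    levels(t)  -> sequence of integer levels for tuple t (0 is best)
    max_levels -> worst possible level per attribute
    """
    dims = len(max_levels)
    # Mixed-radix weights turn a level combination into a node id.
    weights = [1] * dims
    for i in range(dims - 2, -1, -1):
        weights[i] = weights[i + 1] * (max_levels[i + 1] + 1)
    size = weights[0] * (max_levels[0] + 1)

    # Step 1: assign every tuple to its node in the graph.
    buckets = [[] for _ in range(size)]
    for t in tuples:
        node = sum(l * w for l, w in zip(levels(t), weights))
        buckets[node].append(t)

    # Step 2: visit nodes in ascending id order. Every direct predecessor
    # (one level better in a single attribute) has a smaller id, so it
    # has already been processed when its successors are reached.
    covered = [False] * size  # non-empty, or dominated by a non-empty node
    result = []
    for node in range(size):
        dominated = False
        for i in range(dims):
            level_i = (node // weights[i]) % (max_levels[i] + 1)
            if level_i > 0 and covered[node - weights[i]]:
                dominated = True
                break
        if buckets[node] and not dominated:
            result.extend(buckets[node])
        covered[node] = dominated or bool(buckets[node])
    return result


# Hypothetical example: colour levels and price bands chosen for illustration.
color_level = {"red": 0, "blue": 1}
price_level = {200: 0, 250: 1, 300: 2, 350: 3}
offers = [("red", 300), ("blue", 200), ("blue", 250), ("red", 350)]
best = lattice_best_matches(
    offers,
    lambda t: (color_level[t[0]], price_level[t[1]]),
    [1, 3],
)
```

No tuple is ever compared with another tuple here: each node is inspected once, together with at most one predecessor per attribute. The cost is an array as large as the better-than graph, which is why memory requirements matter for this style of algorithm.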