Skyline Cardinality for Relational Processing

Capítulo de livro Revisado por pares

Skyline Cardinality for Relational Processing

2004; Springer Science+Business Media; Linguagem: Inglês

10.1007/978-3-540-24627-5_7

ISSN

1611-3349

Autores

Parke Godfrey,

Tópico(s)

Constraint Satisfaction and Optimization

Resumo

The skyline clause—also called the Pareto clause—recently has been proposed as an extension to SQL. It selects the tuples that are Pareto optimal with respect to a set of designated skyline attributes. This is the maximal vector problem in a relational context, but it represents a powerful extension to SQL which allows for the natural expression of on-line analytic processing (OLAP) queries and preferences in queries. Cardinality estimation of skyline sets is the focus in this work. A better understanding of skyline cardinality—and other properties of the skyline—is useful for better design of skyline algorithms, is necessary to extend a query optimizer's cost model to accommodate skyline queries, and helps to understand better how to use skyline effectively for OLAP and preference queries. Within a basic model with assumptions of sparseness of values on attributes' domains and statistical independence across attributes, we establish the expected skyline cardinality for skyline queries. While asymptotic bounds have been previously established, they are not widely known nor applied in skyline work. We show concrete estimates, as would be needed in a cost model, and consider the nature of the distribution of skyline. We next establish the effects on skyline cardinality as the constraints on our basic model are relaxed. Some of the results are quite counter-intuitive, and understanding these is critical to skyline's use in OLAP and preference queries. We consider when attributes' values repeat on their domains, and show the number of skyline is diminished. We consider the effects of having Zipfian distributions on the attributes' domains, and generalize the expectation for other distributions. Last, we consider the ramifications of correlation across the attributes.

Ver no editor

Altmetric

PlumX

Entrar

Lembrar minha senha

Receber meu e-mail de confirmação

Skyline Cardinality for Relational Processing