Dynamic Maintenance of Data Distribution for Selectivity Estimation.
Kyu-Young Whang, Sang-Wook Kim, Gio Wiederhold:
Dynamic Maintenance of Data Distribution for Selectivity Estimation.
VLDB J. 3(1): 29-51(1994)@article{DBLP:journals/vldb/WhangKW94,
author = {Kyu-Young Whang and
Sang-Wook Kim and
Gio Wiederhold},
title = {Dynamic Maintenance of Data Distribution for Selectivity Estimation},
journal = {VLDB J.},
volume = {3},
number = {1},
year = {1994},
pages = {29-51},
ee = {db/journals/vldb/WhangKW94.html},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
We propose a new dynamic method for multidimensional selectivity estimation
for range queries that works accurately independent of data distribution.
Good estimation of selectivity is important for query optimization and physical
database design. Our method employs the multilevel grid file (MLGF) for
accurate estimation of multidimensional data distribution. The MLGF is a dynamic,
hierarchical, balanced, multidimensional file structure that gracefully adapts to
nonuniform and correlated distributions. We show that the MLGF directory naturally
represents a multidimensional data distribution. We then extent if for further
refinement and present the selectivity estimation method based on the MLGF.
Extensive experiments have been performed to test the accuracy of
selectivity estimation.
The results show that estimation errors are very small independent of
distributions, even with coorelated and/or highly skewed ones.
Finally, we analyze the cause of errors in estimation and investigate the
effects of various parameters on the accuracy of estimation.
Copyright © 1994 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Key Words
Query optimization,
physical database design,
multidimensional file structure,
multilevel grid files.
Online Paper
CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ...
DVD Version: Load ACM SIGMOD Anthology DVD 2" and ...
References
- [Astrahan et al. 1976]
- Morton M. Astrahan, Mike W. Blasgen, Donald D. Chamberlin, Kapali P. Eswaran, Jim Gray, Patricia P. Griffiths, W. Frank King III, Raymond A. Lorie, Paul R. McJones, James W. Mehl, Gianfranco R. Putzolu, Irving L. Traiger, Bradford W. Wade, Vera Watson:
System R: Relational Approach to Database Management.
ACM Trans. Database Syst. 1(2): 97-137(1976)

- [Chen et al. 1990]
- Meng Chang Chen, Lawrence McNamee, Norman S. Matloff:
Selectivity Estimation Using Homogeneity Measurement.
ICDE 1990: 304-310

- [Christodoulakis 1083]
- Stavros Christodoulakis:
Estimating record selectivities.
Inf. Syst. 8(2): 105-115(1983)

- [Fagin et al. 1979]
- Ronald Fagin, Jürg Nievergelt, Nicholas Pippenger, H. Raymond Strong:
Extendible Hashing - A Fast Access Method for Dynamic Files.
ACM Trans. Database Syst. 4(3): 315-344(1979)

- [Freeston 1987]
- Michael Freeston:
The BANG File: A New Kind of Grid File.
SIGMOD Conference 1987: 260-269

- [Ioannidis & Christodoulakis 1991]
- Yannis E. Ioannidis, Stavros Christodoulakis:
On the Propagation of Errors in the Size of Join Results.
SIGMOD Conference 1991: 268-277

- [Mannino et al. 1988]
- Michael V. Mannino, Paicheng Chu, Thomas Sager:
Statistical Profile Estimation in Database Systems.
ACM Comput. Surv. 20(3): 191-221(1988)

- [Muralikrishna & DeWitt 1988]
- M. Muralikrishna, David J. DeWitt:
Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries.
SIGMOD Conference 1988: 28-36

- [Muthuswamy & Kershberg 1985]
- B. Muthuswamy, Larry Kerschberg:
A Detailed Database Statistics Model for Realtional Query Optimization.
ACM Annual Conference - The range of computing: mid-80's perspective 1985: 439-448

- [Nievergelt et al. 1984]
- Jürg Nievergelt, Hans Hinterberger, Kenneth C. Sevcik:
The Grid File: An Adaptable, Symmetric Multikey File Structure.
ACM Trans. Database Syst. 9(1): 38-71(1984)

- [Otoo 1984]
- Ekow J. Otoo:
A Mapping Function for the Directory of a Multidimensional Extendible Hashing.
VLDB 1984: 493-506

- [Otoo 1986]
- Ekow J. Otoo:
Balanced Multidimensional Extendible Hash Tree.
PODS 1986: 100-113

- [Piatetsky & Connell 1984]
- Gregory Piatetsky-Shapiro, Charles Connell:
Accurate Estimation of the Number of Tuples Satisfying a Condition.
SIGMOD Conference 1984: 256-276

- [Robinson 1981]
- John T. Robinson:
The K-D-B-Tree: A Search Structure For Large Multidimensional Dynamic Indexes.
SIGMOD Conference 1981: 10-18

- [Robinson 1986]
- John T. Robinson:
Order Preserving Linear Hashing Using Dynamic Key Statistics.
PODS 1986: 91-99

- [Selinger et al. 1979]
- Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price:
Access Path Selection in a Relational Database Management System.
SIGMOD Conference 1979: 23-34

- [Selinger 1991]
- ...
- [Vander Zander et al. 1986]
- Brad T. Vander Zanden, Howard M. Taylor, Dina Bitton:
Estimating Block Accessses when Attributes are Correlated.
VLDB 1986: 119-127

- [Whang & Krihsnamurthy 1985]
- ...
- [Whang & Krishnamurthy 1990]
- Kyu-Young Whang, Ravi Krishnamurthy:
Query Optimization in a Memory-Resident Domain Relational Calculus Database System.
ACM Trans. Database Syst. 15(1): 67-95(1990)

- [Whang et al. 1990]
- Kyu-Young Whang, Brad T. Vander Zanden, Howard M. Taylor:
A Linear-Time Probabilistic Counting Algorithm for Database Applications.
ACM Trans. Database Syst. 15(2): 208-229(1990)

- [Whang & Krishnamurthy 1991]
- Kyu-Young Whang, Ravi Krishnamurthy:
The Multilevel Grid File - A Dynamic Hierarchical Multidimensional File Structure.
DASFAA 1991: 449-459

- [Wiederhold 1983]
- ...
- [Wiederhold 1987]
- Gio Wiederhold:
File Organisation for Database Design.
McGraw-Hill Book Company 1987, ISBN 0-07-100340-1

Copyright © Sun Nov 15 06:08:42 2009
by Michael Ley (ley@uni-trier.de)