Accurate Estimation of the Number of Tuples Satisfying a Condition.
Gregory Piatetsky-Shapiro, Charles Connell:
Accurate Estimation of the Number of Tuples Satisfying a Condition.
SIGMOD Conference 1984: 256-276@inproceedings{DBLP:conf/sigmod/Piatetsky-ShapiroC84,
author = {Gregory Piatetsky-Shapiro and
Charles Connell},
editor = {Beatrice Yormark},
title = {Accurate Estimation of the Number of Tuples Satisfying a Condition},
booktitle = {SIGMOD'84, Proceedings of Annual Meeting, Boston, Massachusetts,
June 18-21, 1984},
publisher = {ACM Press},
year = {1984},
pages = {256-276},
ee = {http://doi.acm.org/10.1145/602259.602294, db/conf/sigmod/Piatetsky-ShapiroC84.html},
crossref = {DBLP:conf/sigmod/84},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
We present a new method for estimating the number of tuples satisfying a
condition of the type attribute rel constant, where rel is one
of "=", ">", "<", ">=", "<=". Our method gives highly accurate,
yet easy to compute, estimates. We store information about attribute values as
a list of distribution steps (histograms where buckets, instead of
having equal width, have equal height). These distribution steps provide an
upper bound on the error when estimating the number of tuples satisfying a
condition. The estimation error can be arbitrarily reduced by increasing the
number of steps. We analyze desirable conditions that such estimates should
satisfy. Based on the distribution steps, we derive a set of estimation
formulas which minimize the worst-case error. We also present another set of
formulas which reduce the average-case error. Finally, we show how to use
sampling to compute a close approximation of the distribution steps very
quickly. The major applications of our method are in query optimization
and in answering statistical queries.
Copyright © 1984 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Online Version (ACM WWW Account required): Full Text in PDF Format
CDROM Version: Load the CDROM "Volume 1 Issue 2, SIGMOD '75-'92" and ...
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
Printed Edition
Beatrice Yormark (Ed.):
SIGMOD'84, Proceedings of Annual Meeting, Boston, Massachusetts, June 18-21, 1984.
ACM Press 1984
,
SIGMOD Record 14(2)
Contents
References
- [Blasgen 77]
- Mike W. Blasgen, Kapali P. Eswaran:
Storage and Access in Relational Data Bases.
IBM Systems Journal 16(4): 362-377(1977)

- [Christodoulakis 81]
- ...
- [Christodoulakis 83]
- Stavros Christodoulakis:
Estimating Block Transfers and Join Sizes.
SIGMOD Conference 1983: 40-54

- [Demolombe 80]
- Robert Demolombe:
Estimation of the Number of Tuples Satisfying a Query Expressed in Predicate Calculus Language.
VLDB 1980: 55-63

- [Dixon 79]
- ...
- [Luk 83]
- W. S. Luk:
On Estimating Block Accesses in Database Organizations.
Commun. ACM 26(11): 945-947(1983)

- [Piatetsky 84]
- ...
- [Rowe 83]
- Neil C. Rowe:
Top-Down Statistical Estimation on a Database.
SIGMOD Conference 1983: 135-145

- [Samson 83]
- W. B. Samson, A. Bendell:
Rank Order Distributions and Secondary Key Indexing.
Comput. J. 28(3): 309-312(1985)

- [Selinger 79]
- Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price:
Access Path Selection in a Relational Database Management System.
SIGMOD Conference 1979: 23-34

- [Whang 83]
- Kyu-Young Whang, Gio Wiederhold, Daniel Sagalowicz:
Estimating Block Accesses in Database Organizations: A Closed Noniterative Formula.
Commun. ACM 26(11): 940-944(1983)

- [Yao 77]
- S. Bing Yao:
Approximating the Number of Accesses in Database Organizations.
Commun. ACM 20(4): 260-261(1977)

- [Yao 79]
- S. Bing Yao:
Optimization of Query Evaluation Algorithms.
ACM Trans. Database Syst. 4(2): 133-155(1979)

- [Yu 78]
- Clement T. Yu, W. S. Luk, M. K. Siu:
On the Estimation of the Number of Desired Records with Respect to a Given Query.
ACM Trans. Database Syst. 3(1): 41-56(1978)

- [Zipf 49]
- George Kingsley Zipf:
Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology.
Addison-Wesley 1949

Copyright © Wed Dec 23 21:45:11 2009
by Michael Ley (ley@uni-trier.de)