Statistical Profile Estimation in Database Systems.
Michael V. Mannino, Paicheng Chu, Thomas Sager:
Statistical Profile Estimation in Database Systems.
ACM Comput. Surv. 20(3): 191-221(1988)@article{DBLP:journals/csur/ManninoCS88,
author = {Michael V. Mannino and
Paicheng Chu and
Thomas Sager},
title = {Statistical Profile Estimation in Database Systems},
journal = {ACM Comput. Surv.},
volume = {20},
number = {3},
year = {1988},
pages = {191-221},
ee = {db/journals/csur/ManninoCS88.html},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
A statistical profile summarizes the instances of a database. It describes aspects such as
the number of tuples, the number of values, the distribution of values, the correlation
between value sets, and the distribution of tuples among secondary storage units.
Estimation of database profiles is critical in the problems of query optimization, physical
database design, and database performance prediction. This paper describes a model of a
database of profile, relates this model to estimating the cost of database operations, and
surveys methods of estimating profiles. The operators and objects in the model include
build profile, estimate profile, and update profile. The estimate operator is classified by
the relational algebra operator (select, project, join), the property to be estimated
(cardinality, distribution of values, and other parameters), and the underlying method
(parametric, nonparametric, and ad-hoc). The accuracy, overhead, and assumptions of
methods are discussed in detail. Relevant research in both the database and the statistics
disciplines is incorporated in the detailed discussion.
Copyright © 1988 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ...
DVD Version: Load ACM SIGMOD Anthology DVD 2" and ...
Citation Page
References
- [Apers et al. 1983]
- Peter M. G. Apers, Alan R. Hevner, S. Bing Yao:
Optimization Algorithms for Distributed Queries.
IEEE Trans. Software Eng. 9(1): 57-68(1983)

- [Astrahan et al. 1985]
- ...
- [Batory and Mannino 1986]
- Don S. Batory, Michael V. Mannino:
Panel on Extensible Database Systems.
SIGMOD Conference 1986: 187-190

- [Bernstein et al. 1981]
- Philip A. Bernstein, Nathan Goodman, Eugene Wong, Christopher L. Reeve, James B. Rothnie Jr.:
Query Processing in a System for Distributed Databases (SDD-1).
ACM Trans. Database Syst. 6(4): 602-625(1981)

- [Breiman e tal. 1977]
- ...
- [Cacoullos 1966]
- ...
- [Cardenas 1975]
- Alfonso F. Cardenas:
Analysis and Performance of Inverted Data Base Structures.
Commun. ACM 18(5): 253-263(1975)

- [Ceri and Pelagatti 1984]
- Stefano Ceri, Giuseppe Pelagatti:
Distributed Databases: Principles and Systems.
McGraw-Hill Book Company 1984, ISBN 0-07-010829-3

- [Chamberlin et al. 1981]
- Donald D. Chamberlin, Morton M. Astrahan, W. Frank King III, Raymond A. Lorie, James W. Mehl, Thomas G. Price, Mario Schkolnick, Patricia G. Selinger, Donald R. Slutz, Bradford W. Wade, Robert A. Yost:
Support for Repetitive Transactions and Ad Hoc Queries in System R.
ACM Trans. Database Syst. 6(1): 70-94(1981)

- [Cheung 1982]
- To-Yat Cheung:
Estimating Block Accesses and Number of Recorde in File Management.
Commun. ACM 25(7): 484-487(1982)

- [Christodoulakis 1981]
- ...
- [Christodoulakis 1983a]
- Stavros Christodoulakis:
Estimating record selectivities.
Inf. Syst. 8(2): 105-115(1983)

- [Christodoulakis 1983b]
- Stavros Christodoulakis:
Estimating Block Transfers and Join Sizes.
SIGMOD Conference 1983: 40-54

- [Christodoulakis 1984a]
- Stavros Christodoulakis:
Estimating Block Selectivities.
Inf. Syst. 9(1): 69-79,(1984)

- [Christodoulakis 1984b]
- Stavros Christodoulakis:
Implications of Certain Assumptions in Database Performance Evaluation.
ACM Trans. Database Syst. 9(2): 163-186(1984)

- [Clark and Schkade 1983 ]
- ...
- [Copeland et al. 1986]
- George P. Copeland, Setrag Khoshafian, Marc G. Smith, Patrick Valduriez:
Buffering Schemes for Permanent Data.
ICDE 1986: 214-221

- [Devroye 1985]
- ...
- [DeWitt et al. 1984]
- David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood:
Implementation Techniques for Main Memory Database Systems.
SIGMOD Conference 1984: 1-8

- [Fedorowicz 1984]
- Jane Fedorowicz:
Database Evaluation Using Multiple Regression Techniques.
SIGMOD Conference 1984: 70-76

- [Fedorowicz 1987]
- Jane Fedorowicz:
Database Performance Evaluation in an Indexed File Environment.
ACM Trans. Database Syst. 12(1): 85-110(1987)

- [Finkelstein et al. 1988]
- Sheldon J. Finkelstein, Mario Schkolnick, Paolo Tiberio:
Physical Database Design for Relational Databases.
ACM Trans. Database Syst. 13(1): 91-128(1988)

- [Fraser 1957]
- ...
- [Gelenbe and Gardy 1982]
- Erol Gelenbe, Danièle Gardy:
The Size of Projections of Relations Satisfying a Functional Dependency.
VLDB 1982: 325-333

- [Hevner and Yao 1979]
- Alan R. Hevner, S. Bing Yao:
Query Processing in Distributed Database Systems.
IEEE Trans. Software Eng. 5(3): 177-187(1979)

- [Ijbema and Blanken 1986]
- Alle IJbema, Henk M. Blanken:
Estimating Bucket Accesses: A Practical Approach.
ICDE 1986: 30-37

- [Jarke and Koch 1984]
- Matthias Jarke, Jürgen Koch:
Query Optimization in Database Systems.
ACM Comput. Surv. 16(2): 111-152(1984)

- [Johnson and Kotz 1970]
- ...
- [Kamel and King 1985]
- Nabil Kamel, Roger King:
A Model of Data Distribution Based on Texture Analysis.
SIGMOD Conference 1985: 319-325

- [Kerschberg et al. 1982]
- Larry Kerschberg, Peter D. Ting, S. Bing Yao:
Query Optimization in Star Computer Networks.
ACM Trans. Database Syst. 7(4): 678-711(1982)

- [Kooi 1980]
- Robert Kooi:
The Optimization of Queries in Relational Databases.
Ph.D. thesis, Case Western Reserve University 1980

- [Kotz and Johnson 1977]
- ...
- [Kumar and Stonebraker 1987]
- Akhil Kumar, Michael Stonebraker:
The Effect of Join Selectivities on Optimal Nesting Order.
SIGMOD Record 16(1): 28-41(1987)

- [Lohman et al. 1985]
- Guy M. Lohman, C. Mohan, Laura M. Haas, Dean Daniels, Bruce G. Lindsay, Patricia G. Selinger, Paul F. Wilms:
Query Processing in R*.
Query Processing in Database Systems 1985: 31-47

- [Luk 1983]
- W. S. Luk:
On Estimating Block Accesses in Database Organizations.
Commun. ACM 26(11): 945-947(1983)

- [Mackert and Lohman 1985]
- Lothar F. Mackert, Guy M. Lohman:
Index Scans Using a Finite LRU Buffer: A Validated I/O Model.
ACM Trans. Database Syst. 14(3): 401-424(1989)

- [Mackert and Lohman 1986a]
- Lothar F. Mackert, Guy M. Lohman:
R* Optimizer Validation and Performance Evaluation for Local Queries.
SIGMOD Conference 1986: 84-95

- [Mackert and Lohman 1986b]
- Lothar F. Mackert, Guy M. Lohman:
R* Optimizer Validation and Performance Evaluation for Distributed Queries.
VLDB 1986: 149-159

- [Mahalanobis 1936]
- ...
- [Mannino 1986]
- ...
- [Mannino and Rivera 1988]
- ...
- [Merrett and Otoo 1979]
- T. H. Merrett, Ekow J. Otoo:
Distribution Models of Relations.
VLDB 1979: 418-425

- [Montgomery et al. 1983]
- Anthony Y. Montgomery, Daryl J. D'Souza, S. B. Lee:
The Cost of Relational Algebraic Operations on Skewed Data: Estimates and Experiments.
IFIP Congress 1983: 235-241

- [Moore and Yackel 1977]
- ...
- [Muralikrishna and Dewitt 1988]
- M. Muralikrishna, David J. DeWitt:
Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries.
SIGMOD Conference 1988: 28-36

- [Muthuswamy and Kerschberg 1985]
- ...
- [Piatetsky-Shapiro and Connell 1984]
- Gregory Piatetsky-Shapiro, Charles Connell:
Accurate Estimation of the Number of Tuples Satisfying a Condition.
SIGMOD Conference 1984: 256-276

- [Piatetsky-Shapiro 1985]
- ...
- [Rowe 1985]
- Neil C. Rowe:
Antisampling for Estimation: An Overview.
IEEE Trans. Software Eng. 11(10): 1081-1091(1985)

- [Samson and Bendell 1983]
- W. B. Samson, A. Bendell:
Rank Order Distributions and Secondary Key Indexing.
Comput. J. 28(3): 309-312(1985)

- [Selinger et al. 1979]
- Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price:
Access Path Selection in a Relational Database Management System.
SIGMOD Conference 1979: 23-34

- [Schkolnick and Tiberio 1979]
- ...
- [Stonebraker et al. 1983]
- Michael Stonebraker, W. Bradley Rubenstein, Antonin Guttman:
Application of Abstract Data Types and Abstract Indices to CAD Data Bases.
Engineering Design Applications 1983: 107-113

- [Sturges 1926]
- ...
- [Tapia and Thompson 1978]
- ...
- [Teorey and Fry 1982]
- Toby J. Teorey, James P. Fry:
Design of Database Structures.
Prentice-Hall 1982

- [Unify Corporation 1985]
- ...
- [Vander Zander etal. 1986]
- Brad T. Vander Zanden, Howard M. Taylor, Dina Bitton:
Estimating Block Accessses when Attributes are Correlated.
VLDB 1986: 119-127

- [Wegman 1983]
- ...
- [Wertz 1978]
- ...
- [Whang et al. 1983]
- Kyu-Young Whang, Gio Wiederhold, Daniel Sagalowicz:
Estimating Block Accesses in Database Organizations: A Closed Noniterative Formula.
Commun. ACM 26(11): 940-944(1983)

- [Yao 1977]
- S. Bing Yao:
Approximating the Number of Accesses in Database Organizations.
Commun. ACM 20(4): 260-261(1977)

- [Zahorjan et al. 1983]
- John Zahorjan, Barbara J. Bell, Kenneth C. Sevcik:
Estimating Block Transfers When Record Access Probabilities are Non-Uniform.
Inf. Process. Lett. 16(5): 249-252(1983)

- [Zipf 1949]
- George Kingsley Zipf:
Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology.
Addison-Wesley 1949

Copyright © Tue Dec 1 16:30:23 2009
by Michael Ley (ley@uni-trier.de)