ACM SIGMOD Anthology ACM SIGMOD dblp.uni-trier.de

Statistical Profile Estimation in Database Systems.

Michael V. Mannino, Paicheng Chu, Thomas Sager: Statistical Profile Estimation in Database Systems. ACM Comput. Surv. 20(3): 191-221(1988)
@article{DBLP:journals/csur/ManninoCS88,
  author    = {Michael V. Mannino and
               Paicheng Chu and
               Thomas Sager},
  title     = {Statistical Profile Estimation in Database Systems},
  journal   = {ACM Comput. Surv.},
  volume    = {20},
  number    = {3},
  year      = {1988},
  pages     = {191-221},
  ee        = {db/journals/csur/ManninoCS88.html},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

A statistical profile summarizes the instances of a database. It describes aspects such as the number of tuples, the number of values, the distribution of values, the correlation between value sets, and the distribution of tuples among secondary storage units. Estimation of database profiles is critical in the problems of query optimization, physical database design, and database performance prediction. This paper describes a model of a database of profile, relates this model to estimating the cost of database operations, and surveys methods of estimating profiles. The operators and objects in the model include build profile, estimate profile, and update profile. The estimate operator is classified by the relational algebra operator (select, project, join), the property to be estimated (cardinality, distribution of values, and other parameters), and the underlying method (parametric, nonparametric, and ad-hoc). The accuracy, overhead, and assumptions of methods are discussed in detail. Relevant research in both the database and the statistics disciplines is incorporated in the detailed discussion.

Copyright © 1988 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 4 Issue 1, Books, VLDB-j, TODS, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 2" and ...

Online Edition: ACM Digital Library

Citation Page

References

[Apers et al. 1983]
Peter M. G. Apers, Alan R. Hevner, S. Bing Yao: Optimization Algorithms for Distributed Queries. IEEE Trans. Software Eng. 9(1): 57-68(1983) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Astrahan et al. 1985]
...
[Batory and Mannino 1986]
Don S. Batory, Michael V. Mannino: Panel on Extensible Database Systems. SIGMOD Conference 1986: 187-190 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Bernstein et al. 1981]
Philip A. Bernstein, Nathan Goodman, Eugene Wong, Christopher L. Reeve, James B. Rothnie Jr.: Query Processing in a System for Distributed Databases (SDD-1). ACM Trans. Database Syst. 6(4): 602-625(1981) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Breiman e tal. 1977]
...
[Cacoullos 1966]
...
[Cardenas 1975]
Alfonso F. Cardenas: Analysis and Performance of Inverted Data Base Structures. Commun. ACM 18(5): 253-263(1975) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Ceri and Pelagatti 1984]
Stefano Ceri, Giuseppe Pelagatti: Distributed Databases: Principles and Systems. McGraw-Hill Book Company 1984, ISBN 0-07-010829-3
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Chamberlin et al. 1981]
Donald D. Chamberlin, Morton M. Astrahan, W. Frank King III, Raymond A. Lorie, James W. Mehl, Thomas G. Price, Mario Schkolnick, Patricia G. Selinger, Donald R. Slutz, Bradford W. Wade, Robert A. Yost: Support for Repetitive Transactions and Ad Hoc Queries in System R. ACM Trans. Database Syst. 6(1): 70-94(1981) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Cheung 1982]
To-Yat Cheung: Estimating Block Accesses and Number of Recorde in File Management. Commun. ACM 25(7): 484-487(1982) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Christodoulakis 1981]
...
[Christodoulakis 1983a]
Stavros Christodoulakis: Estimating record selectivities. Inf. Syst. 8(2): 105-115(1983) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Christodoulakis 1983b]
Stavros Christodoulakis: Estimating Block Transfers and Join Sizes. SIGMOD Conference 1983: 40-54 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Christodoulakis 1984a]
Stavros Christodoulakis: Estimating Block Selectivities. Inf. Syst. 9(1): 69-79,(1984) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Christodoulakis 1984b]
Stavros Christodoulakis: Implications of Certain Assumptions in Database Performance Evaluation. ACM Trans. Database Syst. 9(2): 163-186(1984) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Clark and Schkade 1983 ]
...
[Copeland et al. 1986]
George P. Copeland, Setrag Khoshafian, Marc G. Smith, Patrick Valduriez: Buffering Schemes for Permanent Data. ICDE 1986: 214-221 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Devroye 1985]
...
[DeWitt et al. 1984]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Fedorowicz 1984]
Jane Fedorowicz: Database Evaluation Using Multiple Regression Techniques. SIGMOD Conference 1984: 70-76 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Fedorowicz 1987]
Jane Fedorowicz: Database Performance Evaluation in an Indexed File Environment. ACM Trans. Database Syst. 12(1): 85-110(1987) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Finkelstein et al. 1988]
Sheldon J. Finkelstein, Mario Schkolnick, Paolo Tiberio: Physical Database Design for Relational Databases. ACM Trans. Database Syst. 13(1): 91-128(1988) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Fraser 1957]
...
[Gelenbe and Gardy 1982]
Erol Gelenbe, Danièle Gardy: The Size of Projections of Relations Satisfying a Functional Dependency. VLDB 1982: 325-333 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Hevner and Yao 1979]
Alan R. Hevner, S. Bing Yao: Query Processing in Distributed Database Systems. IEEE Trans. Software Eng. 5(3): 177-187(1979) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Ijbema and Blanken 1986]
Alle IJbema, Henk M. Blanken: Estimating Bucket Accesses: A Practical Approach. ICDE 1986: 30-37 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Jarke and Koch 1984]
Matthias Jarke, Jürgen Koch: Query Optimization in Database Systems. ACM Comput. Surv. 16(2): 111-152(1984) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Johnson and Kotz 1970]
...
[Kamel and King 1985]
Nabil Kamel, Roger King: A Model of Data Distribution Based on Texture Analysis. SIGMOD Conference 1985: 319-325 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Kerschberg et al. 1982]
Larry Kerschberg, Peter D. Ting, S. Bing Yao: Query Optimization in Star Computer Networks. ACM Trans. Database Syst. 7(4): 678-711(1982) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Kooi 1980]
Robert Kooi: The Optimization of Queries in Relational Databases. Ph.D. thesis, Case Western Reserve University 1980
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Kotz and Johnson 1977]
...
[Kumar and Stonebraker 1987]
Akhil Kumar, Michael Stonebraker: The Effect of Join Selectivities on Optimal Nesting Order. SIGMOD Record 16(1): 28-41(1987) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Lohman et al. 1985]
Guy M. Lohman, C. Mohan, Laura M. Haas, Dean Daniels, Bruce G. Lindsay, Patricia G. Selinger, Paul F. Wilms: Query Processing in R*. Query Processing in Database Systems 1985: 31-47 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Luk 1983]
W. S. Luk: On Estimating Block Accesses in Database Organizations. Commun. ACM 26(11): 945-947(1983) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Mackert and Lohman 1985]
Lothar F. Mackert, Guy M. Lohman: Index Scans Using a Finite LRU Buffer: A Validated I/O Model. ACM Trans. Database Syst. 14(3): 401-424(1989) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Mackert and Lohman 1986a]
Lothar F. Mackert, Guy M. Lohman: R* Optimizer Validation and Performance Evaluation for Local Queries. SIGMOD Conference 1986: 84-95 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Mackert and Lohman 1986b]
Lothar F. Mackert, Guy M. Lohman: R* Optimizer Validation and Performance Evaluation for Distributed Queries. VLDB 1986: 149-159 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Mahalanobis 1936]
...
[Mannino 1986]
...
[Mannino and Rivera 1988]
...
[Merrett and Otoo 1979]
T. H. Merrett, Ekow J. Otoo: Distribution Models of Relations. VLDB 1979: 418-425 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Montgomery et al. 1983]
Anthony Y. Montgomery, Daryl J. D'Souza, S. B. Lee: The Cost of Relational Algebraic Operations on Skewed Data: Estimates and Experiments. IFIP Congress 1983: 235-241 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Moore and Yackel 1977]
...
[Muralikrishna and Dewitt 1988]
M. Muralikrishna, David J. DeWitt: Equi-Depth Histograms For Estimating Selectivity Factors For Multi-Dimensional Queries. SIGMOD Conference 1988: 28-36 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Muthuswamy and Kerschberg 1985]
...
[Piatetsky-Shapiro and Connell 1984]
Gregory Piatetsky-Shapiro, Charles Connell: Accurate Estimation of the Number of Tuples Satisfying a Condition. SIGMOD Conference 1984: 256-276 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Piatetsky-Shapiro 1985]
...
[Rowe 1985]
Neil C. Rowe: Antisampling for Estimation: An Overview. IEEE Trans. Software Eng. 11(10): 1081-1091(1985) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Samson and Bendell 1983]
W. B. Samson, A. Bendell: Rank Order Distributions and Secondary Key Indexing. Comput. J. 28(3): 309-312(1985) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Selinger et al. 1979]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Schkolnick and Tiberio 1979]
...
[Stonebraker et al. 1983]
Michael Stonebraker, W. Bradley Rubenstein, Antonin Guttman: Application of Abstract Data Types and Abstract Indices to CAD Data Bases. Engineering Design Applications 1983: 107-113 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Sturges 1926]
...
[Tapia and Thompson 1978]
...
[Teorey and Fry 1982]
Toby J. Teorey, James P. Fry: Design of Database Structures. Prentice-Hall 1982
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Unify Corporation 1985]
...
[Vander Zander etal. 1986]
Brad T. Vander Zanden, Howard M. Taylor, Dina Bitton: Estimating Block Accessses when Attributes are Correlated. VLDB 1986: 119-127 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Wegman 1983]
...
[Wertz 1978]
...
[Whang et al. 1983]
Kyu-Young Whang, Gio Wiederhold, Daniel Sagalowicz: Estimating Block Accesses in Database Organizations: A Closed Noniterative Formula. Commun. ACM 26(11): 940-944(1983) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Yao 1977]
S. Bing Yao: Approximating the Number of Accesses in Database Organizations. Commun. ACM 20(4): 260-261(1977) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Zahorjan et al. 1983]
John Zahorjan, Barbara J. Bell, Kenneth C. Sevcik: Estimating Block Transfers When Record Access Probabilities are Non-Uniform. Inf. Process. Lett. 16(5): 249-252(1983) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Zipf 1949]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Copyright © Tue Dec 1 16:30:23 2009 by Michael Ley (ley@uni-trier.de)