New Sampling-Based Summary Statistics for Improving Approximate Query Answers.
Phillip B. Gibbons, Yossi Matias:
New Sampling-Based Summary Statistics for Improving Approximate Query Answers.
SIGMOD Conference 1998: 331-342@inproceedings{DBLP:conf/sigmod/GibbonsM98,
author = {Phillip B. Gibbons and
Yossi Matias},
editor = {Laura M. Haas and
Ashutosh Tiwary},
title = {New Sampling-Based Summary Statistics for Improving Approximate
Query Answers},
booktitle = {SIGMOD 1998, Proceedings ACM SIGMOD International Conference
on Management of Data, June 2-4, 1998, Seattle, Washington, USA},
publisher = {ACM Press},
year = {1998},
isbn = {0-89791-995-5},
pages = {331-342},
ee = {http://doi.acm.org/10.1145/276304.276334},
crossref = {DBLP:conf/sigmod/98},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
In large data recording and warehousing environments, it is
often advantageous to provide fast, approximate answers to
queries, whenever possible. Before DBMSs providing
highly-accurate approximate answers can become a reality, many
new techniques for summarizing data and for estimating
answers from summarized data must be developed. This paper
introduces two new sampling-based summary statistics,
concise samples and counting samples, and presents new
techniques for their fast incremental maintenance regardless
of the data distribution. We quantify their advantages over
standard sample views in terms of the number of additional
sample points for the same view size, and hence in providing
more accurate query answers. Finally, we consider their
application to providing fast approximate answers to hot list
queries. Our algorithms maintain their accuracy in the
presence of ongoing insertions to the data warehouse.
Copyright © 1998 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ...
Online Version (ACM WWW Account required): Full Text in PDF Format
DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...
Printed Edition
Laura M. Haas, Ashutosh Tiwary (Eds.):
SIGMOD 1998, Proceedings ACM SIGMOD International Conference on Management of Data, June 2-4, 1998, Seattle, Washington, USA.
ACM Press 1998, ISBN 0-89791-995-5
, SIGMOD Record 27(2), June 1998
Contents
[Abstract]
[Full Text (Postscript)]
References
- [AMS96]
- ...
- [Ant92]
- Gennady Antoshenkov:
Random Sampling from Pseudo-Ranked B+ Trees.
VLDB 1992: 375-382

- [AS94]
- Rakesh Agrawal, Ramakrishnan Srikant:
Fast Algorithms for Mining Association Rules in Large Databases.
VLDB 1994: 487-499

- [AZ96]
- Gennady Antoshenkov, Mohamed Ziauddin:
Query Processing and Optimization in Oracle Rdb.
VLDB J. 5(4): 229-237(1996)

- [BDF+97]
- Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik:
The New Jersey Data Reduction Report.
IEEE Data Eng. Bull. 20(4): 3-45(1997)

- [BM96]
- Roberto J. Bayardo Jr., Daniel P. Miranker:
Processing Queries for First Few Answers.
CIKM 1996: 45-52

- [BMUT97]
- Sergey Brin, Rajeev Motwani, Jeffrey D. Ullman, Shalom Tsur:
Dynamic Itemset Counting and Implication Rules for Market Basket Data.
SIGMOD Conference 1997: 255-264

- [FJS97]
- Christos Faloutsos, H. V. Jagadish, Nikolaos Sidiropoulos:
Recovering Information from Summary Data.
VLDB 1997: 36-45

- [Fla85]
- Philippe Flajolet:
Approximate Counting: A Detailed Analysis.
BIT 25(1): 113-134(1985)

- [FM83]
- Philippe Flajolet, G. Nigel Martin:
Probabilistic Counting.
FOCS 1983: 76-82

- [FM85]
- Philippe Flajolet, G. Nigel Martin:
Probabilistic Counting Algorithms for Data Base Applications.
J. Comput. Syst. Sci. 31(2): 182-209(1985)

- [GM95]
- ...
- [GM97]
- ...
- [GMP97a]
- ...
- [GMP97b]
- Phillip B. Gibbons, Yossi Matias, Viswanath Poosala:
Fast Incremental Maintenance of Approximate Histograms.
VLDB 1997: 466-475

- [GPA+98]
- ...
- [HHW97]
- Joseph M. Hellerstein, Peter J. Haas, Helen J. Wang:
Online Aggregation.
SIGMOD Conference 1997: 171-182

- [HK95]
- ...
- [HNSS95]
- Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Lynne Stokes:
Sampling-Based Estimation of the Number of Distinct Values of an Attribute.
VLDB 1995: 311-322

- [IC93]
- Yannis E. Ioannidis, Stavros Christodoulakis:
Optimal Histograms for Limiting Worst-Case Error Propagation in the Size of Join Results.
ACM Trans. Database Syst. 18(4): 709-748(1993)

- [Ioa93]
- Yannis E. Ioannidis:
Universality of Serial Histograms.
VLDB 1993: 256-267

- [IP95]
- Yannis E. Ioannidis, Viswanath Poosala:
Balancing Histogram Optimality and Practicality for Query Result Size Estimation.
SIGMOD Conference 1995: 233-244

- [Mat92]
- ...
- [Mor78]
- Robert Morris:
Counting Large Numbers of Events in Small Registers.
Commun. ACM 21(10): 840-842(1978)

- [MSY96]
- ...
- [MVN93]
- ...
- [MVY94]
- ...
- [OR89]
- Frank Olken, Doron Rotem:
Random Sampling from B+ Trees.
VLDB 1989: 269-277

- [OR92]
- Frank Olken, Doron Rotem:
Maintenance of Materialized Views of Sampling Queries.
ICDE 1992: 632-641

- [PIHS96]
- Viswanath Poosala, Yannis E. Ioannidis, Peter J. Haas, Eugene J. Shekita:
Improved Histograms for Selectivity Estimation of Range Predicates.
SIGMOD Conference 1996: 294-305

- [Pre97]
- ...
- [Vit85]
- Jeffrey Scott Vitter:
Random Sampling with a Reservoir.
ACM Trans. Math. Softw. 11(1): 37-57(1985)

- [VL93]
- Susan V. Vrbsky, Jane W.-S. Liu:
APPROXIMATE - A Query Processor that Produces Monotonically Improving Approximate Answers.
IEEE Trans. Knowl. Data Eng. 5(6): 1056-1068(1993)

- [WVZT90]
- Kyu-Young Whang, Brad T. Vander Zanden, Howard M. Taylor:
A Linear-Time Probabilistic Counting Algorithm for Database Applications.
ACM Trans. Database Syst. 15(2): 208-229(1990)

Last update Tue Sep 18 00:25:17 2012
CET by the DBLP Team —
Data released under the ODC-BY 1.0 license — See also our legal information page