dblp.uni-trier.de www.dagstuhl.de www.uni-trier.de

Computation of Partial Query Results Using An Adaptive Stratified Sampling Technique.

Augustine C. Ikeji, Farshad Fotouhi: Computation of Partial Query Results Using An Adaptive Stratified Sampling Technique. CIKM 1995: 145-149
@inproceedings{DBLP:conf/cikm/IkejiF95,
  author    = {Augustine C. Ikeji and
               Farshad Fotouhi},
  title     = {Computation of Partial Query Results Using An Adaptive Stratified
               Sampling Technique},
  booktitle = {CIKM '95, Proceedings of the 1995 International Conference on
               Information and Knowledge Management, November 28 - December
               2, 1995, Baltimore, Maryland, USA},
  publisher = {ACM},
  year      = {1995},
  pages     = {145-149},
  ee        = {http://doi.acm.org/10.1145/221270.221363},
  crossref  = {DBLP:conf/cikm/95},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The proliferation of databases into many areas has resulted in the need for fast and efficient techniques for query processing. Large databases take too much time to process, and time constrained queries force the query processor to render a result in the available time. For applications such as these, partial query results (PQR) may be satisfactory. Sampling techniques have been used to estimate aggregate query results [3], [5], [6], [7], [13], however a direct application of the sampling algorithms to obtain partial results in non-aggregate queries is not optimal because the goals are different.

In this paper, we propose an algorithm for obtaining PQR for non-aggregate queries. Our algorithm is based on stratification of the target relation and sampling of the strata over several stages. The results obtained from the preceding stages are used to guide the choice of which strata to sample in the following stages. In other words, strata that possess more of the desired quality (rich) get sampled more at the beginning of the query processing while those that are poor may be sampled at the later stages if time permits. The experimental performance of our algorithm is presented and compared with other techniques for obtaining partial or entire query results. The results show that our algorithm out-performs the other algorithms by finding more tuples that belong to the result set in the early stages of the query processing especially when the query selectivity is high or the sample size is at least five percent of the population size.

Copyright © 1995 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 2 Issue 4, CIKM, DOLAP, GIS, SIGFIDET, ..." and ... DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

CIKM '95, Proceedings of the 1995 International Conference on Information and Knowledge Management, November 28 - December 2, 1995, Baltimore, Maryland, USA. ACM 1995
Contents CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Online Edition

Citation Page

References

[1]
Arie Shoshani, Harry K. T. Wong: Statistical and Scientific Database Issues. IEEE Trans. Software Eng. 11(10): 1040-1047(1985) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[2]
Neil C. Rowe: Antisampling for Estimation: An Overview. IEEE Trans. Software Eng. 11(10): 1081-1091(1985) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[3]
Peter J. Haas, Arun N. Swami: Sequential Sampling Procedures for Query Size Estimation. SIGMOD Conference 1992: 341-350 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[4]
Y. C. Tay: On the Optimality of Strategies for Multiple Joins. J. ACM 40(5): 1067-1086(1993) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[5]
Frank Olken, Doron Rotem: Simple Random Sampling from Relational Databases. VLDB 1986: 160-169 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[6]
Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider: Practical Selectivity Estimation through Adaptive Sampling. SIGMOD Conference 1990: 1-11 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[7]
Stavros Christodoulakis: Estimating record selectivities. Inf. Syst. 8(2): 105-115(1983) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[8]
...
[9]
...
[10]
...
[11]
...
[12]
...
[13]
Frank Olken, Doron Rotem: Random Sampling from B+ Trees. VLDB 1989: 269-277 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[14]
Yibei Ling, Wei Sun: A Suppletment to Sampling-Based Methods for Query Size Estimation in a Database System. SIGMOD Record 21(4): 12-15(1992) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Last update Thu Sep 13 03:06:53 2012 CET by the DBLP TeamThis material is Open Data Data released under the ODC-BY 1.0 license — See also our legal information page