dblp.uni-trier.de www.dagstuhl.de www.uni-trier.de

Considering Data Skew Factor in Multi-Way Join Query Optimization for Parallel Execution.

Kien A. Hua, Yu-lung Lo, Honesty C. Young: Considering Data Skew Factor in Multi-Way Join Query Optimization for Parallel Execution. VLDB J. 2(3): 303-330(1993)
@article{DBLP:journals/vldb/HuaLY93,
  author    = {Kien A. Hua and
               Yu-lung Lo and
               Honesty C. Young},
  title     = {Considering Data Skew Factor in Multi-Way Join Query Optimization
               for Parallel Execution},
  journal   = {VLDB J.},
  volume    = {2},
  number    = {3},
  year      = {1993},
  pages     = {303-330},
  ee        = {http://www.vldb.org/journal/VLDBJ2/P303.pdf},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

A consensus on parallel architecture for very large database management has emerged. This architecture is based on a shared-nothing hardware organization. The computation model is very sensitive to skew in tuple distribution, however. Recently, several parallel join algorithms with dynamic load balancing capabilities have been proposed to address this issue, but none of them consider multi-way join problems. In this article we propose a dynamic load balancing technique for multi-way joins, and investigate the effect of load balancing on query optimization. In particular, we present a join-ordering strategy that takes load-balancing issues into consideration. Our performance study indicates that the proposed query optimization technique can provide very impressive performance improvements over conventional approaches.

Copyright © 1993 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Key Words

Parallel-database computer, load balancing, multi-way join, query optimization.

References

[Boral et al. 1990]
Haran Boral, William Alexander, Larry Clay, George P. Copeland, Scott Danforth, Michael J. Franklin, Brian E. Hart, Marc G. Smith, Patrick Valduriez: Prototyping Bubba, A Highly Parallel Database System. IEEE Trans. Knowl. Data Eng. 2(1): 4-24(1990) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Chen et al. 1992]
Ming-Syan Chen, Ming-Ling Lo, Philip S. Yu, Honesty C. Young: Using Segmented Right-Deep Trees for the Execution of Pipelined Hash Joins. VLDB 1992: 15-26 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[DeWitt et al. 1990]
David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Allan Bricker, Hui-I Hsiao, Rick Rasmussen: The Gamma Database Machine Project. IEEE Trans. Knowl. Data Eng. 2(1): 44-62(1990) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[DeWitt et al. 1984]
David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[DeWitt et al. 1992]
David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri: Practical Skew Handling in Parallel Joins. VLDB 1992: 27-40 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Englert et al. 1990]
Susanne Englert, Jim Gray, Terrye Kocher, Praful Shah: A Benchmark of NonStop SQL Release 2 Demonstrating Near-Linear Speedup and Scaleup on Large Databases. SIGMETRICS 1990: 245-246 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Hong & Stonebraker 1991]
Wei Hong, Michael Stonebraker: Optimization of Parallel Query Execution Plans in XPRS. PDIS 1991: 218-225 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Hua & Lee 1990]
Kien A. Hua, Chiang Lee: An Adaptive Data Placement Scheme for Parallel Database Computer Systems. VLDB 1990: 493-506 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Hua & Lee 1991]
Kien A. Hua, Chiang Lee: Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning. VLDB 1991: 525-535 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Hua et al. 1993]
Kien A. Hua, Yu-lung Lo, Honesty C. Young: Including the Load Balancing Issue in the Optimization of Multi-way Join Queries for Shared-Nothing Database Computers. PDIS 1993: 74-83 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Hua & Young 1990]
...
[Jordan 1978]
...
[Kitsuregawa & Ogawa 1990]
Masaru Kitsuregawa, Yasushi Ogawa: Bucket Spreading Parallel Hash: A New, Robust, Parallel Hash Join Method for Data Skew in the Super Database Computer (SDC). VLDB 1990: 210-221 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Kitsuregawa et al. 1983]
Masaru Kitsuregawa, Hidehiko Tanaka, Tohru Moto-Oka: Application of Hash to Data Base Machine and Its Architecture. New Generation Comput. 1(1): 63-74(1983) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Lakshmi & Yu 1988]
M. Seetha Lakshmi, Philip S. Yu: Effect of Skew on Join Performance in Parallel Architectures. DPDS 1988: 107-120 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Lakshmi & Yu 1989]
M. Seetha Lakshmi, Philip S. Yu: Limiting Factors of Join Performance on Parallel Processors. ICDE 1989: 488-496 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Lorie et al. 1991]
...
[Lu et al. 1991]
Hongjun Lu, Ming-Chien Shan, Kian-Lee Tan: Optimization of Multi-Way Join Queries for Parallel Execution. VLDB 1991: 549-560 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Sacco 1986]
Giovanni Maria Sacco: Fragmentation: A Technique for Efficient Query Processing. ACM Trans. Database Syst. 11(2): 113-133(1986) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Schneider & DeWitt 1989]
Donovan A. Schneider, David J. DeWitt: A Performance Evaluation of Four Parallel Join Algorithms in a Shared-Nothing Multiprocessor Environment. SIGMOD Conference 1989: 110-121 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Schneider & DeWitt 1990]
Donovan A. Schneider, David J. DeWitt: Tradeoffs in Processing Complex Join Queries via Hashing in Multiprocessor Database Machines. VLDB 1990: 469-480 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Selinger et al. 1979]
Patricia G. Selinger, Morton M. Astrahan, Donald D. Chamberlin, Raymond A. Lorie, Thomas G. Price: Access Path Selection in a Relational Database Management System. SIGMOD Conference 1979: 23-34 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Stonebraker 1986]
Michael Stonebraker: The Case for Shared Nothing. IEEE Database Eng. Bull. 9(1): 4-9(1986) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Stonebraker 1988]
Michael Stonebraker, Randy H. Katz, David A. Patterson, John K. Ousterhout: The Design of XPRS. VLDB 1988: 318-330 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Su 1983]
...
[Swami et al. 1992]
...
[Tan & Lu 1992]
...
[Teradata Corporation 1988]
...
[Turbyfill 1987]
...
[Walton et al. 1991]
Christopher B. Walton, Alfred G. Dale, Roy M. Jenevein: A Taxonomy and Performance Model of Data Skew Effects in Parallel Joins. VLDB 1991: 537-548 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Wolf et al. 1990]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu: An Effective Algorithm for Parallelizing Sort Merge in the Presence of Data Skew. DPDS 1990: 103-115 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Wolf et al. 1991a]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu, John Turek: An Effective Algorithm for Parallelizing Hash Joins in the Presence of Data Skew. ICDE 1991: 200-209 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Wolf et al. 1991b]
Joel L. Wolf, Daniel M. Dias, Philip S. Yu, John Turek: Comparative Performance of Parallel Join Algorithms. PDIS 1991: 78-88 CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Wong & Youssefi 1976]
Eugene Wong, Karel Youssefi: Decomposition - A Strategy for Query Processing. ACM Trans. Database Syst. 1(3): 223-241(1976) CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML
[Zipf 1949]
George Kingsley Zipf: Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley 1949
CiteSeerX Google scholar pubzone.org BibTeX bibliographical record in XML

Last update Fri Sep 14 18:29:07 2012 CET by the DBLP TeamThis material is Open Data Data released under the ODC-BY 1.0 license — See also our legal information page