Data Placement In Bubba.

George P. Copeland, William Alexander, Ellen E. Boughter, Tom W. Keller: Data Placement In Bubba. SIGMOD Conference 1988: 99-108
@inproceedings{DBLP:conf/sigmod/CopelandABK88,
  author    = {George P. Copeland and
               William Alexander and
               Ellen E. Boughter and
               Tom W. Keller},
  editor    = {Haran Boral and
               Per-{\AA}ke Larson},
  title     = {Data Placement In Bubba},
  booktitle = {Proceedings of the 1988 ACM SIGMOD International Conference on
               Management of Data, Chicago, Illinois, June 1-3, 1988},
  publisher = {ACM Press},
  year      = {1988},
  pages     = {99-108},
  ee        = {http://doi.acm.org/10.1145/50202.50213, db/conf/sigmod/CopelandABK88.html},
  crossref  = {DBLP:conf/sigmod/88},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

This paper examines the problem of data placement in Bubba, a highly-parallel system for data-intensive applications being developed at MCC. "Highly-parallel" implies that load balancing is a critical performance issue. "Data-intensive" means data is so large that operations should be executed where the data resides. As a result, data placement becomes a critical performance issue.

In general, determining the optimal placement of data across processing nodes for performance is a difficult problem. We describe our heuristic approach to solving the data placement problem in Bubba. We then present experimental results using a specific workload to provide insight into the problem. Several researchers have argued the benefits of declustering (i.e., spreading each base relation over many nodes). We show that as declustering is increased, load balancing continues to improve. However, for transactions involving complex joins, further declustering reduces throughput because of communications, startup and termination overhead.

We argue that data placement, especially declustering, in a highly-parallel system must be considered early in the design, so that mechanisms can be included for supporting variable declustering, for minimizing the most significant overheads associated with large-scale declustering, and for gathering the required statistics.

Copyright © 1988 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

Online Version (ACM WWW Account required): Full Text in PDF Format

CDROM Version: Load the CDROM "Volume 1 Issue 2, SIGMOD '75-'92" and ...

DVD Version: Load "ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

Haran Boral, Per-Åke Larson (Eds.): Proceedings of the 1988 ACM SIGMOD International Conference on Management of Data, Chicago, Illinois, June 1-3, 1988. ACM Press 1988, SIGMOD Record 17(2), June 1988

Online Edition: ACM Digital Library


References

[Ale87]
William Alexander, Tom W. Keller, Ellen E. Boughter: A Workload Characterization Pipeline for Models of Parallel Systems. SIGMETRICS 1987: 186-194
[Ale88]
William Alexander, George P. Copeland: Comparison of Dataflow Control Techniques In Distributed Data-Intensive Systems. SIGMETRICS 1988: 157-166
[AlC88]
William Alexander, George P. Copeland: Process And Dataflow Control In Distributed Data-Intensive Systems. SIGMOD Conference 1988: 90-98
[Ano85]
...
[Att84]
Rony Attar, Philip A. Bernstein, Nathan Goodman: Site Initialization, Recovery, and Backup in a Distributed Database System. IEEE Trans. Software Eng. 10(6): 645-650(1984)
[Bat82]
Don S. Batory: Optimal File Designs and Reorganization Points. ACM Trans. Database Syst. 7(1): 60-81(1982)
[Bou87]
...
[Bun84]
Richard B. Bunt, Jennifer M. Murphy, Shikharesh Majumdar: A Measure of Program Locality and Its Application. SIGMETRICS 1984: 28-40
[Chu69]
...
[Cve87]
Zarka Cvetanovic: The Effects of Problem Partitioning, Allocation, and Granularity on the Performance of Multiple-Processor Systems. IEEE Trans. Computers 36(4): 421-432(1987)
[Den78]
Peter J. Denning, Jeffrey P. Buzen: The Operational Analysis of Queueing Network Models. ACM Comput. Surv. 10(3): 225-261(1978)
[DeW86]
David J. DeWitt, Robert H. Gerber, Goetz Graefe, Michael L. Heytens, Krishna B. Kumar, M. Muralikrishna: GAMMA - A High Performance Dataflow Database Machine. VLDB 1986: 228-237
[DeW87]
David J. DeWitt, Shahram Ghandeharizadeh, Donovan A. Schneider, Rajiv Jauhari, M. Muralikrishna, Anoop Sharma: A Single User Evaluation of the Gamma Database Machine. IWDM 1987: 370-386
[Eas74]
Kapali P. Eswaran: Placement of Records in a File and File Allocation in a Computer. IFIP Congress 1974: 304-307
[Flo78]
André Flory, J. Gunther, Jacques Kouloumdjian: Data Base Reorganization by Clustering Methods. Inf. Syst. 3(1): 59-62(1978)
[Gra78]
Jim Gray: Notes on Data Base Operating Systems. Advanced Course: Operating Systems 1978: 393-481
[Gra87]
Jim Gray, Gianfranco R. Putzolu: The 5 Minute Rule for Trading Memory for Disk Accesses and The 10 Byte Rule for Trading Memory for CPU Time. SIGMOD Conference 1987: 395-398
[Hwa84]
...
[Jak80]
Matti Jakobsson: Reducing block accesses in inverted files by partial clustering. Inf. Syst. 5(1): 1-5(1980)
[Kat78]
...
[Laz84]
...
[Liv87]
Miron Livny, Setrag Khoshafian, Haran Boral: Multi-Disk Management Algorithms. SIGMETRICS 1987: 69-77
[Mah76]
Samy A. Mahmoud, J. Spruce Riordon: Optimal Allocation of Resources in Distributed Information Networks. ACM Trans. Database Syst. 1(1): 66-78(1976)
[Mar76]
K. Maruyama, S. E. Smith: Optimal Reorganization of Distributed Space Disk Files. Commun. ACM 19(11): 634-642(1976)
[Muk87]
Ravi Mukkamala, Steven C. Bruell, Roger K. Shultz: Design of Partially Replicated Distributed Database Systems: An Integrated Methodology. SIGMETRICS 1988: 187-196
[Omi83]
...
[Sam87]
...
[Shn73]
Ben Shneiderman: Optimum Data Base Reorganization Points. Commun. ACM 16(6): 362-365(1973)
[Soc79]
Gary H. Sockut, Robert P. Goldberg: Database Reorganization - Principles and Practice. ACM Comput. Surv. 11(4): 371-395(1979)
[Sto86]
Michael Stonebraker: The Case for Shared Nothing. IEEE Database Eng. Bull. 9(1): 4-9(1986)
[Tan87]
Tandem Database Group - NonStop SQL: A Distributed, High-Performance, High-Availability Implementation of SQL. HPTS 1987: 60-104
[Ter85]
...
[Tue78]
William G. Tuel Jr.: Optimum Reorganization Points for Linearly Growing Files. ACM Trans. Database Syst. 3(1): 32-40(1978)
[Vrs85]
...
[Yao76]
S. Bing Yao, K. Sundar Das, Toby J. Teorey: A Dynamic Database Reorganization Algorithm. ACM Trans. Database Syst. 1(2): 159-174(1976)
[Yu85]
Clement T. Yu, Cheing-Mei Suen, K. Lam, M. K. Siu: Adaptive Record Clustering. ACM Trans. Database Syst. 10(2): 180-204(1985)

Copyright © Sun Nov 15 05:11:45 2009 by Michael Ley (ley@uni-trier.de)