19. VLDB 1993: Dublin, Ireland - Tutorials
Recent advances in processor and memory technology have given rise to increases in computational performance that far outstrip increases in the performance of secondary storage technology. Coupled with emerging small-disk technology, disk arrays provide the cost, volume, and capacity of traditional disk subsystems but, by leveraging parallelism, many times their performance. This tutorial will describe the effect of various configurations of data striping on I/O performance.
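As a rough illustration of what a striping configuration means, the sketch below maps a logical byte offset to a disk and an on-disk offset. It assumes a simple round-robin layout; the names `locate`, `disks_touched`, `num_disks` and `striping_unit` are hypothetical and are not taken from the tutorial.

```python
def locate(logical_byte, num_disks, striping_unit):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumes round-robin striping: consecutive striping units go to
    consecutive disks, wrapping around after the last disk.
    """
    unit_index = logical_byte // striping_unit       # which striping unit overall
    disk = unit_index % num_disks                    # round-robin disk choice
    row = unit_index // num_disks                    # full stripes preceding this unit
    offset = row * striping_unit + logical_byte % striping_unit
    return disk, offset


def disks_touched(start, length, num_disks, striping_unit):
    """Return the sorted disk indices that a request [start, start+length) touches."""
    first_unit = start // striping_unit
    last_unit = (start + length - 1) // striping_unit
    return sorted({unit % num_disks for unit in range(first_unit, last_unit + 1)})


if __name__ == "__main__":
    # A 16 KB request with a 4 KB striping unit is spread over all four disks,
    # while the same request with a 64 KB striping unit stays on one disk.
    print(disks_touched(0, 16384, num_disks=4, striping_unit=4096))
    print(disks_touched(0, 16384, num_disks=4, striping_unit=65536))
```

Under this assumed layout, a small striping unit spreads one request over many disks, while a large unit keeps it on one disk and leaves the others free for concurrent requests.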
Unfortunately, arrays of small disks may have much higher failure rates than the single large disks they replace. Redundant Arrays of Inexpensive Disks (RAID) use simple redundancy schemes to provide high data reliability. This tutorial will describe models for the performance costs and reliability enhancements of various configurations of redundant disk arrays.
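One simple redundancy scheme of the kind mentioned above is a single parity block computed as the byte-wise XOR of the data blocks. The sketch below is a minimal illustration under that assumption; the function names `xor_blocks` and `reconstruct` are hypothetical, and the tutorial may cover other schemes.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    data = [b"disk", b"arry", b"demo"]   # one block per data disk (made-up contents)
    parity = xor_blocks(data)            # stored on a separate parity disk
    lost = 1                             # pretend the second disk failed
    survivors = [block for i, block in enumerate(data) if i != lost]
    assert reconstruct(survivors, parity) == data[lost]
```

Because XOR is its own inverse, the same routine computes the parity and recovers a lost block. This tolerates only one failed disk per parity group.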
A rough outline of the tutorial is: Technology trends in secondary storage (shrinking form factors, processor versus disk performance trends); Motivation for and advantages of redundant disk arrays; Performance expectations for non-redundant disk arrays (disk striping - selecting a striping unit); Redundancy in disk arrays: motivation, taxonomy and performance implications; RAID levels, large write advantages, small write problem and solutions; Modelling redundant disk array reliability (orthogonal RAID, hot spares, dependent failure modes).
Garth Gibson is an Assistant Professor in the School of Computer Science and the Department of Electrical and Computer Engineering at Carnegie Mellon University. Dr. Gibson is also Laboratory Director for Storage and Computer System Integration in the NSF sponsored Data Storage Systems Center. He recently completed his Ph.D. in Computer Science at UC Berkeley while exploring the reliability and performance of redundant disk arrays. His dissertation tied for second place in the 1991 ACM Doctoral Dissertation Award competition. Dr. Gibson received the 1989 Computer Measurement Group Award, and has held an IBM Graduate Fellowship as well as a Natural Sciences and Engineering Research Council of Canada Postgraduate Scholarship.
This tutorial will concentrate on those aspects of industrial-strength transaction processing that are largely ignored in the research literature. It should educate even attendees with only rudimentary knowledge of DBMS concepts about issues and solutions relating to intricate concurrency control and recovery problems. It will allow attendees to knowledgeably evaluate the features of different systems in the future.
We discuss many subtleties involved in supporting very high concurrency with efficient storage management, indexing and recovery, and high availability. We present many recent research results in these areas. They apply to relational and non-relational systems. Throughout the presentation, in addition to concepts, great emphasis will be placed on performance and implementation concerns. Topics to be covered:
Transactions. Buffering. Locking. Latching. Recovery Methods. Shadow-Paging. Write-Ahead Logging. ARIES. High Availability. Online Archiving and Media Recovery. Indexing. Locking. Recovery. ARIES/KVL. ARIES/IM. Distributed Commit Protocol. X/Open and OSI Standard. "Presumed Abort". Products mentioned include: DB2, OS/2 EE DBM, NonStop SQL, Rdb, SQL/DS, IMS, Informix and Oracle.
The tutorial is intended for data base and transaction systems' designers, implementers and administrators, for users with high performance, availability and concurrency requirements, and for researchers in industry and academia.
Dr. Mohan, a member of the Computer Science Department and the IBM Data Base Technology Institute, has been a Research Staff Member at the IBM Almaden Research Center since 1981. He was a designer and an implementer of the R* distributed DBMS and the Starburst extensible DBMS. He has lectured extensively, and authored numerous conference and journal papers. His research ideas have been incorporated in several IBM products. He is the recipient of IBM Outstanding Innovation Awards for several of his co-inventions in database technology and has had several patents issued. He was the Program Chair of the 2nd International Workshop on High Performance Transaction Systems and was an Associate Editor of IEEE's Database Engineering Quarterly. He has been on the program committees of the conferences SIGMOD, PODS, ICDCS, VLDB, PDIS, HPTS and Compcon.
This tutorial examines the major architectural decisions in building an object-oriented DBMS for today's new applications, and considers the impact on performance of different architectures. Much of what will be discussed is relevant for extended relational systems as well. Topics include:
The new applications, computing environment, benchmarks, and performance goals;
Getting Data Into Memory:
Four architectures (query-response, object server, page server, file server) and six performance issues (OIDs, object fetching, buffering, format conversion, handles versus pointers, process boundaries);
Accessing Data in Memory:
Two paradigms (objects, pages) and ten performance issues (relationship to page and object servers, locking, authorisation, protection, references between objects, pointer swizzling, updates, size limitations, swapping, new trends in operating systems);
Storing Data Back on Disk:
Three architectural paradigms (object, page, and file servers) and four performance issues (recovery, merging changes from different transactions, format conversion, retaining locks after end-of-transaction).
Note: Attendees need not have previous background in OODBMS implementation techniques.
Marianne Winslett is an Associate Professor in the Department of Computer Science at the University of Illinois. Her research is on computer-aided design and automated reasoning, with an emphasis on architectures for next-generation database management systems and on database security. Professor Winslett received a Presidential Young Investigator Award in 1989 and a Xerox Junior Faculty Award in 1990, and was named a University Scholar by the University of Illinois in 1992. She has served on the program committee for ACM SIGMOD and other conferences in databases and automated reasoning, and is an associate editor of SIGMOD Record as well as the Secretary and Treasurer of ACM SIGART.
The vision of future information systems involves large numbers of heterogeneous, intelligent agents distributed over large computer/communication networks. Agents will request and acquire resources (e.g., processing, knowledge, data) without knowing what resources are required, how to acquire them, or how they will be orchestrated to achieve the desired result. The problems of legacy systems (i.e., existing systems, often developed using now-ancient technology) are more immediate. It is essential that the realisation of this vision be integrated into the current IS technology base. A challenge here is to develop technology that permits continuous enhancement and evolution of the current, massive investment in ISs. There are a large number of national and international consortia, guidelines, and standards that support initial responses to these requirements.
This tutorial provides a practical, intuitive, and conceptual understanding of interoperability in terms of case studies (legacy systems), a vision of the future, current trends and approaches, and research challenges. It provides a road map of the new area of interoperability and a basis for understanding. It addresses such topics as: distributed computing architectures and middleware, the repository, process re-engineering, re-use and reverse engineering, corporate information repositories, enterprise information architectures and integration, legacy system migration/evolution, distributed databases, transaction processing and monitors, object-orientation and gateways. Finally, the tutorial reviews some relevant standards activities, prototype systems, and products. The basic point of view is practical: You can't integrate it all. What is reasonable? Where do you start? How little technology must I know to achieve our interoperability goals?
Dr. Michael L. Brodie heads the Distributed Object Computing Department (DOM) within the Computer and Intelligent Systems Laboratory of GTE Laboratories Inc., Waltham, Mass. A fundamental challenge in the research of this Laboratory is the integration of currently disjoint systems and technologies. Prior to his appointment at GTE Laboratories Inc., Michael was a Senior Scientist in the Research Division of Computer Corporation of America (CCA) and was on the faculties of the Computer Science Department at the University of Maryland and the University of Toronto. He is an associate editor of the International Journal of Intelligent & Cooperative Information Systems (World Scientific). Michael has authored over 70 books, journal articles, and refereed conference papers. He is a member of the ACM SIGMOD Advisory Committee (1989-present), and a Trustee of the VLDB Foundation, 1992-present. He was chairman of the ANSI Relational Database Task Force Group, 1979-1982, and a member of the Advisory Committee of the IRIS Division of the National Science Foundation, 1988-1991. He has given invited lectures and short courses on Database Technology, Information Engineering, CASE, Integrating AI and Database Technologies, Intelligent Information Systems, and Next Generation Database Technology in over a dozen countries.
This tutorial provides an overview of the existing ANSI and ISO SQL-92 Standard and the work being done on its successor, which has the working title of SQL3. SQL3 might be approved in 1995 or 1996.
SQL-92 offers many enhancements to its predecessor, SQL-89. Among these are additional data types, greater orthogonality, outer join, catalogues, domains, assertions, temporary tables, referential actions, Schema Manipulation Language, dynamic SQL, connections, and information schema tables.
SQL3 offers many enhancements to its predecessor, SQL-92. The non-procedural enhancements include triggers, recursive query expressions, and view updatability. Procedural enhancements include multi-statement procedures with variables, control flow, and exception handlers. Object-Oriented enhancements include abstract data types, encapsulation, object identity, inheritance and polymorphism.
Andrew Eisenberg is an architect with the Database Systems Architecture Group, Digital Equipment Corporation at Nashua, NH. Andrew received his S.B. and S.M. degrees in Computer Science from the Massachusetts Institute of Technology in 1982. In addition to managing the development of SQL-based products, he has participated actively as a Member or Alternate Member of the ANSI X3H2 Technical Committee on Database Languages since 1985. Andrew has contributed extensively to the design of both SQL2 and SQL3.
Krishna Kulkarni has been with the NonStop SQL Group, Tandem Computers Inc. at Cupertino, California, since December 1992. Prior to joining Tandem, Krishna worked for eight years as a researcher with the Database Systems Research Group, Digital Equipment Corporation at Colorado Springs, Colorado. Krishna received his PhD in Computer Science from the University of Edinburgh (U.K.) in 1983. Krishna has contributed extensively to the design of object-oriented extensions in SQL3 and currently serves as a member of the ANSI X3H2 Technical Committee on Database Languages. Krishna is also a co-author of the book "Object-Oriented Databases: A Semantic Data Model Approach", published in the Prentice Hall International Series in Computer Science.