30. VLDB 2004: Toronto, Canada
Mario A. Nascimento, M. Tamer Özsu, Donald Kossmann, Renée J. Miller, José A. Blakeley, K. Bernhard Schiefer (Eds.): (e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, Toronto, Canada, August 31 - September 3 2004. Morgan Kaufmann 2004 ISBN 0-12-088469-0

Keynotes
David Yach: Databases in a Wireless World. 3
Alon Y. Halevy: Structures, Semantics and Statistics. 4-6
Ten Year Best Paper Award
Research Sessions
Research Session 1: Compression & Indexing
David S. Johnson, Shankar Krishnan, Jatin Chhugani, Subodh Kumar, Suresh Venkatasubramanian: Compressing Large Boolean Matrices using Reordering Techniques. 13-23
Kesheng Wu, Ekow J. Otoo, Arie Shoshani: On the performance of bitmap indices for high cardinality attributes. 24-35
Research Session 2: XML Views and Schemas

Andrey Balmin, Fatma Özcan, Kevin S. Beyer, Roberta Cochrane, Hamid Pirahesh: A Framework for Using Materialized XPath Views in XML Query Processing. 60-71
Research Session 3: Controlling Access
Luc Bouganim, François Dang Ngoc, Philippe Pucheral: Client-Based Access Control Management for XML documents. 84-95
Xiaochun Yang, Chen Li: Secure XML Publishing without Information Leakage in the Presence of Data Inference. 96-107
Kristen LeFevre, Rakesh Agrawal, Vuk Ercegovac, Raghu Ramakrishnan, Yirong Xu, David J. DeWitt: Limiting Disclosure in Hippocratic Databases. 108-119
Research Session 4: XML (I)
Laks V. S. Lakshmanan, Ganesh Ramesh, Wendy Hui Wang, Zheng (Jessica) Zhao: On Testing Satisfiability of Tree Pattern Queries. 120-131
Rajasekar Krishnamurthy, Raghav Kaushik, Jeffrey F. Naughton: Efficient XML-to-SQL Query Translation: Where to Add the Intelligence? 144-155

Research Session 5: Stream Mining

Shanzhong Zhu, Chinya V. Ravishankar: Stochastic Consistency, and Scalable Pull-Based Caching for Erratic Data Sources. 192-203
Jeffrey Xu Yu, Zhihong Chong, Hongjun Lu, Aoying Zhou: False Positive or False Negative: Mining Frequent Itemsets from High Speed Transactional Data Streams. 204-215
Research Session 6: XML (II)
Alberto O. Mendelzon, Flavio Rizzolo, Alejandro A. Vaisman: Indexing Temporal XML Documents. 216-227
Christoph Koch, Stefanie Scherzinger, Nicole Schweikardt, Bernhard Stegmaier: Schema-based Scheduling of Event Processors and Buffer Minimization for Queries on Structured Data Streams. 228-239
Wei Wang, Haifeng Jiang, Hongjun Lu, Jeffrey Xu Yu: Bloom Histogram: Path Selectivity Estimation for XML Data with Updates. 240-251
Research Session 7: XML and Relations

Alan Halverson, Vanja Josifovski, Guy M. Lohman, Hamid Pirahesh, Mathias Mörschel: ROX: Relational Over XML. 264-275
Vanessa P. Braganholo, Susan B. Davidson, Carlos A. Heuser: From XML View Updates to Relational View Updates: old solutions to a new problem. 276-287
Research Session 8: Stream Mining (II)
Sudipto Guha, Chulyun Kim, Kyuseok Shim: XWAVE: Approximate Extended Wavelets for Streaming Data. 288-299
Sudipto Guha, Kyuseok Shim, Jungchul Woo: REHIST: Relative Error Histogram Construction Algorithms. 300-311
Abhinandan Das, Sumit Ganguly, Minos N. Garofalakis, Rajeev Rastogi: Distributed Set Expression Cardinality Estimation. 312-323
Research Session 9: Stream Query Processing


Sirish Chandrasekaran, Michael J. Franklin: Remembrance of Streams Past: Overload-Sensitive Management of Archived Streams. 348-359
Sandeep Pandey, Kedar Dhamdhere, Christopher Olston: WIC: A General-Purpose Algorithm for Monitoring Web Information Sources. 360-371
Research Session 10: Managing Web Information Sources
Xin Dong, Alon Y. Halevy, Jayant Madhavan, Ema Nemes, Jun Zhang: Similarity Search for Web Services. 372-383
Andreas Thor, Erhard Rahm: AWESOME - A Data Warehouse-based System for Adaptive Website Recommendations. 384-395
Martin Ester, Hans-Peter Kriegel, Matthias Schubert: Accurate and Efficient Crawling for Relevant Websites. 396-407
Jiying Wang, Ji-Rong Wen, Frederick H. Lochovsky, Wei-Ying Ma: Instance-based Schema Matching for Web Databases by Domain-specific Query Probing. 408-419
Research Session 11: Distributed Search and Query Processing
Yuan Wang, David J. DeWitt: Computing PageRank in a Distributed Internet Search Engine System. 420-431
Boon Thau Loo, Joseph M. Hellerstein, Ryan Huebsch, Scott Shenker, Ion Stoica: Enhancing P2P File-Sharing with an Internet-Scale Query Processor. 432-443
Prasanna Ganesan, Mayank Bawa, Hector Garcia-Molina: Online Balancing of Range-Partitioned Data with Applications to Peer-to-Peer Systems. 444-455
Yanif Ahmad, Ugur Çetintemel: Networked Query Processing for Distributed Stream-Based Applications. 456-467
Anastasios Kementsietsidis, Marcelo Arenas: Data Sharing Through Query Translation in Autonomous Sources. 468-479
Research Session 12: Stream Data Management Systems
Arvind Arasu, Mitch Cherniack, Eduardo F. Galvez, David Maier, Anurag Maskey, Esther Ryvkina, Michael Stonebraker, Richard Tibbetts: Linear Road: A Stream Data Management Benchmark. 480-491
Yan-Nei Law, Haixun Wang, Carlo Zaniolo: Query Languages and Data Models for Database Sequences and Data Streams. 492-503
Research Session 13: Auditing
Richard T. Snodgrass, Shilong (Stanley) Yao, Christian S. Collberg: Tamper Detection in Audit Logs. 504-515
Rakesh Agrawal, Roberto J. Bayardo Jr., Christos Faloutsos, Jerry Kiernan, Ralf Rantzau, Ramakrishnan Srikant: Auditing Compliance with a Hippocratic Database. 516-527
Research Session 14: Data Warehousing
Research Session 15: Link Analysis

Andrey Balmin, Vagelis Hristidis, Yannis Papakonstantinou: ObjectRank: Authority-Based Keyword Search in Databases. 564-575
Research Session 16: Sensors, Grid, Pub/Sub
Amol Deshpande, Carlos Guestrin, Samuel Madden, Joseph M. Hellerstein, Wei Hong: Model-Driven Data Acquisition in Sensor Networks. 588-599
David T. Liu, Michael J. Franklin: The Design of GridDB: A Data-Centric Overlay for the Scientific Grid. 600-611
Yanlei Diao, Shariq Rizvi, Michael J. Franklin: Towards an Internet-Scale XML Dissemination Service. 612-623
Research Session 17: Top-K Ranking
Pavan Kumar C. Singitham, Mahathi S. Mahabhashyam, Prabhakar Raghavan: Efficiency-Quality Tradeoffs for Vector Score Aggregation. 624-635
Sudipto Guha, Nick Koudas, Amit Marathe, Divesh Srivastava: Merging the Results of Approximate Match Operations. 636-647
Martin Theobald, Gerhard Weikum, Ralf Schenkel: Top-k Query Evaluation with Probabilistic Guarantees. 648-659
Research Session 18: DBMS Architecture and Performance
Stavros Harizopoulos, Anastassia Ailamaki: STEPS towards Cache-resident Transaction Processing. 660-671
Goetz Graefe: Write-Optimized B-Trees. 672-683
Minglong Shao, Jiri Schindler, Steven W. Schlosser, Anastassia Ailamaki, Gregory R. Ganger: Clotho: Decoupling memory page layout from storage organization. 696-707
Research Session 19: Privacy
Gagan Aggarwal, Mayank Bawa, Prasanna Ganesan, Hector Garcia-Molina, Krishnaram Kenthapadi, Nina Mishra, Rajeev Motwani, Utkarsh Srivastava, Dilys Thomas, Jennifer Widom, Ying Xu: Vision Paper: Enabling Privacy for the Paranoids. 708-719
Radu Sion, Mikhail J. Atallah, Sunil Prabhakar: Resilient Rights Protection for Sensor Streams. 732-743
Research Session 20: Nearest Neighbor Search

Chenyi Xia, Hongjun Lu, Beng Chin Ooi, Jin Hu: Gorder: An Efficient Method for KNN Join Processing. 756-767
Christian S. Jensen, Dan Lin, Beng Chin Ooi: Query and Update Efficient B+-Tree Based Indexing of Moving Objects. 768-779
Research Session 21: Similarity Search and Applications
Eamonn J. Keogh, Themis Palpanas, Victor B. Zordan, Dimitrios Gunopulos, Marc Cardle: Indexing Large Human-Motion Databases. 780-791
Nick Koudas, Beng Chin Ooi, Kian-Lee Tan, Rui Zhang: Approximate NN queries on Streams with Guaranteed Error/performance Bounds. 804-815
Catriel Beeri, Yaron Kanza, Eliyahu Safra, Yehoshua Sagiv: Object Fusion in Geographic Information Systems. 816-827
Glenn S. Iwerks, Hanan Samet, Kenneth P. Smith: Maintenance of Spatial Semijoin Queries on Moving Points. 828-839
Mohammad R. Kolahdouzan, Cyrus Shahabi: Voronoi-Based K Nearest Neighbor Search for Spatial Network Databases. 840-851
Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu: A Framework for Projected Clustering of High Dimensional Data Streams. 852-863
Research Session 22: Query Processing

Reynold Cheng, Yuni Xia, Sunil Prabhakar, Rahul Shah, Jeffrey Scott Vitter: Efficient Indexing Methods for Probabilistic Threshold Queries over Uncertain Data. 876-887
Surajit Chaudhuri, Gautam Das, Vagelis Hristidis, Gerhard Weikum: Probabilistic Ranking of Database Query Results. 888-899
Research Session 23: Novel Models
Deepavali Bhagwat, Laura Chiticariu, Wang Chiew Tan, Gaurav Vijayvargiya: An Annotation Management System for Relational Databases. 900-911
Kenneth A. Ross, Julia Stoyanovich: Symmetric Relations and Cardinality-Bounded Multisets in Database Systems. 912-923
Research Session 24: Query Processing and Optimization

Amol Deshpande, Joseph M. Hellerstein: Lifting the Burden of History from Adaptive Query Processing. 948-959
Sailesh Krishnamurthy, Michael J. Franklin, Joseph M. Hellerstein, Garrett Jacobson: The Case for Precision Sharing. 972-986
Industrial and Application Sessions
Industrial Session 1: Novel SQL Extensions
Andreas Behm, Serge Rielau, Richard Swagerman: Returning Modified Rows - SELECT Statements with Side Effects. 987-997
Conor Cunningham, Goetz Graefe, César A. Galindo-Legaria: PIVOT and UNPIVOT: Optimization and Execution Strategies in an RDBMS. 998-1009
Walid Rjaibi, Paul Bird: A Multi-Purpose Implementation of Mandatory Access Control in Relational Database Management Systems. 1010-1020
Industrial Session 2: New DBMS Architectures and Performance
Nagender Bandi, Chengyu Sun, Amr El Abbadi, Divyakant Agrawal: Hardware Acceleration in Commercial Databases: A Case Study of Spatial Operations. 1021-1032
Sang Kyun Cha, Changbin Song: P*TIME: Highly Scalable OLTP DBMS for Managing Update-Intensive Stream Workload. 1033-1044
Industrial Session 3: Semantic Query Approaches
Souripriya Das, Eugene Inseok Chong, George Eadon, Jagannathan Srinivasan: Supporting Ontology-Based Semantic matching in RDBMS. 1054-1065
Sougata Mukherjea, Bhuvan Bamba: BioPatentMiner: An Information Retrieval System for BioMedical Patents. 1066-1077
Nick Koudas, Amit Marathe, Divesh Srivastava: Flexible String Matching Against Large Databases in Practice. 1078-1086
Industrial Session 4: Automatic Tuning in Commercial DBMSs
Daniel C. Zilio, Jun Rao, Sam Lightstone, Guy M. Lohman, Adam J. Storm, Christian Garcia-Arellano, Scott Fadden: DB2 Design Advisor: Integrated Automatic Physical Database Design. 1087-1097
Benoît Dageville, Dinesh Das, Karl Dias, Khaled Yagoub, Mohamed Zaït, Mohamed Ziauddin: Automatic SQL Tuning in Oracle 10g. 1098-1109
Sanjay Agrawal, Surajit Chaudhuri, Lubor Kollár, Arunprasad P. Marathe, Vivek R. Narasayya, Manoj Syamala: Database Tuning Advisor for Microsoft SQL Server 2005. 1110-1121
Industrial Session 5: XML Implementations, Automatic Physical Design and Indexing
Muralidhar Krishnaprasad, Zhen Hua Liu, Anand Manikutty, James W. Warner, Vikas Arora, Susan Kotsovolos: Query Rewrite for XML in Oracle XML DB. 1122-1133
Shankar Pal, Istvan Cseri, Gideon Schaller, Oliver Seeliger, Leo Giakoumakis, Vasili Zolotov: Indexing XML Data Stored in a Relational Database. 1134-1145
Ashraf Aboulnaga, Peter J. Haas, Sam Lightstone, Guy M. Lohman, Volker Markl, Ivan Popivanov, Vijayshankar Raman: Automated Statistics Collection in DB2 UDB. 1146-1157
Marcus Fontoura, Eugene J. Shekita, Jason Y. Zien, Sridhar Rajagopalan, Andreas Neumann: High Performance Index Build Algorithms for Intranet Search Engines. 1158-1169
Sam Lightstone, Bishwaranjan Bhattacharjee: Automating the design of multi-dimensional clustering tables in relational databases. 1170-1181
Industrial Session 6: Data Management with RFIDs and Ease of Use
Christof Bornhövd, Tao Lin, Stephan Haller, Joachim Schaper: Integrating Automatic Data Acquisition with Business Processes - Experiences with SAP's Auto-ID Infrastructure. 1182-1188
Sudarshan S. Chawathe, Venkat Krishnamurthy, Sridhar Ramachandran, Sanjay E. Sarma: Managing RFID Data. 1189-1195
David Campbell: Production Database Systems: Making Them Easy is Hard Work. 1196-1197
Industrial Session 7: Data Management Challenges in Life Sciences and Email Systems
Toby Bloom, Ted Sharpe: Managing Data from High-Throughput Genomic Processing: A Case Study. 1198-1201
Rakesh Nagarajan, Mushtaq Ahmed, Aditya Phatak: Database Challenges in the Integration of Biomedical Data Sets. 1202-1213
Industrial Session 8: Issues in Data Warehousing
William O'Connell: Trends in Data Warehousing: A Practitioner's View. 1224
Ramesh Bhashyam: Technology Challenges in a Data Warehouse. 1225-1226
Panel Sessions
Thodoros Topaloglou, Susan B. Davidson, H. V. Jagadish, Victor M. Markowitz, Evan W. Steeg, Mike Tyers: Biological Data Management: Research, Practice and Opportunities. 1233-1236
William O'Connell, Andrew Witkowski, Ramesh Bhashyam, Surajit Chaudhuri: Where is Business Intelligence taking today's Database Systems? 1237-1238
Tutorial Sessions
Anastassia Ailamaki: Database Architecture for New Hardware. 1241
Arnon Rosenthal, Marianne Winslett: Security of Shared Data in Large Systems: State of the Art and Research Directions. 1242
Surajit Chaudhuri, Benoît Dageville, Guy M. Lohman: Self-Managing Technology in Database Management Systems. 1243
Joseph M. Hellerstein: Architectures and Algorithms for Internet-Scale (P2P) Data Management. 1244
Demonstrations


Wei Fan: StreamMiner: A Classifier Ensemble-based Engine to Mine Concept-drifting Data Streams. 1257-1260
Xin Xu, Gao Cong, Beng Chin Ooi, Kian-Lee Tan, Anthony K. H. Tung: Semantic Mining and Analysis of Gene Expression Data. 1261-1264
Ji Zhang, Meng Lou, Tok Wang Ling, Hai H. Wang: HOS-Miner: A System for Detecting Outlying Subspaces of High-dimensional Data. 1265-1268
Jessica Lin, Eamonn J. Keogh, Stefano Lonardi, Jeffrey P. Lankford, Donna M. Nystrom: VizTree: a Tool for Visually Mining and Monitoring Massive Time Series Databases. 1269-1272
Serge Abiteboul, Bogdan Alexe, Omar Benjelloun, Bogdan Cautis, Irini Fundulaki, Tova Milo, Arnaud Sahuguet: An Electronic Patient Record "on Steroids": Distributed, Peer-to-Peer, Secure and Privacy-conscious. 1273-1276
Enrico Franconi, Gabriel M. Kuper, Andrei Lopatenko, Ilya Zaihrayeu: Queries and Updates in the coDB Peer to Peer Database System. 1277-1280
Haifeng Liu, Hans-Arno Jacobsen: A-ToPSS: A Publish/Subscribe System Supporting Imperfect Information Processing. 1281-1284
Zhengdao Xu, Hans-Arno Jacobsen: Efficient Constraint Processing for Highly Personalized Location Based Services. 1285-1288
Witold Litwin, Rim Moussa, Thomas J. E. Schwarz: LH*RS: A Highly Available Distributed Data Storage. 1289-1292
Hong Su, Elke A. Rundensteiner, Murali Mani: Semantic Query Optimization in an Automata-Algebra Combined XQuery Engine over XML Streams. 1293-1296
Fang Du, Sihem Amer-Yahia, Juliana Freire: ShreX: Managing XML Documents in Relational Databases. 1297-1300
Byron Choi, Wenfei Fan, Xibei Jia, Arek Kasprzyk: A Uniform System for Publishing and Maintaining XML Data. 1301-1304
Sabine Mayer, Torsten Grust, Maurice van Keulen, Jens Teubner: An Injection of Tree Awareness: Adding Staircase Join to PostgreSQL. 1305-1308
Christoph Koch, Stefanie Scherzinger, Nicole Schweikardt, Bernhard Stegmaier: FluXQuery: An Optimizing XQuery Processor for Streaming XML Data. 1309-1312
Jens Graupmann, Michael Biwer, Christian Zimmer, Patrick Zimmer, Matthias Bender, Martin Theobald, Gerhard Weikum: COMPASS: A Concept-based Web Search Engine for HTML, XML, and Deep Web Data. 1313-1316
Christian Halaschek-Wiener, Boanerges Aleman-Meza, Ismailcem Budak Arpinar, Amit P. Sheth: Discovering and Ranking Semantic Associations over a Large RDF Metabase. 1317-1320
Valter Crescenzi, Giansalvatore Mecca, Paolo Merialdo, Paolo Missier: An Automatic Data Grabber for Large Web Sites. 1321-1324
Karim Baïna, Boualem Benatallah, Hye-Young Paik, Farouk Toumani, Christophe Rey, Agnieszka Rutkowska, Bryan Harianto: WS-CatalogNet: An Infrastructure for Creating, Peering, and Querying e-Catalog Communities. 1325-1328
Halvard Skogsrud, Boualem Benatallah, Fabio Casati, Manh Q. Dinh: Trust-Serv: A Lightweight Trust Negotiation Service. 1329-1332
Parag Sarda, Jayant R. Haritsa: Green Query Optimization: Taming Query Optimization Overheads through Plan Recycling. 1333-1336
Vijayshankar Raman, Volker Markl, David E. Simmen, Guy M. Lohman, Hamid Pirahesh: Progressive Optimization in Action. 1337-1340
Ihab F. Ilyas, Volker Markl, Peter J. Haas, Paul G. Brown, Ashraf Aboulnaga: CORDS: Automatic Generation of Correlation Statistics in DB2. 1341-1344
Tobias Kraft, Holger Schwarz: CHICAGO: A Test and Evaluation Environment for Coarse-Grained Optimization. 1345-1348
Ernest Teniente, Carles Farré, Toni Urpí, Carlos Beltrán, David Gañán: SVT: Schema Validation Tool for Microsoft SQL-Server. 1349-1352
Elke A. Rundensteiner, Luping Ding, Timothy M. Sutherland, Yali Zhu, Bradford Pielech, Nishant K. Mehta: CAPE: Continuous Query Engine with Heterogeneous-Grained Adaptivity. 1353-1356
Owen Cooper, Anil Edakkunni, Michael J. Franklin, Wei Hong, Shawn R. Jeffery, Sailesh Krishnamurthy, Frederick Reiss, Shariq Rizvi, Eugene Wu: HiFi: A Unified Architecture for High Fan-in Systems. 1357-1360
Daniel J. Abadi, Wolfgang Lindner, Samuel Madden, Jörg Schuler: An Integration Framework for Sensor Networks and Data Stream Management Systems. 1361-1364
Sven Schmidt, Henrike Berthold, Wolfgang Lehner: QStream: Deterministic Querying of Data Streams. 1365-1368
Mehdi Sharifzadeh, Cyrus Shahabi, Bahareh Navai, Farid Parvini, Albert A. Rizzo: AIDA: an Adaptive Immersive Data Analyzer. 1369-1372
Özgür Ulusoy, Ugur Güdükbay, Mehmet Emin Dönderler, Ediz Saykol, Cemil Alper: BilVideo Video Database Management System. 1373-1376
Mohamed F. Mokbel, Xiaopeng Xiong, Walid G. Aref, Susanne E. Hambrusch, Sunil Prabhakar, Moustafa A. Hammad: PLACE: A Query Processor for Handling Real-time Spatio-temporal Data Streams. 1377-1380



