Beyond Market Baskets: Generalizing Association Rules to Correlations.

Sergey Brin, Rajeev Motwani, Craig Silverstein: Beyond Market Baskets: Generalizing Association Rules to Correlations. SIGMOD Conference 1997: 265-276
@inproceedings{DBLP:conf/sigmod/BrinMS97,
  author    = {Sergey Brin and
               Rajeev Motwani and
               Craig Silverstein},
  editor    = {Joan Peckham},
  title     = {Beyond Market Baskets: Generalizing Association Rules to Correlations},
  booktitle = {SIGMOD 1997, Proceedings ACM SIGMOD International Conference
               on Management of Data, May 13-15, 1997, Tucson, Arizona, USA},
  publisher = {ACM Press},
  year      = {1997},
  pages     = {265-276},
  ee        = {http://doi.acm.org/10.1145/253260.253327, db/conf/sigmod/BrinMS97.html},
  crossref  = {DBLP:conf/sigmod/97},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

One of the most well-studied problems in data mining is mining for association rules in market basket data. Association rules, whose significance is measured via support and confidence, are intended to identify rules of the type, "A customer purchasing item A often also purchases item B." Motivated by the goal of generalizing beyond market baskets and the association rules used with them, we develop the notion of mining rules that identify correlations (generalizing associations), and we consider both the absence and presence of items as a basis for generating rules. We propose measuring significance of associations via the chi-squared test for correlation from classical statistics. This leads to a measure that is upward closed in the itemset lattice, enabling us to reduce the mining problem to the search for a border between correlated and uncorrelated itemsets in the lattice. We develop pruning strategies and devise an efficient algorithm for the resulting problem. We demonstrate its effectiveness by testing it on census data and finding term dependence in a corpus of text documents, as well as on synthetic data.
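
As a rough illustration of the chi-squared test named in the abstract, the sketch below computes the statistic for one pair of items from a 2x2 contingency table that counts both presence and absence. The function name, argument names, and basket counts are assumptions chosen for illustration and are not taken from this page; 3.84 is the standard 95% cutoff for one degree of freedom.

```python
def chi_squared_2x2(n_ab, n_a_only, n_b_only, n_neither):
    """Pearson chi-squared statistic for two items A and B.

    Rows are A present / A absent, columns are B present / B absent.
    (Hypothetical helper for illustration only.)
    """
    observed = [[n_ab, n_a_only], [n_b_only, n_neither]]
    n = n_ab + n_a_only + n_b_only + n_neither
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column total must be positive")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if A and B were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Assumed counts for 100 baskets: 20 with both items, 5 with A only,
# 70 with B only, 5 with neither.
chi2 = chi_squared_2x2(20, 5, 70, 5)
CUTOFF_95 = 3.84  # chi-squared critical value, 1 degree of freedom
print(round(chi2, 2), chi2 > CUTOFF_95)  # 3.7 False
```

With these assumed counts the rule "A implies B" has confidence 20/25 = 0.8, yet B appears in 90 of the 100 baskets overall, and the statistic (about 3.70) falls below 3.84, so the pair would not be flagged as correlated at the 95% level.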

Copyright © 1997 by the ACM, Inc., used by permission. Permission to make digital or hard copies is granted provided that copies are not made or distributed for profit or direct commercial advantage, and that copies show this notice on the first page or initial screen of a display along with the full citation.


ACM SIGMOD Anthology

Online Version (ACM WWW Account required): Full Text in PDF Format

CDROM Version: Load the CDROM "Volume 1 Issue 1, SIGMOD '93-'97" and ...

DVD Version: Load "ACM SIGMOD Anthology DVD 1" and ...

Printed Edition

Joan Peckham (Ed.): SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13-15, 1997, Tucson, Arizona, USA. ACM Press 1997, SIGMOD Record 26(2), June 1997

Online Edition: ACM Digital Library


References

[1]
Rakesh Agrawal, Manish Mehta, John C. Shafer, Ramakrishnan Srikant, Andreas Arning, Toni Bollinger: The Quest Data Mining System. KDD 1996: 244-249
[2]
Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases. SIGMOD Conference 1993: 207-216
[3]
Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Database Mining: A Performance Perspective. IEEE Trans. Knowl. Data Eng. 5(6): 914-925 (1993)
[4]
...
[5]
Rakesh Agrawal, Ramakrishnan Srikant: Fast Algorithms for Mining Association Rules in Large Databases. VLDB 1994: 487-499
[6]
...
[7]
Martin Dietzfelbinger, Anna R. Karlin, Kurt Mehlhorn, Friedhelm Meyer auf der Heide, Hans Rohnert, Robert Endre Tarjan: Dynamic Perfect Hashing: Upper and Lower Bounds. FOCS 1988: 524-531
[8]
...
[9]
Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy (Eds.): Advances in Knowledge Discovery and Data Mining. AAAI/MIT Press 1996, ISBN 0-262-56097-6
[10]
Michael L. Fredman, János Komlós, Endre Szemerédi: Storing a Sparse Table with O(1) Worst Case Access Time. J. ACM 31(3): 538-544 (1984)
[11]
Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Mining Optimized Association Rules for Numeric Attributes. PODS 1996: 182-191
[12]
Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama: Data Mining Using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization. SIGMOD Conference 1996: 13-23
[13]
Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh: Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE 1996: 152-159
[14]
Dimitrios Gunopulos, Heikki Mannila, Sanjeev Saluja: Discovering All Most Specific Sentences by Randomized Algorithms. ICDT 1997: 215-229
[15]
Jiawei Han, Yongjian Fu: Discovery of Multiple-Level Association Rules from Large Databases. VLDB 1995: 420-431
[16]
Maurice A. W. Houtsma, Arun N. Swami: Set-Oriented Mining for Association Rules in Relational Databases. ICDE 1995: 25-33
[17]
Mika Klemettinen, Heikki Mannila, Pirjo Ronkainen, Hannu Toivonen, A. Inkeri Verkamo: Finding Interesting Rules from Large Sets of Discovered Association Rules. CIKM 1994: 401-407
[18]
...
[19]
...
[20]
...
[21]
...
[22]
...
[23]
...
[24]
Jong Soo Park, Ming-Syan Chen, Philip S. Yu: An Effective Hash Based Algorithm for Mining Association Rules. SIGMOD Conference 1995: 175-186
[25]
...
[26]
Gregory Piatetsky-Shapiro, William J. Frawley (Eds.): Knowledge Discovery in Databases. AAAI/MIT Press 1991, ISBN 0-262-62080-4
[27]
Ashok Savasere, Edward Omiecinski, Shamkant B. Navathe: An Efficient Algorithm for Mining Association Rules in Large Databases. VLDB 1995: 432-444
[28]
Ramakrishnan Srikant, Rakesh Agrawal: Mining Generalized Association Rules. VLDB 1995: 407-419
[29]
Hannu Toivonen: Sampling Large Databases for Association Rules. VLDB 1996: 134-145
[30]
Jennifer Widom: Research Problems in Data Warehousing. CIKM 1995: 25-30

Copyright © Thu Dec 24 17:06:27 2009 by Michael Ley (ley@uni-trier.de)