Beyond Market Baskets: Generalizing Association Rules to Correlations.
Sergey Brin, Rajeev Motwani, Craig Silverstein:
Beyond Market Baskets: Generalizing Association Rules to Correlations.
SIGMOD Conference 1997: 265-276
@inproceedings{DBLP:conf/sigmod/BrinMS97,
author = {Sergey Brin and
Rajeev Motwani and
Craig Silverstein},
editor = {Joan Peckham},
title = {Beyond Market Baskets: Generalizing Association Rules to Correlations},
booktitle = {SIGMOD 1997, Proceedings ACM SIGMOD International Conference
on Management of Data, May 13-15, 1997, Tucson, Arizona, USA},
publisher = {ACM Press},
year = {1997},
pages = {265-276},
ee = {http://doi.acm.org/10.1145/253260.253327, db/conf/sigmod/BrinMS97.html},
crossref = {DBLP:conf/sigmod/97},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
One of the most well-studied problems in data mining is mining
for association rules in market basket data. Association
rules, whose significance is measured via support and confidence,
are intended to identify rules of the type, "A customer
purchasing item A often also purchases item B." Motivated
by the goal of generalizing beyond market baskets and the
association rules used with them, we develop the notion of
mining rules that identify correlations (generalizing associations),
and we consider both the absence and presence of
items as a basis for generating rules. We propose measuring
significance of associations via the chi-squared test for
correlation from classical statistics. This leads to a measure
that is upward closed in the itemset lattice, enabling us to reduce
the mining problem to the search for a border between
correlated and uncorrelated itemsets in the lattice. We develop
pruning strategies and devise an efficient algorithm for
the resulting problem. We demonstrate its effectiveness by
testing it on census data and finding term dependence in a
corpus of text documents, as well as on synthetic data.
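The chi-squared measure named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm (no lattice search or pruning is shown); it only computes the classical chi-squared statistic for one itemset, counting both presence and absence of each item. The function name `chi_squared`, the toy baskets, and the use of 3.84 as a cutoff (the 95% critical value for one degree of freedom) are assumptions made for illustration.

```python
from itertools import product

# Assumed threshold: 95% critical value of chi-squared with 1 degree of freedom.
CUTOFF = 3.84


def chi_squared(baskets, itemset):
    """Chi-squared statistic for independence of the items in `itemset`.

    Each basket falls into one cell of a 2^k contingency table, one cell
    per presence/absence pattern of the k items.
    """
    items = list(itemset)
    n = len(baskets)
    # Observed count for each presence/absence pattern.
    observed = {}
    for basket in baskets:
        cell = tuple(item in basket for item in items)
        observed[cell] = observed.get(cell, 0) + 1
    # Marginal probability that each item is present.
    p = [sum(item in basket for basket in baskets) / n for item in items]
    stat = 0.0
    for cell in product([True, False], repeat=len(items)):
        # Expected count if the items were independent.
        expected = n
        for present, p_i in zip(cell, p):
            expected *= p_i if present else 1.0 - p_i
        if expected > 0:
            stat += (observed.get(cell, 0) - expected) ** 2 / expected
    return stat


# Hypothetical toy data: 100 baskets in which "a" and "b" tend to co-occur.
baskets = [{"a", "b"}] * 40 + [{"a"}] * 10 + [{"b"}] * 10 + [set()] * 40
score = chi_squared(baskets, ("a", "b"))
print(score, score > CUTOFF)  # prints: 36.0 True
```

With both items present in half of the baskets, independence would put 25 baskets in each of the four cells; the observed counts 40, 10, 10, 40 give a statistic of 36.0, well above the assumed cutoff.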
Copyright © 1997 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Printed Edition
Joan Peckham (Ed.):
SIGMOD 1997, Proceedings ACM SIGMOD International Conference on Management of Data, May 13-15, 1997, Tucson, Arizona, USA.
ACM Press 1997, SIGMOD Record 26(2), June 1997
References
- [1]
- Rakesh Agrawal, Manish Mehta, John C. Shafer, Ramakrishnan Srikant, Andreas Arning, Toni Bollinger:
The Quest Data Mining System.
KDD 1996: 244-249

- [2]
- Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami:
Mining Association Rules between Sets of Items in Large Databases.
SIGMOD Conference 1993: 207-216

- [3]
- Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami:
Database Mining: A Performance Perspective.
IEEE Trans. Knowl. Data Eng. 5(6): 914-925 (1993)

- [4]
- ...
- [5]
- Rakesh Agrawal, Ramakrishnan Srikant:
Fast Algorithms for Mining Association Rules in Large Databases.
VLDB 1994: 487-499

- [6]
- ...
- [7]
- Martin Dietzfelbinger, Anna R. Karlin, Kurt Mehlhorn, Friedhelm Meyer auf der Heide, Hans Rohnert, Robert Endre Tarjan:
Dynamic Perfect Hashing: Upper and Lower Bounds.
FOCS 1988: 524-531

- [8]
- ...
- [9]
- Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, Ramasamy Uthurusamy (Eds.):
Advances in Knowledge Discovery and Data Mining.
AAAI/MIT Press 1996, ISBN 0-262-56097-6

- [10]
- Michael L. Fredman, János Komlós, Endre Szemerédi:
Storing a Sparse Table with O(1) Worst Case Access Time.
J. ACM 31(3): 538-544 (1984)

- [11]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama:
Mining Optimized Association Rules for Numeric Attributes.
PODS 1996: 182-191

- [12]
- Takeshi Fukuda, Yasuhiko Morimoto, Shinichi Morishita, Takeshi Tokuyama:
Data Mining Using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization.
SIGMOD Conference 1996: 13-23

- [13]
- Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh:
Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total.
ICDE 1996: 152-159

- [14]
- Dimitrios Gunopulos, Heikki Mannila, Sanjeev Saluja:
Discovering All Most Specific Sentences by Randomized Algorithms.
ICDT 1997: 215-229

- [15]
- Jiawei Han, Yongjian Fu:
Discovery of Multiple-Level Association Rules from Large Databases.
VLDB 1995: 420-431

- [16]
- Maurice A. W. Houtsma, Arun N. Swami:
Set-Oriented Mining for Association Rules in Relational Databases.
ICDE 1995: 25-33

- [17]
- Mika Klemettinen, Heikki Mannila, Pirjo Ronkainen, Hannu Toivonen, A. Inkeri Verkamo:
Finding Interesting Rules from Large Sets of Discovered Association Rules.
CIKM 1994: 401-407

- [18]
- ...
- [19]
- ...
- [20]
- ...
- [21]
- ...
- [22]
- ...
- [23]
- ...
- [24]
- Jong Soo Park, Ming-Syan Chen, Philip S. Yu:
An Effective Hash Based Algorithm for Mining Association Rules.
SIGMOD Conference 1995: 175-186

- [25]
- ...
- [26]
- Gregory Piatetsky-Shapiro, William J. Frawley (Eds.):
Knowledge Discovery in Databases.
AAAI/MIT Press 1991, ISBN 0-262-62080-4

- [27]
- Ashok Savasere, Edward Omiecinski, Shamkant B. Navathe:
An Efficient Algorithm for Mining Association Rules in Large Databases.
VLDB 1995: 432-444

- [28]
- Ramakrishnan Srikant, Rakesh Agrawal:
Mining Generalized Association Rules.
VLDB 1995: 407-419

- [29]
- Hannu Toivonen:
Sampling Large Databases for Association Rules.
VLDB 1996: 134-145

- [30]
- Jennifer Widom:
Research Problems in Data Warehousing.
CIKM 1995: 25-30

Copyright © Thu Dec 24 17:06:27 2009
by Michael Ley (ley@uni-trier.de)