12. KDD 2006: Philadelphia, PA, USA
Tina Eliassi-Rad, Lyle H. Ungar, Mark Craven, Dimitrios Gunopulos (Eds.): Proceedings of the Twelfth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Philadelphia, PA, USA, August 20-23, 2006. ACM 2006, ISBN 1-59593-339-5
Conference invited talks
John A. Stankovic: Self-organizing wireless sensor networks in action. 1
Andrew Moore: New cached-sufficient statistics algorithms for quickly answering statistical questions. 2
Rakesh Agrawal: Next frontier. 3
Research track papers
Elke Achtert, Christian Böhm, Hans-Peter Kriegel, Peer Kröger, Arthur Zimek: Deriving quantitative models for correlation clusters. 4-13
Deepak Agarwal, Andrew McGregor, Jeff M. Phillips, Suresh Venkatasubramanian, Zhengyuan Zhu: Spatial scan statistics: approximations and performance study. 24-33
Aris Anagnostopoulos, Michail Vlachos, Marios Hadjieleftheriou, Eamonn J. Keogh, Philip S. Yu: Global distance-based segmentation of trajectories. 34-43
Lars Backstrom, Daniel P. Huttenlocher, Jon M. Kleinberg, Xiangyang Lan: Group formation in large social networks: membership, growth, and evolution. 44-54
Daniel Barbará, Carlotta Domeniconi, James P. Rogers: Detecting outliers using transduction and statistical testing. 55-64
Christian Böhm, Christos Faloutsos, Jia-Yu Pan, Claudia Plant: Robust information-theoretic clustering. 65-75
Gregory Buehrer, Srinivasan Parthasarathy, Amol Ghoting: Out-of-core frequent pattern mining on a commodity PC. 86-95
Toon Calders, Bart Goethals, Szymon Jaroszewicz: Mining rank-correlated sets of numerical attributes. 96-105
Jin Chen, Wynne Hsu, Mong-Li Lee, See-Kiong Ng: NeMoFinder: dissecting genome-wide protein-protein interactions with meso-scale network motifs. 106-115
Chris H. Q. Ding, Tao Li, Wei Peng, Haesun Park: Orthogonal nonnegative matrix t-factorizations for clustering. 126-135
Wei Fan, Joe McCloskey, Philip S. Yu: A general framework for accurate and fast regression by data summarization in random decision trees. 136-146
Wei Fan, Ian Davidson: Reverse testing: an efficient framework to select amongst classifiers under sample selection bias. 147-156
George Forman: Quantifying trends accurately despite classifier error and class imbalance. 157-166
Aristides Gionis, Heikki Mannila, Taneli Mielikäinen, Panayiotis Tsaparas: Assessing data mining results via swap randomization. 167-176
Kosuke Hashimoto, Kiyoko F. Aoki-Kinoshita, Nobuhisa Ueda, Minoru Kanehisa, Hiroshi Mamitsuka: A new efficient probabilistic model for mining labeled ordered trees. 177-186
Steven C. H. Hoi, Michael R. Lyu, Edward Y. Chang: Learning the unified kernel machines for classification. 187-196
Alexander T. Ihler, Jon Hutchins, Padhraic Smyth: Adaptive event detection with time-varying Poisson processes. 207-216
Thorsten Joachims: Training linear SVMs in linear time. 217-226
Yiping Ke, James Cheng, Wilfred Ng: Mining quantitative correlated patterns using an information-theoretic approach. 227-236
Arno J. Knobbe, Eric K. Y. Ho: Maximally informative k-itemsets and their efficient discovery. 237-244
Yehuda Koren, Stephen C. North, Chris Volinsky: Measuring and extracting proximity in networks. 245-255
Longin Jan Latecki, Marc Sobel, Rolf Lakämper: New EM derived from Kullback-Leibler divergence. 267-276

Bing Liu, Kaidi Zhao, Jeffrey Benkler, Weimin Xiao: Rule interestingness analysis using OLAP operations. 297-306
Elsa Loekito, James Bailey: Fast mining of high dimensional expressive contrast patterns using zero-suppressed binary decision diagrams. 307-316
Bo Long, Xiaoyun Wu, Zhongfei (Mark) Zhang, Philip S. Yu: Unsupervised learning on k-partite graphs. 317-326
Michael W. Mahoney, Mauro Maggioni, Petros Drineas: Tensor-CUR decompositions for tensor-based data. 327-336
Qiaozhu Mei, Dong Xin, Hong Cheng, Jiawei Han, ChengXiang Zhai: Generating semantic annotations for frequent patterns with context analysis. 337-346
Matthew J. Rattigan, Marc E. Maier, David Jensen: Using structure indices for efficient approximation of network properties. 357-366
Jimeng Sun, Dacheng Tao, Christos Faloutsos: Beyond streams and graphs: dynamic tensor analysis. 374-383
Lei Tang, Jianping Zhang, Huan Liu: Acclimatizing taxonomic semantics for hierarchical content classification from semantics to data-driven taxonomy. 384-393
Yufei Tao, Xiaokui Xiao, Shuigeng Zhou: Mining distance-based outliers from large databases in any metric space. 394-403
Hanghang Tong, Christos Faloutsos: Center-piece subgraphs: problem definition and fast solutions. 404-413
Xuerui Wang, Andrew McCallum: Topics over time: a non-Markov continuous-time model of topical trends. 424-433
Geoffrey I. Webb: Discovering significant rules. 434-443
Jieping Ye, Tie Wang: Regularized discriminant analysis for high dimensional, low sample size data. 454-463
Shipeng Yu, Kai Yu, Volker Tresp, Hans-Peter Kriegel, Mingrui Wu: Supervised probabilistic principal component analysis. 464-473
Qiankun Zhao, Tie-Yan Liu, Sourav S. Bhowmick, Wei-Ying Ma: Event detection from evolution of click-through data. 484-493
Jun Zhu, Zaiqing Nie, Ji-Rong Wen, Bo Zhang, Wei-Ying Ma: Simultaneous record detection and attribute labeling in web data extraction. 494-503
Research track posters

Charu C. Aggarwal, Jian Pei, Bo Zhang: On privacy preservation against adversarial data mining. 510-516
Bavani Arunasalam, Sanjay Chawla: CCCS: a top-down associative classifier for imbalanced class distribution. 517-522


Robin D. Burke, Bamshad Mobasher, Chad Williams, Runa Bhaumik: Classification features for attack detection in collaborative recommender systems. 542-547
Vitor R. Carvalho, William W. Cohen: Single-pass online learning: performance, voting schemes and online feature selection. 548-553
Aristides Gionis, Heikki Mannila, Kai Puolamäki, Antti Ukkonen: Algorithms for discovering bucket orders from data. 561-566
Hongyu Guo, Herna L. Viktor: Mining relational data through correlation-based multiple view validation. 567-573
Tomoharu Iwata, Kazumi Saito, Takeshi Yamada: Recommendation method for extending subscription periods. 574-579
Wolfgang Jank, Galit Shmueli, Shanshan Wang: Dynamic, real-time forecasting of online auctions via functional models. 580-585
Szymon Jaroszewicz: Polynomial association rules with applications to logistic regression. 586-591

Deept Kumar, Naren Ramakrishnan, Richard F. Helm, Malcolm Potts: Algorithms for storytelling. 604-610
Ravi Kumar, Jasmine Novak, Andrew Tomkins: Structure and evolution of online social networks. 611-617
Sven Laur, Helger Lipmaa, Taneli Mielikäinen: Cryptographically private support vector machines. 618-624
Hady Wirawan Lauw, Ee-Peng Lim, Ke Wang: Bias and controversy: beyond the statistical deviation. 625-630
Jinze Liu, Qi Zhang, Wei Wang, Leonard McMillan, Jan Prins: Clustering pair-wise dissimilarity data into partially ordered sets. 637-642
Dharmesh M. Maniyar, Ian T. Nabney: Visual data mining using principled projection algorithms and information visualization techniques. 643-648
Srujana Merugu, Saharon Rosset, Claudia Perlich: A new multi-view regression approach with an application to customer wallet estimation. 656-661
Riadh Ben Messaoud, Omar Boussaid, Sabine Loudcher Rabaséda: Efficient multidimensional data representations based on multiple correspondence analysis. 662-667
Fabian Mörchen: Algorithms for time series knowledge mining. 668-673
J. Saketha Nath, Chiranjib Bhattacharyya, M. Narasimha Murty: Clustering based large margin classification: a scalable approach using SOCP formulation. 674-679
Noam Palatin, Arie Leizarowitz, Assaf Schuster, Ran Wolff: Mining for misconfigured machines in grid systems. 687-692
Jia-Yu Pan, André G. R. Balan, Eric P. Xing, Agma J. M. Traina, Christos Faloutsos: Automatic mining of fruit fly embryo images. 693-698
Seung-Taek Park, David Pennock, Omid Madani, Nathan Good, Dennis DeCoste: Naïve filterbots for robust cold-start recommendations. 699-705
Myra Spiliopoulou, Irene Ntoutsi, Yannis Theodoridis, Rene Schult: MONIC: modeling and monitoring cluster transitions. 706-711
Fabian M. Suchanek, Georgiana Ifrim, Gerhard Weikum: Combining linguistic and statistical analysis to extract relations from web documents. 712-717
Bin Tan, Xuehua Shen, ChengXiang Zhai: Mining long-term search history to improve search accuracy. 718-723
Ivor W. Tsang, András Kocsor, James T. Kwok: Efficient kernel feature extraction for massive data sets. 724-729
Chao Wang, Srinivasan Parthasarathy: Summarizing itemset patterns using probabilistic models. 730-735
Haixun Wang, Jian Yin, Jian Pei, Philip S. Yu, Jeffrey Xu Yu: Suppressing model overfitting in mining concept-drifting data streams. 736-741
Steve Wedig, Omid Madani: A large-scale analysis of query logs for assessing personalization opportunities. 742-747
Raymond Chi-Wing Wong, Jiuyong Li, Ada Wai-Chee Fu, Ke Wang: (alpha, k)-anonymity: an enhanced k-anonymity model for privacy preserving data publishing. 754-759
Gang Wu, Edward Y. Chang, Yen-Kuang Chen, Christopher J. Hughes: Incremental approximate matrix factorization for speeding up support vector machines. 760-766
Dong Xin, Xuehua Shen, Qiaozhu Mei, Jiawei Han: Discovering interesting patterns through user's interactive feedback. 773-778
Jian Xu, Wei Wang, Jian Pei, Xiaoyuan Wang, Baile Shi, Ada Wai-Chee Fu: Utility-based anonymization using local recoding. 785-790
Illhoi Yoo, Xiaohua Hu, Il-Yeol Song: Integration of semantic-based bipartite graph representation and mutual refinement strategy for biomedical literature clustering. 791-796
Zhiping Zeng, Jianyong Wang, Lizhu Zhou, George Karypis: Coherent closed quasi-clique discovery from large dense graph databases. 797-802
Sheng Zhang, Amit Chakrabarti, James Ford, Fillia Makedon: Attack detection in time series for recommender systems. 809-814
Shichao Zhang, Feng Chen, Xindong Wu, Chengqi Zhang: Identifying bridging rules between conceptual clusters. 815-820
Tong Zhang, Alexandrin Popescul, Byron Dom: Linear prediction models with graph regularization for web-page categorization. 821-826
Lizhuang Zhao, Mohammed J. Zaki, Naren Ramakrishnan: BLOSOM: a framework for mining arbitrary boolean expressions. 827-832
Industrial and government applications track invited talks
Jeff Jonas: Introducing perpetual analytics. 833
William Kahn: Capital One's statistical problems: our top ten list. 834
Andrew McCallum: Information extraction, data mining and joint inference. 835
Michael Cavaretta: Data mining challenges in the automotive domain. 836
Industrial and government applications track papers
Jinbo Bi, Senthil Periaswamy, Kazunori Okada, Toshiro Kubota, Glenn Fung, Marcos Salganicoff, R. Bharat Rao: Computer aided detection via asymmetric cascade of sparse hyperplane classifiers. 837-844
Rebecca Castaño, Dominic Mazzoni, Nghia Tang, Ronald Greeley, Thomas Doggett, Benjamin Cichy, Steve A. Chien, Ashley Davies: Onboard classifiers for science event detection on a remote sensing spacecraft. 845-851
George Forman, Evan Kirshenbaum, Jaap Suermondt: Pragmatic text mining: minimizing human effort to quantify many issues in call logs. 852-861
Seth Hettich, Michael J. Pazzani: Mining for proposal reviewers: lessons learned at the National Science Foundation. 862-871
Chao Liu, Chen Chen, Jiawei Han, Philip S. Yu: GPLAG: detection of software plagiarism by program dependence graph analysis. 872-881
Fabian Mörchen, Ingo Mierswa, Alfred Ultsch: Understandable models of music collections based on exhaustive feature generation with temporal statistics. 882-891
Kaidi Zhao, Bing Liu, Jeffrey Benkler, Weimin Xiao: Opportunity map: identifying causes of failure - a deployed data mining system. 892-901
Industrial and government applications track posters
Eugene Agichtein, Zijian Zheng: Identifying "best bet" web search results by mining past user behavior. 902-908
Rich Caruana, Mohamed Farid Elhawary, Art Munson, Mirek Riedewald, Daria Sorokina, Daniel Fink, Wesley M. Hochachka, Steve Kelling: Mining citizen science data to predict prevalence of wild bird species. 909-915
Julien Etienne, Bernd Wachmann, Lei Zhang: A component-based framework for knowledge discovery in bioinformatics. 916-921
Byron J. Gao, Obi L. Griffith, Martin Ester, Steven J. M. Jones: Discovering significant OPSM subspace clusters in massive gene expression data. 922-928
Charles X. Ling, Victor S. Sheng, Tilmann F. W. Bruckhaus, Nazim H. Madhavji: Maximum profit mining and its application in software development. 929-934
Ingo Mierswa, Michael Wurst, Ralf Klinkenberg, Martin Scholz, Timm Euler: YALE: rapid prototyping for complex data mining tasks. 935-940
Sankar Virdhagriswaran, Gordon Dakin: Camouflaged fraud detection in domains with complex relationships. 941-947
Lian Yan, Patrick Baldasare: Beyond classification and ranking: constrained optimization of the ROI. 948-953
Panel
Gregory Piatetsky-Shapiro, Robert Grossman, Chabane Djeraba, Ronen Feldman, Lise Getoor, Mohammed Javeed Zaki: Is there a grand challenge or X-prize for data mining? 954-956



