13. KDD 2007: San Jose, California, USA
Pavel Berkhin, Rich Caruana, Xindong Wu (Eds.): Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, California, USA, August 12-15, 2007. ACM 2007 ISBN 978-1-59593-609-7
Chris Anderson: Calculating latent demand in the long tail. 1
Usama M. Fayyad: From mining the web to inventing the new sciences underlying the internet. 2-3
Jon M. Kleinberg: Challenges in mining social network data: processes, privacy, and paradoxes. 4-5
Research track papers
Deepak Agarwal, Dhiman Barman, Dimitrios Gunopulos, Neal E. Young, Flip Korn, Divesh Srivastava: Efficient and effective explanation of change in hierarchical summaries. 6-15
Deepak Agarwal, Andrei Z. Broder, Deepayan Chakrabarti, Dejan Diklic, Vanja Josifovski, Mayssam Sayyadian: Estimating rates of rare events at multiple resolutions. 16-25
Deepak Agarwal, Srujana Merugu: Predictive discrete latent factor models for large scale dyadic data. 26-35
Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua Feng, Mohammed Javeed Zaki: Xproj: a framework for projected structural clustering of XML documents. 46-55
Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis: Show me the money!: deriving the pricing power of product features by mining consumer reviews. 56-65


Robert M. Bell, Yehuda Koren, Chris Volinsky: Modeling relationships at multiple scales to improve accuracy of large recommender systems. 95-104
Deepavali Bhagwat, Kave Eshghi, Pankaj Mehra: Content-based document routing and index partitioning for scalable similarity-based searches in a large corpus. 105-112
Wanpracha Art Chaovalitwongse, Ya-Ju Fan, Rajesh C. Sachdeo: Support feature machine for classification of abnormal brain activity. 113-122
Jianhui Chen, Zheng Zhao, Jieping Ye, Huan Liu: Nonlinear adaptive distance metric learning for clustering. 123-132
Peter A. Chew, Brett W. Bader, Tamara G. Kolda, Ahmed Abdelali: Cross-language information retrieval using PARAFAC2. 143-152
Yun Chi, Xiaodan Song, Dengyong Zhou, Koji Hino, Belle L. Tseng: Evolutionary spectral clustering by incorporating temporal smoothness. 153-162
Yun Chi, Shenghuo Zhu, Xiaodan Song, Jun'ichi Tatemura, Belle L. Tseng: Structural and temporal analysis of the blogosphere through community factorization. 163-172
Sumit Chopra, Trivikraman Thampy, John Leahy, Andrew Caplin, Yann LeCun: Discovering the hidden structure of house prices with a non-parametric latent manifold model. 173-182
Daniel Crabtree, Peter Andreae, Xiaoying Gao: Exploiting underrepresented query aspects for automatic query expansion. 191-200
Aron Culotta, Michael L. Wick, Robert Hall, Matthew Marzilli, Andrew McCallum: Canonicalization of database records using adaptive similarity measures. 201-209
Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu: Co-clustering based classification for out-of-domain documents. 210-219
Anirban Dasgupta, Petros Drineas, Boulos Harb, Vanja Josifovski, Michael W. Mahoney: Feature selection methods for text classification. 230-239
Meghana Deodhar, Joydeep Ghosh: A framework for simultaneous co-clustering and learning from complex data. 250-259
Chris H. Q. Ding, Rong Jin, Tao Li, Horst D. Simon: A learning framework using Green's function and kernel regularization with application to recommender system. 260-269
Dejing Dou, Gwen A. Frishkoff, Jiawei Rong, Robert M. Frank, Allen D. Malony, Don M. Tucker: Development of NeuroElectroMagnetic ontologies (NEMO): a framework for mining brainwave ontologies. 270-279
Gregory Druck, Chris Pal, Andrew McCallum, Xiaojin Zhu: Semi-supervised classification with hybrid generative/discriminative methods. 280-289
Lisa Friedland, David Jensen: Finding tribes: identifying close-knit individuals from employment patterns. 290-299
Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Huan Liu, Philip S. Yu: Time-dependent event hierarchy construction. 300-309
Byron J. Gao, Martin Ester, Jin-yi Cai, Oliver Schulte, Hui Xiong: The minimum consistent subset cover problem and its applications in data mining. 310-319

Zhen Guo, Zhongfei Zhang, Eric P. Xing, Christos Faloutsos: Enhanced max margin learning on multimodal data mining in a multimedia database. 340-349
Hannes Heikinheimo, Jouni K. Seppänen, Eino Hinkkanen, Heikki Mannila, Taneli Mielikäinen: Finding low-entropy sets and trees from binary data. 350-359
Frizo A. L. Janssens, Wolfgang Glänzel, Bart De Moor: Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis. 360-369
Yookyung Jo, Carl Lagoze, C. Lee Giles: Detecting research topics via the correlation between graphs and texts. 370-379
Panagiotis Karras, Dimitris Sacharidis, Nikos Mamoulis: Exploiting duality in summarization with deterministic guarantees. 380-389

Srivatsan Laxman, P. S. Sastry, K. P. Unnikrishnan: A fast algorithm for finding frequent episodes in event streams. 410-419
Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne M. VanBriesen, Natalie S. Glance: Cost-effective outbreak detection in networks. 420-429
Jinyan Li, Guimei Liu, Limsoon Wong: Mining statistically important equivalence classes and delta-discriminative emerging patterns. 430-439
Ping Li: Very sparse stable random projections for dimension reduction in l_alpha (0 < alpha <= 2) norm. 440-449
David Lo, Siau-Cheng Khoo, Chao Liu: Efficient mining of iterative patterns for software specification discovery. 460-469
Bo Long, Zhongfei (Mark) Zhang, Philip S. Yu: A probabilistic framework for relational clustering. 470-479


Flavia Moser, Rong Ge, Martin Ester: Joint cluster analysis of attribute and relationship data without a-priori specification of the number of clusters. 510-519

Gaurav Pandey, Michael Steinbach, Rohit Gupta, Tushar Garg, Vipin Kumar: Association analysis-based transformations for protein interaction networks: a function prediction case study. 540-549
Seung-Taek Park, David M. Pennock: Applying collaborative filtering techniques to movie search for better ranking and browsing. 550-559
Raymond K. Pon, Alfonso F. Cardenas, David Buttler, Terence Critchlow: Tracking multiple topics for finding interesting articles. 560-569
Filip Radlinski, Thorsten Joachims: Active exploration for learning rankings from clickthrough data. 570-579
Mark Sandler: Hierarchical mixture models: a probabilistic analysis. 580-589
Issei Sato, Hiroshi Nakagawa: Knowledge discovery of multiple-topic document using parametric mixture model with Dirichlet prior. 590-598
Vincent Schickel-Zuber, Boi Faltings: Using hierarchical clustering for learning the ontologies used in recommendation systems. 599-608
D. Sculley: Practical learning from one-sided feedback. 609-618
Benyah Shaparenko, Thorsten Joachims: Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases. 619-628
Shady Shehata, Fakhri Karray, Mohamed Kamel: A concept-based model for enhancing text categorization. 629-637
Motoki Shiga, Ichigaku Takigawa, Hiroshi Mamitsuka: A spectral clustering approach to optimally combining numerical vectors with a modular network. 647-656
Xiuyao Song, Mingxi Wu, Christopher M. Jermaine, Sanjay Ranka: Statistical change detection for multi-dimensional data. 667-676
Rohini K. Srihari, Li Xu, Tushar Saxena: Use of ranked cross document evidence trails for hypothesis generation. 677-686
Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, Philip S. Yu: GraphScope: parameter-free mining of large time-evolving graphs. 687-696
Gaurav Tandon, Philip K. Chan: Weighting versus pruning in rule validation for detecting network and host anomalies. 697-706
Chayant Tantipathananandh, Tanya Y. Berger-Wolf, David Kempe: A framework for community identification in dynamic social networks. 717-726
Choon Hui Teo, Alex J. Smola, S. V. N. Vishwanathan, Quoc V. Le: A scalable modular convex solver for regularized risk minimization. 727-736
Hanghang Tong, Christos Faloutsos, Brian Gallagher, Tina Eliassi-Rad: Fast best-effort pattern matching in large attributed graphs. 737-746
Hanghang Tong, Christos Faloutsos, Yehuda Koren: Fast direction-aware proximity for graph mining. 747-756
David S. Vogel, Ognian Asparouhov, Tobias Scheffer: Scalable look-ahead linear regression trees. 757-764
Li Wan, Wee Keong Ng, Shuguo Han, Vincent C. S. Lee: Privacy-preservation for gradient descent methods. 775-783
Xuanhui Wang, ChengXiang Zhai, Xiao Hu, Richard Sproat: Mining correlated bursty topic patterns from coordinated text streams. 784-793
Xuerui Wang, Chris Pal, Andrew McCallum: Generalized component analysis for text with heterogeneous attributes. 794-803

Xiaowei Xu, Nurcan Yuruk, Zhidan Feng, Thomas A. J. Schweiger: SCAN: a structural clustering algorithm for networks. 824-833
Rong Yan, Jelena Tesic, John R. Smith: Model-shared subspace boosting for multi-label classification. 834-843
Dragomir Yankov, Eamonn J. Keogh, Jose Medina, Bill Yuan-chi Chiu, Victor B. Zordan: Detecting time series motifs under uniform scaling. 844-853
Jieping Ye, Shuiwang Ji, Jianhui Chen: Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming. 854-863
Junsong Yuan, Ying Wu, Ming Yang: From frequent itemsets to semantically meaningful visual patterns. 864-873
Xian Zhang, Yu Hao, Xiaoyan Zhu, Ming Li, David R. Cheriton: Information distance from a question to an answer. 874-883
Hongkun Zhao, Weiyi Meng, Clement T. Yu: Mining templates from search result records of search engines. 884-893
Shuyi Zheng, Ruihua Song, Ji-Rong Wen, Di Wu: Joint optimization of wrapper generation and template detection. 894-902
Jun Zhu, Bo Zhang, Zaiqing Nie, Ji-Rong Wen, Hsiao-Wuen Hon: Webpage understanding: an integrated approach. 903-912
Industrial and government track papers
Sitaram Asur, Srinivasan Parthasarathy, Duygu Ucar: An event-based framework for characterizing the evolutionary behavior of interaction graphs. 913-921
Rebecca Castaño, Kiri Wagstaff, Steve A. Chien, Timothy M. Stough, Benyang Tang: On-board analysis of uncalibrated data for a spacecraft at Mars. 922-930
Andrew S. Fast, Lisa Friedland, Marc E. Maier, Brian J. Taylor, David Jensen, Henry G. Goldberg, John Komoroske: Relational data pre-processing techniques for improved securities fraud detection. 941-949
Ron Kohavi, Randal M. Henne, Dan Sommerfield: Practical guide to controlled experiments on the web: listen to your customers not to the hippo. 959-967
Ping Luo, Hui Xiong, Kevin Lü, Zhongzhi Shi: Distributed classification in peer-to-peer networks. 968-976
Claudia Perlich, Saharon Rosset, Richard D. Lawrence, Bianca Zadrozny: High-quantile modeling for customer wallet estimation and other applications. 977-985
Jun Hua Zhao, Zhao Yang Dong, Pei Zhang: Mining complex power networks for blackout prevention. 986-994
Guangyu Zhu, Timothy J. Bethea, Vikas Krishna: Extracting relevant named entities for automated expense reimbursement. 1004-1012
Industrial and government track short papers
Charu C. Aggarwal: A framework for classification and segmentation of massive audio data streams. 1013-1017
Chris Curry, Robert L. Grossman, David Locke, Steve Vejcik, Joseph Bugajski: Detecting changes in large data sets of payment card data: a case study. 1018-1022
Rong Pan, Junhui Zhao, Vincent Wenchen Zheng, Jeffrey Junfeng Pan, Dou Shen, Sinno Jialin Pan, Qiang Yang: Domain-constrained semi-supervised mining of tracking models in sensor networks. 1023-1027
R. Bharat Rao, Jinbo Bi, Glenn Fung, Marcos Salganicoff, Nancy Obuchowski, David P. Naidich: LungCAD: a clinically approved, machine learning system for lung cancer detection. 1033-1037

Xiaoxin Yin, Jiawei Han, Philip S. Yu: Truth discovery with multiple conflicting information providers on the web. 1048-1052
Panel
Srinivasan Parthasarathy: Data mining at the crossroads: successes, failures and learning from them. 1053-1055



