11. KDD 2005: Chicago, Illinois, USA
Robert Grossman, Roberto J. Bayardo, Kristin P. Bennett (Eds.): Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, Illinois, USA, August 21-24, 2005. ACM 2005 ISBN 1-59593-135-X
Invited Talks
Prabhakar Raghavan: Incentive networks. 1
Gian Fulgoni: Mining the internet: the eighth wonder of the world. 2
Albert-László Barabási: The architecture of complexity: the structure and the dynamics of networks, from the web to the cell. 3
Research Track Papers
Rong Chen, Edward Herskovits: A Bayesian network classifier with inverse tree structure for voxelwise magnetic resonance image analysis. 4-12
Anirban Dasgupta, Ravi Kumar, Prabhakar Raghavan, Andrew Tomkins: Variable latent semantic indexing. 13-21
Jianping Fan, Hangzai Luo, Mohand-Said Hacid: Mining images on semantics via statistical learning. 22-31
Glenn Fung, Sathyakama Sandilya, R. Bharat Rao: Rule extraction from linear support vector machines. 32-40
Bin Gao, Tie-Yan Liu, Xin Zheng, QianSheng Cheng, Wei-Ying Ma: Consistent bipartite graph co-partitioning for star-structured high-order heterogeneous data co-clustering. 41-50
Aristides Gionis, Alexander Hinneburg, Spiros Papadimitriou, Panayiotis Tsaparas: Dimension induced clustering. 51-60

Daniel Gruhl, Ramanathan V. Guha, Ravi Kumar, Jasmine Novak, Andrew Tomkins: The predictive power of online chatter. 78-87

Aleks Jakulin, Martin Mozina, Janez Demsar, Ivan Bratko, Blaz Zupan: Nomograms for visualizing support vector machines. 108-117
Szymon Jaroszewicz, Tobias Scheffer: Fast discovery of unexpected patterns in data, relative to a Bayesian network. 118-127
Aleksander Kolcz: Local sparsity control for naive Bayes with extreme misclassification costs. 128-137
Jeremy Kubica, Andrew W. Moore, Andrew Connolly, Robert Jedicke: A multiple tree algorithm for the efficient association of asteroid observations. 138-146

Gregor Leban, Minca Mramor, Ivan Bratko, Blaz Zupan: Simple and effective visual models for gene expression cancer diagnostics. 167-176
Jure Leskovec, Jon M. Kleinberg, Christos Faloutsos: Graphs over time: densification laws, shrinking diameters and possible explanations. 177-187
Tao Li: A general model for clustering binary data. 188-197
Qiaozhu Mei, ChengXiang Zhai: Discovering evolutionary theme patterns from text: an exploration of temporal text mining. 198-207
Srujana Merugu, Joydeep Ghosh: A distributed learning framework for heterogeneous data sources. 208-217
Daniel B. Neill, Andrew W. Moore, Maheshkumar Sabhnani, Kenny Daniel: Detection of emerging space-time clusters. 218-227

Saharon Rosset: Robust boosting and its relation to bagging. 249-255
Mark Sandler: On the use of linear programming for unsupervised text classification. 256-264
Martin Scholz: Sampling-based sequential subgroup mining. 265-274
Antti Ukkonen, Mikael Fortelius, Heikki Mannila: Finding partial orders from unordered 0-1 data. 285-293
Muyuan Wang, Zhiwei Li, Lie Lu, Wei-Ying Ma, Naiyao Zhang: Web object indexing using domain knowledge. 294-303
Xuan Hieu Phan, Minh Le Nguyen, Tu Bao Ho, Susumu Horiguchi: Improving discriminative sequential learning with rare-but-important associations. 304-313
Xifeng Yan, Hong Cheng, Jiawei Han, Dong Xin: Summarizing itemset patterns: a profile-based approach. 314-323
Xifeng Yan, Xianghong Jasmine Zhou, Jiawei Han: Mining closed relational graphs with connectivity constraints. 324-333

Hwanjo Yu: SVM selective sampling for ranking with application to data retrieval. 354-363
Nan Zhang, Shengquan Wang, Wei Zhao: A new scheme on privacy-preserving data classification. 374-383
Jing Zhou, Dean P. Foster, Robert A. Stine, Lyle H. Ungar: Streaming feature selection using alpha-investing. 384-393
Industry/Government Track Papers
George Forman, Kave Eshghi, Stephane Chiocchetti: Finding similar files in large document repositories. 394-400
Ryohei Fujimaki, Takehisa Yairi, Kazuo Machida: An approach to spacecraft anomaly detection problem using kernel feature space. 401-410
Rayid Ghani: Price prediction and insurance for online auctions. 411-418
Natalie S. Glance, Matthew Hurst, Kamal Nigam, Matthew Siegler, Robert Stockton, Takashi Tomokiyo: Deriving marketing intelligence from online discussion. 419-428
Bin He, Kevin Chen-Chuan Chang: Making holistic schema matching robust: an ensemble approach. 429-438
Olfa Nasraoui, Cesar Cardona, Carlos Rojas: Using retrieval measures to assess similarity in mining dynamic web clickstreams. 439-448
Jennifer Neville, Özgür Simsek, David Jensen, John Komoroske, Kelly Palmer, Henry G. Goldberg: Using relational knowledge discovery to prevent securities fraud. 449-458
G. Niklas Norén, Roland Orre, Andrew Bate: A hit-miss model for duplicate detection in the WHO drug safety database. 459-468
Bhavani Raskutti, Alan Herschtal: Predicting the product purchase patterns of corporate customers. 469-478
Xiaodan Song, Ching-Yung Lin, Belle L. Tseng, Ming-Ting Sun: Modeling and predicting personal information dissemination behavior. 479-488

Lian Yan, Michael Fassino, Patrick Baldasare: Enhancing the lift under budget constraints: an application in the mutual fund industry. 509-515
Research Track Posters
Charu C. Aggarwal: Towards exploratory test instance specific algorithms for high dimensional classification. 526-531
Arindam Banerjee, Chase Krumpelman, Joydeep Ghosh, Sugato Basu, Raymond J. Mooney: Model-based overlapping clustering. 532-537
Christopher Besemann, Anne Denton: Integration of profile hidden Markov model output into association rule mining. 538-543
Giuseppe Carenini, Raymond T. Ng, Xiaodong Zhou: Scalable discovery of hidden emails from large folders. 544-549
Chien Chin Chen, Meng Chang Chen, Ming-Syan Chen: LIPED: HMM-based life profiles for adaptive event detection. 556-561
Andrew S. Fast, David Jensen, Brian Neil Levine: Creating social networks to improve peer-to-peer networking. 568-573

Takahiko Ito, Masashi Shimbo, Taku Kudo, Yuji Matsumoto: Application of kernels to link analysis. 586-592
Geetha Jagannathan, Rebecca N. Wright: Privacy-preserving distributed k-means clustering over arbitrarily partitioned data. 593-599
Ruoming Jin, Kaushik Sinha, Gagan Agrawal: Simultaneous optimization of complex mining tasks with a knowledgeable cache. 600-605
Ruoming Jin, Chao Wang, Dmitrii Polshakov, Srinivasan Parthasarathy, Gagan Agrawal: Discovering frequent topological structures from graph datasets. 606-611
Xin Jin, Yanzan Zhou, Bamshad Mobasher: A maximum entropy web recommendation system: combining collaborative and content features. 612-617
Noriaki Kawamae, Katsumi Takahashi: Information retrieval based on collaborative filtering with latent interest semantic map. 618-623
Moshe Koppel, Jonathan Schler, Kfir Zigdon: Determining an author's native language by mining a text for errors. 624-628
Inderjit S. Dhillon, Yuqiang Guan, Brian Kulis: A fast kernel-based multilevel algorithm for graph clustering. 629-634

Sandeep Mane, Jaideep Srivastava, San-Yih Hwang: Estimating missed actual positives using independent classifiers. 648-653
Michinari Momma: Efficient computations via scalable sparse kernel partial least squares and boosted latent features. 654-659
Fabian Mörchen, Alfred Ultsch: Optimizing time series discretization for knowledge discovery. 660-665
Satoshi Morinaga, Hiroki Arimura, Takahiro Ikeda, Yosuke Sakao, Susumu Akamine: Key semantics extraction by dependency tree mining. 666-671
Ellen Spertus, Mehran Sahami, Orkut Buyukkokten: Evaluating similarity measures: a large-scale study in the orkut social network. 678-684
Mihai Surdeanu, Jordi Turmo, Alicia Ageno: A hybrid unsupervised approach for document clustering. 685-690
Tao Tao, ChengXiang Zhai: Mining comparable bilingual text corpora for cross-language information integration. 691-696
Luís Torgo: Regression error characteristic surfaces. 697-702
Gang Wu, Edward Y. Chang, Navneet Panda: Formulating distance functions via the kernel trick. 703-709
Ying Yang, Xindong Wu, Xingquan Zhu: Combining proactive and reactive predictions for data streams. 710-715
Hui Yang, Srinivasan Parthasarathy, Sameep Mehta: A generalized framework for mining spatio-temporal patterns in scientific data. 716-721
Li Yang: Building connected neighborhood graphs for isometric data embedding. 722-728
Mohammed Javeed Zaki, Markus Peters, Ira Assent, Thomas Seidl: CLICKS: an effective algorithm for mining subspace clusters in categorical datasets. 736-742
Richard Cole, Dennis Shasha, Xiaojian Zhao: Fast window correlations over uncooperative time series. 743-749
Industry/Government Track Posters
Haifeng Chen, Guofei Jiang, Cristian Ungureanu, Kenji Yoshihira: Failure detection and localization in component based systems by online tracking. 750-755
Daniel R. Jeske, Behrokh Samadi, Pengyue J. Lin, Lan Ye, Sean Cox, Rui Xiao, Ted Younglove, Minh Ly, Douglas Holt, Ryan Rich: Generation of synthetic data sets for evaluating the accuracy of knowledge discovery systems. 756-762
Jiuyong Li, Ada Wai-Chee Fu, Hongxing He, Jie Chen, Huidong Jin, Damien McAullay, Graham J. Williams, Ross Sparks, Chris Kelman: Mining risk patterns in medical data. 770-775
Tao Li, Feng Liang, Sheng Ma, Wei Peng: An integrated framework on mining logs files for computing system management. 776-781
Xiang Li, Rahul Ramachandran, Sara J. Graves, Sunil Movva, Bilahari Akkiraju, David Emmitt, Steven Greco, Robert Atlas, Joseph Terry, Juan-Carlos Jusem: Automated detection of frontal systems from numerical model-generated data. 782-787
Ronald K. Pearson, Robert J. Kingan, Alan Hochberg: Disease progression modeling from historical clinical databases. 788-793
Valery A. Petrushin: Mining rare and frequent events in multi-camera surveillance video using self-organizing maps. 794-800
Rob Powers, Moisés Goldszmidt, Ira Cohen: Short term performance forecasting in enterprise systems. 801-807
Kaushal Sanghai, Ting Su, Jennifer G. Dy, David R. Kaeli: A multinomial clustering model for fast simulation of computer architecture designs. 808-813