15. KDD 2009:
Paris, France
John F. Elder IV, Françoise Fogelman-Soulié, Peter A. Flach, Mohammed Javeed Zaki (Eds.):
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28 - July 1, 2009.
ACM 2009, ISBN 978-1-60558-495-9
Keynote talks
- David J. Hand:
Mismatched models, wrong results, and dreadful decisions: on choosing appropriate data mining tools.
1-2

- Ravi Kumar:
Mining web logs: applications and challenges.
3-4

- Heikki Mannila:
Randomization methods in data mining.
5-6

- Ashok N. Srivastava:
Data mining at NASA: from theory to applications.
7-8

- Stanley Wasserman:
Network science: an introduction to recent statistical approaches.
9-10

Panel
Research track papers
- Deepak Agarwal, Bee-Chung Chen:
Regression-based latent factor models.
19-28

- Charu C. Aggarwal, Yan Li, Jianyong Wang, Jing Wang:
Frequent pattern mining with uncertain data.
29-38

- Amr Ahmed, Eric P. Xing, William W. Cohen, Robert F. Murphy:
Structured correspondence topic models for mining captioned figures in biological literature.
39-48

- Anurag Ambekar, Charles B. Ward, Jahangir Mohammed, Swapna Male, Steven Skiena:
Name-ethnicity classification from open sources.
49-58

- Shin Ando, Einoshin Suzuki:
Detection of unique temporal segments by information theoretic meta-clustering.
59-68

- Mafruz Zaman Ashrafi, See-Kiong Ng:
Collusion-resistant anonymous data collection method.
69-78

- Sitaram Asur, Srinivasan Parthasarathy:
A viewpoint-based approach for interaction graph analysis.
79-88

- Lars Backstrom, Jon M. Kleinberg, Ravi Kumar:
Optimizing web traffic via the media scheduling problem.
89-98

- Ron Bekkerman, Martin Scholz, Krishnamurthy Viswanathan:
Improving clustering stability with combinatorial MRFs.
99-108

- Michele Berlingerio, Fabio Pinelli, Mirco Nanni, Fosca Giannotti:
Temporal mining for interactive workflow data analysis.
109-118

- Thomas Bernecker, Hans-Peter Kriegel, Matthias Renz, Florian Verhein, Andreas Züfle:
Probabilistic frequent itemset mining in uncertain databases.
119-128

- Alina Beygelzimer, John Langford:
The offset tree for learning with partial labels.
129-138

- Albert Bifet, Geoffrey Holmes, Bernhard Pfahringer, Richard Kirkby, Ricard Gavaldà:
New ensemble methods for evolving data streams.
139-148

- Christian Böhm, Katrin Haegler, Nikola S. Müller, Claudia Plant:
CoCo: coding cost for parameter-free outlier detection.
149-158

- Yingyi Bu, Lei Chen, Ada Wai-Chee Fu, Dawei Liu:
Efficient anomaly monitoring over moving object trajectory streams.
159-168

- Jonathan Chang, Jordan L. Boyd-Graber, David M. Blei:
Connections between the lines: augmenting social networks with text.
169-178

- Bo Chen, Wai Lam, Ivor W. Tsang, Tak-Lam Wong:
Extracting discriminative concepts for domain adaptation in text mining.
179-188

- Minmin Chen, Yixin Chen, Michael R. Brent, Aaron E. Tenney:
Constrained optimization for validation-guided conditional random field learning.
189-198

- Wei Chen, Yajun Wang, Siyu Yang:
Efficient influence maximization in social networks.
199-208

- Ye Chen, Dmitry Pavlov, John F. Canny:
Large-scale behavioral targeting.
209-218

- Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher, Alessandro Panconesi, Prabhakar Raghavan:
On compressing social networks.
219-228

- Erick Delage:
Regret-based online ranking for a growing digital library.
229-238

- Hongbo Deng, Michael R. Lyu, Irwin King:
A generalized Co-HITS algorithm and its application to bipartite graphs.
239-248

- Meghana Deodhar, Joydeep Ghosh:
Mining for the most certain predictions from dyadic data.
249-258

- Pinar Donmez, Jaime G. Carbonell, Jeff G. Schneider:
Efficiently learning the accuracy of labeling sources for selective sampling.
259-268

- Nan Du, Christos Faloutsos, Bai Wang, Leman Akoglu:
Large human communication networks: patterns and a utility-driven generator.
269-278

- Murat Dundar, E. Daniel Hirleman, Arun K. Bhunia, J. Paul Robinson, Bartek Rajwa:
Learning with a non-exhaustive training dataset: a case study: detection of bacteria cultures using optical-scattering technology.
279-288

- Khalid El-Arini, Gaurav Veda, Dafna Shahaf, Carlos Guestrin:
Turning down the noise in the blogosphere.
289-298

- George Forman, Martin Scholz, Shyamsundar Rajaram:
Feature shaping for linear SVM classifiers.
299-308

- Richard Frank, Martin Ester, Arno J. Knobbe:
A multi-relational approach to spatial classification.
309-318

- Antonino Freno, Edmondo Trentin, Marco Gori:
Scalable pseudo-likelihood estimation in hybrid random fields.
319-328

- João Gama, Raquel Sebastião, Pedro Pereira Rodrigues:
Issues in evaluation of stream learning algorithms.
329-338

- Jing Gao, Wei Fan, Yizhou Sun, Jiawei Han:
Heterogeneous source consensus learning via decision propagation and negotiation.
339-348

- Yong Ge, Hui Xiong, Wenjun Zhou, Ramendra K. Sahoo, Xiaofeng Gao, Weili Wu:
Multi-focal learning and its application to customer service support.
349-358

- Quanquan Gu, Jie Zhou:
Co-clustering on manifolds.
359-368

- Lei Guo, Enhua Tan, Songqing Chen, Xiaodong Zhang, Yihong Eric Zhao:
Analyzing patterns of user content generation in online social networks.
369-378

- Sami Hanhijärvi, Markus Ojala, Niko Vuokko, Kai Puolamäki, Nikolaj Tatti, Heikki Mannila:
Tell me something I don't know: randomization strategies for iterative data mining.
379-388

- Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, Xiaohua Zhou:
Exploiting Wikipedia as external knowledge for document clustering.
389-396

- Mohsen Jamali, Martin Ester:
TrustWalker: a random walk model for combining trust-based and item-based recommendation.
397-406

- Shuiwang Ji, Lei Yuan, Ying-Xin Li, Zhi-Hua Zhou, Sudhir Kumar, Jieping Ye:
Drosophila gene expression pattern annotation using sparse features and term-term interactions.
407-416

- Ruoming Jin, Yang Xiang, Lin Liu:
Cartesian contour: a concise representation for a collection of frequent sets.
417-426

- Aleksander Kolcz, Gordon V. Cormack:
Genre-based decomposition of email class noise.
427-436

- Arne Koopman, Arno Siebes:
Characteristic relational patterns.
437-446

- Yehuda Koren:
Collaborative filtering with temporal dynamics.
447-456

- Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, Soumen Chakrabarti:
Collective annotation of Wikipedia entities in web text.
457-466

- Theodoros Lappas, Kun Liu, Evimaria Terzi:
Finding a team of experts in social networks.
467-476

- Theodoros Lappas, Benjamin Arai, Manolis Platakis, Dimitrios Kotsakos, Dimitrios Gunopulos:
On burstiness-aware search for document sequences.
477-486

- Mark Last:
Improving data mining utility with projective sampling.
487-496

- Jure Leskovec, Lars Backstrom, Jon M. Kleinberg:
Meme-tracking and the dynamics of the news cycle.
497-506

- Lei Li, James McCann, Nancy S. Pollard, Christos Faloutsos:
DynaMMo: mining and summarization of coevolving sequences with missing values.
507-516

- Tiancheng Li, Ninghui Li:
On the tradeoff between privacy and utility in data publishing.
517-526

- Yu-Ru Lin, Jimeng Sun, Paul Castro, Ravi B. Konuru, Hari Sundaram, Aisling Kelliher:
MetaFac: community discovery via relational hypergraph factorization.
527-536

- Chao Liu, Fan Guo, Christos Faloutsos:
BBM: Bayesian browsing model from petabyte-scale data.
537-546

- Jun Liu, Jianhui Chen, Jieping Ye:
Large-scale sparse logistic regression.
547-556

- David Lo, Hong Cheng, Jiawei Han, Siau-Cheng Khoo, Chengnian Sun:
Classification of software behaviors for failure detection: a discriminative pattern mining approach.
557-566

- Steven Loscalzo, Lei Yu, Chris H. Q. Ding:
Consensus group stable feature selection.
567-576

- Aurelie C. Lozano, Naoki Abe, Yan Liu, Saharon Rosset:
Grouped graphical Granger modeling methods for temporal causal modeling.
577-586

- Aurelie C. Lozano, Hongfei Li, Alexandru Niculescu-Mizil, Yan Liu, Claudia Perlich, Jonathan R. M. Hosking, Naoki Abe:
Spatial-temporal causal modeling for climate change attribution.
587-596

- Sofus A. Macskassy:
Using graph-based metrics with empirical risk minimization to speed up active learning on networked data.
597-606

- R. Dean Malmgren, Jake M. Hofman, Luis A. Nunes Amaral, Duncan J. Watts:
Characterizing individual communication patterns.
607-616

- Andreas Maunz, Christoph Helma, Stefan Kramer:
Large-scale graph mining using backbone refinement classes.
617-626

- Frank McSherry, Ilya Mironov:
Differentially private recommender systems: building privacy into the Netflix Prize contenders.
627-636

- Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti:
WhereNext: a location predictor on trajectory pattern mining.
637-646

- Siegfried Nijssen, Tias Guns, Luc De Raedt:
Correlated itemset mining in ROC space: a constraint programming approach.
647-656

- Kensuke Onuma, Hanghang Tong, Christos Faloutsos:
TANGENT: a novel, 'Surprise me', recommendation algorithm.
657-666

- Rong Pan, Martin Scholz:
Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering.
667-676

- Gaurav Pandey, Gowtham Atluri, Michael Steinbach, Chad L. Myers, Vipin Kumar:
An association analysis approach to biclustering.
677-686

- Ardian Kristanto Poernomo, Vivekanand Gopalkrishnan:
CP-summary: a concise representation for browsing frequent itemsets.
687-696

- Ardian Kristanto Poernomo, Vivekanand Gopalkrishnan:
Towards efficient mining of proportional fault-tolerant frequent itemsets.
697-706

- Foster J. Provost, Brian Dalessandro, Rod Hook, Xiaohan Zhang, Alan Murray:
Audience selection for on-line brand advertising: privacy-friendly social network targeting.
707-716

- Zijie Qi, Ian Davidson:
A principled and flexible framework for finding alternative clusterings.
717-726

- Steffen Rendle, Leandro Balby Marinho, Alexandros Nanopoulos, Lars Schmidt-Thieme:
Learning optimal ranking with tensor factorization for tag recommendation.
727-736

- Venu Satuluri, Srinivasan Parthasarathy:
Scalable graph clustering using stochastic flows: applications to community discovery.
737-746

- Jerry Scripps, Pang-Ning Tan, Abdol-Hossein Esfahanian:
Measuring the effects of preprocessing decisions and network forces in dynamic network analysis.
747-756

- Bao-Hong Shen, Shuiwang Ji, Jieping Ye:
Mining discrete patterns via binary matrix factorization.
757-766

- Lei Shi, Vandana Pursnani Janeja:
Anomalous window discovery through scan statistics for linear intersecting paths (SSLIP).
767-776

- Xiaolin Shi, Jun Zhu, Rui Cai, Lei Zhang:
User grouping behavior in online forums.
777-786

- Takashi Shibuya, Tatsuya Harada, Yasuo Kuniyoshi:
Causality quantification and its applications: structuring and modeling of multivariate time series.
787-796

- Yizhou Sun, Yintao Yu, Jiawei Han:
Ranking-based clustering of heterogeneous information networks with star network schema.
797-806

- Jie Tang, Jimeng Sun, Chi Wang, Zi Yang:
Social influence analysis in large-scale networks.
807-816

- Lei Tang, Huan Liu:
Relational learning via latent social dimensions.
817-826

- Chayant Tantipathananandh, Tanya Y. Berger-Wolf:
Constant-factor approximation algorithms for identifying dynamic communities.
827-836

- Charalampos E. Tsourakakis, U. Kang, Gary L. Miller, Christos Faloutsos:
DOULION: counting triangles in massive graphs with a coin.
837-846

- Pavan Vatturi, Weng-Keen Wong:
Category detection using hierarchical mean shift.
847-856

- Ting Wang, Mudhakar Srivatsa, Dakshi Agrawal, Ling Liu:
Learning, indexing, and diagnosing network faults.
857-866

- Xuanhui Wang, Deepayan Chakrabarti, Kunal Punera:
Mining broad latent query aspects from search sessions.
867-876

- Junjie Wu, Hui Xiong, Jian Chen:
Adapting the right measures for K-means clustering.
877-886

- Mingxi Wu, Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums:
A LRT framework for fast spatial anomaly detection.
887-896

- Jack Chongjie Xue, Gary M. Weiss:
Quantification and semi-supervised classification methods for handling changes in class distribution.
897-906

- Donghui Yan, Ling Huang, Michael I. Jordan:
Fast approximate spectral clustering.
907-916

- Bishan Yang, Jian-Tao Sun, Tengjiao Wang, Zheng Chen:
Effective multi-label active learning for text classification.
917-926

- Tianbao Yang, Rong Jin, Yun Chi, Shenghuo Zhu:
Combining link and content for community detection: a discriminative approach.
927-936

- Limin Yao, David M. Mimno, Andrew McCallum:
Efficient methods for topic model inference on streaming document collections.
937-946

- Lexiang Ye, Eamonn J. Keogh:
Time series shapelets: a new primitive for data mining.
947-956

- Zhijun Yin, Rui Li, Qiaozhu Mei, Jiawei Han:
Exploring social tagging graph for web object classification.
957-966

- Shinjae Yoo, Yiming Yang, Frank Lin, Il-Chul Moon:
Mining social networks for personalized email prioritization.
967-976

- Chang Hun You, Lawrence B. Holder, Diane J. Cook:
Learning patterns in the dynamics of biological networks.
977-986

- Xiangliang Zhang, Cyril Furtlehner, Julien Perez, Cécile Germain-Renaud, Michèle Sebag:
Toward autonomic grids: analyzing the job flow with affinity streaming.
987-996

- Yuzhou Zhang, Jianyong Wang, Yi Wang, Lizhu Zhou:
Parallel community detection on large networks with propinquity dynamics.
997-1006

- Elena Zheleva, Hossam Sharara, Lise Getoor:
Co-evolution of social and affiliation networks.
1007-1016

- Lei Zheng, Shaojun Wang, Yan Liu, Chi-Hoon Lee:
Information theoretic regularization for semi-supervised boosting.
1017-1026

- ErHeng Zhong, Wei Fan, Jing Peng, Kun Zhang, Jiangtao Ren, Deepak S. Turaga, Olivier Verscheure:
Cross domain distribution adaptation via kernel mapping.
1027-1036

- Guangyu Zhu, Gilad Mishne:
Mining rich session context to improve web search.
1037-1046

- Jun Zhu, Eric P. Xing, Bo Zhang:
Primal sparse Max-margin Markov networks.
1047-1056

- Qiang Zhu, Xiaoyue Wang, Eamonn J. Keogh, Sang-Hee Lee:
Augmenting the generalized Hough transform to enable the mining of petroglyphs.
1057-1066

Industrial track papers
- Josh Attenberg, Sandeep Pandey, Torsten Suel:
Modeling and predicting user behavior in sponsored search.
1067-1076

- Indrajit Bhattacharya, Shantanu Godbole, Ajay Gupta, Ashish Verma, Jeff Achtermann, Kevin English:
Enabling analysts in managed services for CRM analytics.
1077-1086

- Ludmila Cherkasova, Kave Eshghi, Charles B. Morrey III, Joseph Tucek, Alistair C. Veitch:
Applying syntactic similarity algorithms for enterprise information management.
1087-1096

- Wei Chu, Seung-Taek Park, Todd Beaupre, Nitin Motgi, Amit Phadke, Seinjuti Chakraborty, Joe Zachariah:
A case study of behavior-driven conjoint analysis on Yahoo!: front page today module.
1097-1104

- Thomas Crook, Brian Frasca, Ron Kohavi, Roger Longbotham:
Seven pitfalls to avoid when running controlled experiments on the web.
1105-1114

- Srivatsava Daruru, Nena M. Marin, Matt Walker, Joydeep Ghosh:
Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data.
1115-1124

- Xiaowen Ding, Bing Liu, Lei Zhang:
Entity discovery and assignment for opinion mining applications.
1125-1134

- Xiaoxi Du, Ruoming Jin, Liang Ding, Victor E. Lee, John H. Thornton Jr.:
Migration motif: a spatial-temporal pattern mining approach for financial markets.
1135-1144

- Ariel Fuxman, Anitha Kannan, Andrew B. Goldberg, Rakesh Agrawal, Panayiotis Tsaparas, John C. Shafer:
Improving classification accuracy using automatically extracted training data.
1145-1154

- Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang, Zhong Su:
Address standardization with latent semantic association.
1155-1164

- Sonal Gupta, Mikhail Bilenko, Matthew Richardson:
Catching the drift: learning broad matches from clickthrough data.
1165-1174

- Mohammad Al Hasan, W. Scott Spangler, Thomas D. Griffin, Alfredo Alba:
COA: finding novel patents through text analysis.
1175-1184

- Shunsuke Hirose, Kenji Yamanishi, Takayuki Nakata, Ryohei Fujimaki:
Network anomaly detection based on Eigen equation compression.
1185-1194

- Wei Jin, Hung Hay Ho, Rohini K. Srihari:
OpinionMiner: a novel machine learning system for web opinion mining and extraction.
1195-1204

- Jongwuk Lee, Seung-won Hwang, Zaiqing Nie, Ji-Rong Wen:
Query result clustering for object-level search.
1205-1214

- Ming Li, M. Benjamin Dias, Ian H. Jarman, Wael El-Deredy, Paulo J. G. Lisboa:
Grocery shopping recommendations based on basket-sensitive random walk.
1215-1224

- Yan Liu, Jayant R. Kalagnanam, Oivind Johnsen:
Learning dynamic temporal graphs for oil-production equipment monitoring system.
1225-1234

- Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi:
Towards combining web classification and web information extraction: a case study.
1235-1244

- Justin Ma, Lawrence K. Saul, Stefan Savage, Geoffrey M. Voelker:
Beyond blacklists: learning to detect malicious web sites from suspicious URLs.
1245-1254

- Adetokunbo Makanju, A. Nur Zincir-Heywood, Evangelos E. Milios:
Clustering event logs using iterative partitioning.
1255-1264

- Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos:
SNARE: a link analytic system for graph labeling and risk detection.
1265-1274

- Prem Melville, Wojciech Gryc, Richard D. Lawrence:
Sentiment analysis of blogs by combining lexical knowledge with text classification.
1275-1284

- Noman Mohammed, Benjamin C. M. Fung, Patrick C. K. Hung, Cheuk-kwong Lee:
Anonymizing healthcare data: a case study on the blood transfusion service.
1285-1294

- Kivanc M. Ozonat, Donald Young:
Towards a universal marketplace over the web: statistical multi-label classification of service provider forms with simulated annealing.
1295-1304

- Debprakash Patnaik, Manish Marwah, Ratnesh K. Sharma, Naren Ramakrishnan:
Sustainable operation and management of data center chillers using temporal data mining.
1305-1314

- B. Aditya Prakash, Nicholas Valler, David Andersen, Michalis Faloutsos, Christos Faloutsos:
BGP-lens: patterns and anomalies in internet routing updates.
1315-1324

- D. Sculley, Robert G. Malkin, Sugato Basu, Roberto J. Bayardo:
Predicting bounce rates in sponsored search advertisements.
1325-1334

- Liang Sun, Rinkal Patel, Jun Liu, Kewei Chen, Teresa Wu, Jing Li, Eric Reiman, Jieping Ye:
Mining brain region connectivity for Alzheimer's disease study via sparse inverse covariance estimation.
1335-1344

- Junfeng Wang, Chun Chen, Can Wang, Jian Pei, Jiajun Bu, Ziyu Guan, Wei Vivian Zhang:
Can we learn a template-independent wrapper for news article extraction from a single training site?
1345-1354

- Kuansan Wang, Toby Walker, Zijian Zheng:
PSkip: estimating relevance ranking quality from web search clickthrough data.
1355-1364

- Gu Xu, Shuang-Hong Yang, Hang Li:
Named entity mining from click-through data using weakly supervised latent Dirichlet allocation.
1365-1374

- Jiang-Ming Yang, Rui Cai, Chunsong Wang, Hua Huang, Lei Zhang, Wei-Ying Ma:
Incorporating site-level knowledge for incremental crawling of web forums: a list-wise strategy.
1375-1384

- Yanfang Ye, Tao Li, Qingshan Jiang, Zhixue Han, Li Wan:
Intelligent file scoring system for malware detection from the gray list.
1385-1394

- Bin Zhou, Daxin Jiang, Jian Pei, Hang Li:
OLAP on search logs: an infrastructure supporting data-driven applications in search engines.
1395-1404
