3. ICDM 2003: Melbourne, Florida, USA
Proceedings of the 3rd IEEE International Conference on Data Mining (ICDM 2003), 19-22 December 2003, Melbourne, Florida, USA. IEEE Computer Society 2003 ISBN 0-7695-1978-4
Research-Track Regular Papers
Amihood Amir, Reuven Kashi, Nathan S. Netanyahu: Efficient Multidimensional Quantitative Hypotheses Generation. 3-10
Francesco Bonchi, Fosca Giannotti, Alessio Mazzanti, Dino Pedreschi: ExAMiner: Optimized Level-wise Frequent Pattern Mining with Monotone Constraint. 11-18
Fabien De Marchi, Jean-Marc Petit: Zigzag: a new algorithm for mining large inclusion dependencies in database. 27-34
Mukund Deshpande, Michihiro Kuramochi, George Karypis: Frequent Sub-Structure-Based Approaches for Classifying Chemical Compounds. 35-42
Joseph Elble, Cinda Heeren, Leonard Pitt: Optimized Disjunctive Association Rules via Sampling. 43-50
Wei Fan, Haixun Wang, Philip S. Yu, Sheng Ma: Is random model better? On its accuracy and efficiency. 51-58
Lewis Frey, Douglas H. Fisher, Ioannis Tsamardinos, Constantin F. Aliferis, Alexander R. Statnikov: Identifying Markov Blankets with Decision Tree Induction. 59-66
Robert Gwadera, Mikhail J. Atallah, Wojciech Szpankowski: Reliable Detection of Episodes in Event Sequences. 67-74
Chihli Hung, Stefan Wermter: A Dynamic Adaptive Self-Organising Hybrid Model for Text Clustering. 75-82
Akihiro Inokuchi, Hisashi Kashima: Mining Significant Pairs of Patterns from Graph Structures with Class Labels. 83-90
Huidong Jin, Man Leung Wong, Kwong-Sak Leung: Scalable Model-based Clustering by Working on Data Summaries. 91-98
Hillol Kargupta, Souptik Datta, Qi Wang, Krishnamoorthy Sivakumar: On the Privacy Preserving Properties of Random Data Perturbation Techniques. 99-106
Noriaki Kawamae, Takeya Mukaigaito, Hanaki Miyoshi: Semantic Log Analysis Based on a User Query Behavior Model. 107-114
Eamonn J. Keogh, Jessica Lin, Wagner Truppel: Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research. 115-122
Jeremy Z. Kolter, Marcus A. Maloof: Dynamic Weighted Majority: A New Ensemble Method for Tracking Concept Drift. 123-130
Aleksandar Lazarevic, Ramdev Kanapady, Chandrika Kamath, Vipin Kumar, Kumar K. Tamma: Localized Prediction of Continuous Target Variables Using Hierarchical Clustering. 139-146
Qi Li, Jieping Ye, Chandra Kambhamettu: Spatial Interest Pixels (SIPs): Useful Low-Level Features of Visual Media Data. 163-170
Shou-De Lin, Hans Chalupsky: Unsupervised Link Discovery in Multi-relational Data via Rarity Analysis. 171-178
Bing Liu, Yang Dai, Xiaoli Li, Wee Sun Lee, Philip S. Yu: Building Text Classifiers Using Positive and Unlabeled Examples. 179-188
Levon Lloyd, Steven Skiena: Parsing Without a Grammar: Making Sense of Unknown File Formats. 195-202
Srujana Merugu, Joydeep Ghosh: Privacy-preserving Distributed Clustering using Generative Models. 211-218
Taneli Mielikäinen: Change Profiles. 219-226
Olfa Nasraoui, Cesar Cardona Uribe, Carlos Rojas Coronel, Fabio A. González: TECNO-STREAMS: Tracking Evolving Clusters in Noisy Data Streams with a Scalable Immune System Learning Model. 235-242
Cheong Hee Park, Haesun Park: Efficient Nonlinear Dimension Reduction for Clustered Data Using Kernel Functions. 243-250
Dmitry Pavlov: Sequence Modeling with Mixtures of Conditional Maximum Entropy Distributions. 251-258
Jian Pei, Xiaoling Zhang, Moonjung Cho, Haixun Wang, Philip S. Yu: MaPle: A Fast Algorithm for Maximal Pattern-based Clustering. 259-266
Kang Peng, Slobodan Vucetic, Bo Han, Hongbo Xie, Zoran Obradovic: Exploiting Unlabeled Data for Improving Accuracy of Predictive Data Mining. 267-274
Alexandrin Popescul, Lyle H. Ungar, Steve Lawrence, David M. Pennock: Statistical Relational Learning for Document Mining. 275-282
Saharon Rosset, Einat Neumann: Integrating Customer Value Considerations into Predictive Modeling. 283-290
Assaf Schuster, Ran Wolff, Dan Trock: A High-Performance Distributed Algorithm for Mining Association Rules. 291-298
Xingzhi Sun, Maria E. Orlowska, Xue Li: Introducing Uncertainty into Pattern Discovery in Temporal Event Sequences. 299-306
Zehang Sun, George Bebis, Ronald Miller: Evolutionary Gabor Filter Optimization with Application to Vehicle Detection. 307-314
Einoshin Suzuki, Takeshi Watanabe, Hideto Yokoi, Katsuhiko Takabayashi: Detecting Interesting Exceptions from Medical Test Data with Visual Summarization. 315-322
Fengzhan Tian, Hongwei Zhang, Yuchang Lu: Learning Bayesian Networks from Incomplete Data Based on EMI Method. 323-330
Shusaku Tsumoto, Shoji Hirano: Visualization of Rule's Similarity using Multidimensional Scaling. 339-346
Jörg A. Walter, Jörg Ontrup, Daniel Wessling, Helge Ritter: Interactive Visualization and Navigation in Large Data Collections using the Hyperbolic Space. 355-362
Raymond Chi-Wing Wong, Ada Wai-Chee Fu, Ke Wang: MPIS: Maximal-Profit Item Selection with Cross-Selling Considerations. 371-378
Yongqiao Xiao, Jenq-Foung Yao, Zhigang Li, Margaret H. Dunham: Efficient Data Mining for Maximal Frequent Subtrees. 379-386
Hui Xiong, Pang-Ning Tan, Vipin Kumar: Mining Strong Affinity Association Patterns in Data Sets with Skewed Support Distribution. 387-394
Guizhen Yang, Saikat Mukherjee, I. V. Ramakrishnan: On Precision and Recall of Multi-Attribute Data Extraction from Semistructured Sources. 395-402
Yinghui Yang, Balaji Padmanabhan: Segmenting Customer Transactions Using a Pattern-Based Clustering Approach. 411-418
Jieping Ye, Ravi Janardan, Cheong Hee Park, Haesun Park: A new optimization criterion for generalized discriminant analysis on undersampled problems. 419-426
Jeonghee Yi, Tetsuya Nasukawa, Razvan C. Bunescu, Wayne Niblack: Sentiment Analyzer: Extracting Sentiments about a Given Topic using Natural Language Processing Techniques. 427-434
Bianca Zadrozny, John Langford, Naoki Abe: Cost-Sensitive Learning by Cost-Proportionate Example Weighting. 435-
Hua-Jun Zeng, Xuanhui Wang, Zheng Chen, Hongjun Lu, Wei-Ying Ma: CBC: Clustering Based Text Classification Requiring Minimal Labeled Data. 443-450
Bin Zhang: Regression Clustering. 451-
Research-Track Short Papers
Reda Alhajj, Mehmet Kaya: Integrating Fuzziness into OLAP for Multidimensional Fuzzy Association Rules Mining. 469-472
Amihood Amir, Reuven Kashi, Nathan S. Netanyahu, Daniel A. Keim, Markus Wawryniuk: Analyzing High-Dimensional Data by Subspace Validity. 473-476
Aijun An, Shakil M. Khan, Xiangji Huang: Objective and Subjective Algorithms for Grouping Association Rules. 477-480
Tassos Argyros, Charis Ermopoulos: Efficient Subsequence Matching in Time Series Databases Under Time and Amplitude Transformations. 481-484
James Bailey, Thomas Manoukian, Kotagiri Ramamohanarao: A Fast Algorithm for Computing Hypergraph Transversals and its Application in Mining Emerging Patterns. 485-488
Daniel Barbará, Carlotta Domeniconi, Ning Kang: Mining Relevant Text from Unlabelled Documents. 489-492
Julien Blanchard, Fabrice Guillet, Henri Briand: A User-driven and Quality-oriented Visualization for Mining Association Rules. 493-496
Doina Caragea, Dianne Cook, Vasant Honavar: Towards Simple, Easy-to-Understand, yet Accurate Classifiers. 497-500
Ping Chen, Chenyi Hu, Wei Ding, Heloise Lynn, Yves Simon: Icon-based Visualization of Large High-Dimensional Datasets. 505-508
Frans Coenen, Paul H. Leng, Shakil Ahmed: T-Trees, Vertical Partitioning and Distributed Association Rule Mining. 513-516
Inderjit S. Dhillon, Yuqiang Guan: Information Theoretic Clustering of Sparse Co-Occurrence Data. 517-520
François Fouss, Marco Saerens, Jean-Michel Renders: Links Between Kleinberg's Hubs and Authorities, Correspondence Analysis, and Markov Chains. 521-524
Pasi Fränti, Olli Virmajoki, Ville Hautamäki: Fast PNN-based Clustering Using K-nearest Neighbor Graph. 525-528
Lawrence O. Hall, Kevin W. Bowyer, Robert E. Banfield, Divya Bhadoria, W. Philip Kegelmeyer, Steven Eschrich: Comparing Pure Parallel Ensemble Creation Techniques Against Bagging. 533-536
Edwin O. Heierman III, Diane J. Cook: Improving Home Automation by Discovering Regularly Occurring Device Usage Patterns. 537-540
Chun-Nan Hsu, Hao-Hsiang Chung, Han-Shen Huang: The Hybrid Poisson Aspect Model for Personalized Shopping Recommendation. 545-548
Jin Huang, Jingjing Lu, Charles X. Ling: Comparing Naive Bayes, Decision Trees, and SVM with AUC and Accuracy. 553-556
Joarder Kamruzzaman, Ruhul A. Sarker, Iftekhar Ahmad: SVM Based Models for Predicting Foreign Currency Exchange Rates. 557-560
Mehmet Kaya, Reda Alhajj: Facilitating Fuzzy Association Rules Mining by Using Multi-Objective Genetic Algorithms for Automated Clustering. 561-564
Daniel A. Keim, Christian Panse, Mike Sips, Stephen C. North: PixelMaps: A New Visual Data Mining Approach for Analyzing Large Spatial Data Sets. 565-568
Mark-A. Krogel, Tobias Scheffer: Effectiveness of Information Extraction, Multi-Relational, and Semi-Supervised Learning for Predicting Functional Properties of Genes. 569-572
Jeremy Kubica, Andrew W. Moore, Jeff G. Schneider: Tractable Group Detection on Large Link Data Sets. 573-576
Longin Jan Latecki, Rajagopal Venugopal, Marc Sobel, Steve Horvat: Tree-structured Partitioning Based on Splitting Histograms of Distances. 577-580
Young-Koo Lee, Won-Young Kim, Y. Dora Cai, Jiawei Han: CoMine: Efficient Mining of Correlated Patterns. 581-584
Tao Li, Shenghuo Zhu, Mitsunori Ogihara: Using Discriminant Analysis for Multi-class Classification. 589-592
Matthew V. Mahoney, Philip K. Chan: Learning Rules for Anomaly Detection of Hostile Network Traffic. 601-604
Frédéric Maire: An Algorithm for the Exact Computation of the Centroid of Higher Dimensional Polyhedra and its Application to Kernel Machines. 605-608
Jennifer Neville, David Jensen, Brian Gallagher: Simple Estimators for Relational Bayesian Classifiers. 609-612
Stanley R. M. Oliveira, Osmar R. Zaïane: Protecting Sensitive Knowledge By Data Sanitization. 613-616
Matthew Eric Otey, Chao Wang, Srinivasan Parthasarathy, Adriano Veloso, Wagner Meira Jr.: Mining Frequent Itemsets in Distributed and Dynamic Databases. 617-620
Hanchuan Peng, Chris H. Q. Ding: Structure Search and Stability Enhancement of Bayesian Networks. 621-624
Huseyin Polat, Wenliang Du: Privacy-Preserving Collaborative Filtering Using Randomized Perturbation Techniques. 625-628
Sameer S. Pradhan, Kadri Hacioglu, Wayne Ward, James H. Martin, Daniel Jurafsky: Semantic Role Parsing: Adding Semantic Structure to Unstructured Text. 629-632
Michèle Sebag, Jérôme Azé, Noël Lucas: Impact Studies and Sensitivity Analysis in Medical Data Mining with ROC-based Genetic Learning. 637-640
Tomoyuki Shibata, Takekazu Kato, Toshikazu Wada: K-D Decision Tree: An Accelerated and Memory Efficient Nearest Neighbor Classifier. 641-644
Horia-Nicolai L. Teodorescu, Lucian Iulian Fira: A Hybrid Data-Mining Approach in Genomics and Text Structures. 649-652
Kai Ming Ting, Regina Jing Ying Quek: Model Stability: A key factor in determining whether an algorithm produces an optimal model from a matching distribution. 653-656
Jyh-Jong Tsay, Hsuan-Yu Chen, Chi-Feng Chang, Ching-Han Lin: Enhancing Techniques for Efficient Topic Hierarchy Integration. 657-660
Shusaku Tsumoto, Shoji Hirano: Pattern Discovery based on Rule Induction and Taxonomy Generation. 661-664
Juan D. Velásquez, Hiroshi Yasuda, Terumasa Aoki: Combining the web content and usage mining to understand the visitor behavior in a web site. 669-672
Ricardo Vilalta, Murali-Krishna Achari, Christoph F. Eick: Class Decomposition via Clustering: A New Framework for Low-Variance Classifiers. 673-676
Arkadiusz Wojna: Center-Based Indexing for Nearest Neighbors Search. 681-684
Qiang Yang, Jie Yin, Charles X. Ling, Tielin Chen: Postprocessing Decision Trees to Extract Actionable Knowledge. 685-688
Hwanjo Yu: General MC: Estimating Boundary of Positive Class from Small Positive Data. 693-696
Ching-Huang Yun, Kun-Ta Chuang, Ming-Syan Chen: Clustering Item Data Sets with Association-Taxonomy Similarity. 697-700
Peng Zhang, Jing Peng, Carlotta Domeniconi: Dimensionality Reduction Using Kernel Pooled Local Discriminant Information. 701-704
Zhaohui Zheng, Rohini K. Srihari, Sargur N. Srihari: A Feature Selection Framework for Text Filtering. 705-708
Hongwei Zhu, Otman A. Basir: A K-NN Associated Fuzzy Evidential Reasoning Classifier with Adaptive Neighbor Selection. 709-
Industry-Track Papers
Frank Dellmann, Holger Wulff, Stefan Schmitz: Findings from a Practical Project Concerning Web Usage Mining. 715-718
Qinghua Guo, Maggi Kelly, Catherine Graham: Predicting distribution of a new forest disease using one-class SVMs. 719-722
Rajat Gupta, B. V. L. Narayana, P. Krishna Reddy, G. V. Ranga Rao, C. L. L. Gowda, Y. V. R. Reddy, Garimella Rama Murthy: Understanding Helicoverpa armigera Pest Population Dynamics related to Chickpea Crop Using Neural Networks. 723-726
Jutta Kreyß, Steve Selvaggio, Michael White, Zach Zakharian: Text Mining for a Clear Picture of Defect Reports: A Praxis Report. 727-730
Mingkun Li, Shuo Feng, Ishwar K. Sethi, Jason Luciow, Keith Wagner: Mining Production Data with Neural Network & CART. 731-734
Byung-Hoon Park, George Ostrouchov, Gong-Xin Yu, Al Geist, Andrey Gorin, Nagiza F. Samatova: Inference of Protein-Protein Interactions by Unlikely Profile Pair. 735-738
Tu Minh Phuong, Doheon Lee, Kwang Hyung Lee: Regulatory Element Discovery Using Tree-structured Models. 739-742
Choh-Man Teng: Applying Noise Handling Techniques to Genomic Data: A Case Study. 743-746
Kaidi Zhao, Bing Liu, Thomas M. Tirpak, Andreas Schaller: Detecting Patterns of Change Using Enhanced Parallel Coordinates Visualization. 747-