Related Work in Machine Learning

CFP: Data Mining, Analytics, Big Data, Data Science, and Knowledge Discovery

Books

Mining of Massive Datasets, Jure Leskovec, Anand Rajaraman, Jeff Ullman
A Course in Machine Learning, Hal Daumé III
Elements of Statistical Learning, Trevor Hastie, Robert Tibshirani, Jerome Friedman
Pattern Recognition and Machine Learning, C. Bishop

Conferences (Conferences in CS)

Intl. Conf. on Machine Learning (ICML) [CFP: ~Jan]
ACM Intl. Conf. on Knowledge Discovery and Data Mining (KDD) [CFP: ~Mar]
IEEE Intl. Conf. on Data Mining (ICDM) [CFP: ~May]
SIAM Intl. Conf. on Data Mining (SDM) [CFP: ~Oct]
Neural Information Processing Systems (NIPS) [CFP: ~Jul]
National Conf. on AI (AAAI) and Innovative Applications of AI (IAAI) [CFP: ~Sep]
Intl. Joint Conf. on AI (IJCAI) [CFP: ~Jan]
Conference on Computational Learning Theory (COLT) [CFP: ~Jan]
European Conference on Machine Learning (ECML) [CFP: ~Mar]
Genetic and Evolutionary Computation Conference (GECCO) [CFP: ~Apr]
Inductive Logic Programming (ILP) [CFP: ~Jan]
Florida AI Research Society (FLAIRS) [CFP: ~Oct]

Journals (Journals in CS)

Projects

Papers

Decision Trees

Decision Tree Induction Based on Efficient Tree Restructuring P. Utgoff, N. Berkman & J. Clouse. Machine Learning, 29(1):5-44, 1997.

Rules

Learning Decision Rules by Randomized Iterative Local Search, Chisholm, M. and Tadepalli, P. ICML, 2002.
Lightweight Rule Induction S. Weiss & N. Indurkhya. ICML, p.1135-42. 2000.
A Simple, Fast, and Effective Rule Learner W. Cohen & Y. Singer, Proc. AAAI, 1999.
Beyond Market Baskets: Generalizing Association Rules to Dependence Rules, C. Silverstein, S. Brin & R. Motwani, Data Mining and Knowledge Discovery, January 1998, pp. 39-68.
Fast Effective Rule Induction W. Cohen. Proc. ICML, 1995.
Fast Algorithms for Mining Association Rules, R. Agrawal & R. Srikant Proc. VLDB, 1994.
Mining Associations between Sets of Items in Massive Databases R. Agrawal, T. Imielinski and A. Swami. Proc. SIGMOD, p207-216, 1993.
The CN2 Induction Algorithm, P. Clark & T. Niblett. Machine Learning, 3(4), p261-283, 1989.

Clustering

TraClass: Trajectory Classification Using Hierarchical Region-Based and Trajectory-Based Clustering. J. Lee, J. Han, X. Li, H. Gonzalez. VLDB, pp 140-149, 2008.
Graph-Based Hierarchical Conceptual Clustering, I. Jonyer, L. Holder & D. Cook, J. Machine Learning Research, 2:19-43, 2001.
LOF: Identifying Density-Based Local Outliers, M. Breunig, H. Kriegel, R. Ng & J. Sander, SIGMOD, pp. 93-104, 2000.
ROCK: A Robust Clustering Algorithm for Categorical Attributes, S. Guha, R. Rastogi & K. Shim, Information Systems, 25(5):345-366, 2000.
Learning to match and cluster large high-dimensional data sets for data integration, W. Cohen & J. Richman, KDD, 2002.
CHAMELEON: Hierarchical Clustering Algorithm using Dynamic Modeling G. Karypis, E. Han & V. Kumar. IEEE Computer, 1999.
Clustering categorical data: An approach based on dynamical systems, D. Gibson, J. Kleinberg & P. Raghavan, Proc. VLDB, 1998.
An Efficient Approach to Clustering in Large Multimedia Databases with Noise, A. Hinneburg & D. Keim. Proc. KDD, 1998.

Probabilistic Models

Probabilistic Models of Relational Structure, L. Getoor, N. Friedman, D. Koller & B. Taskar. ICML, 2001.

Learning Formal Languages, Automata

MDL-Based Context-Free Graph Grammar Induction, I. Jonyer, L. Holder & D. Cook, Proc. FLAIRS, 2003.
Learning context-free grammars with a simplicity bias, P. Langley & S. Stromsten, Proc. ECML (pp. 220-228) 2000.

Cost-sensitive learning and imbalanced class distributions

A Fully Distributed Framework for Cost-Sensitive Data Mining, W. Fan, H. Wang, P. Yu, S. Stolfo, Proc. ICDCS, p. 445-446, 2002.
Reducing multiclass to binary by coupling probability estimates, B. Zadrozny, Advances in Neural Information Processing Systems 14 (NIPS*2001), 2001.
Learning and Making Decisions When Costs and Probabilities are Both Unknown, B. Zadrozny & C. Elkan, Proc. of the Seventh Intl. Conf. on Knowledge Discovery and Data Mining, 2001.
Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers, B. Zadrozny & C. Elkan, Proceedings of the Eighteenth Intl. Conf. on Machine Learning, 2001.
The Foundations of Cost-Sensitive Learning, C. Elkan, Proc. of the Seventeenth Intl. Joint Conf. on Artificial Intelligence, 2001.

Outlier/Anomaly Detection

Trajectory Outlier Detection: A Partition-and-Detect Framework. J. Lee, J. Han, and X. Li, ICDE, 2008.
ROAM: Rule- and Motif-Based Anomaly Detection in Massive Moving Object Data Sets. X. Li, J. Han, S. Kim & H. Gonzalez. SIAM Intl. Conf. Data Mining (SDM), 2007.
Neighborhood based detection of anomalies in high dimensional spatio-temporal sensor datasets. N. Adam, V. Janeja & V. Atluri. ACM Symp. on Applied Computing (SAC), pp. 576-583, 2004.
Outlier Detection for High Dimensional Data, C. Aggarwal & P. Yu, SIGMOD, 2001.
LOF: Identifying Density-Based Local Outliers, M. Breunig, H. Kriegel, R. Ng & J. Sander, SIGMOD, pp. 93-104, 2000.

ML and Natural Language Processing

Grounded Spoken Language Acquisition: Experiments in Word Learning D. Roy, IEEE Transactions on Multimedia, 2003 (in press).
An unsupervised algorithm for segmenting categorical timeseries into episodes P. Cohen, B. Heeringa & N. Adams, Proc. IEEE Intl Conf. Data Mining, 2002.
Learning Words and Syntax for a Visual Description Task D. Roy, Computer Speech and Language, 16(3), 2002.
Learning Visually Grounded Words and Syntax of Natural Spoken Language D. Roy, Evolution of Communication 4(1), p33-56, 2000/01.
An efficient, probabilistically sound algorithm for segmentation and word discovery, M. Brent. Machine Learning, 34(1):71-106, 1999.
Speech segmentation and word discovery: A computational perspective, M. Brent. Trends in Cognitive Science, 3:294-301, 1999.
Similarity-Based Models of Word Cooccurrence Probabilities, I. Dagan, L. Lee & F. Pereira, Machine Learning, 34(1):43-69, 1999.
A maximum entropy approach to natural language processing A. Berger, S. Pietra, and V. Pietra, Computational Linguistics, 22(1), 1996.
Good bigrams, C. Johansson, In Proceedings of COLING-96, pages 592--597, 1996.
Parsing a Natural Language Using Mutual Information Statistics D. Magerman & M. Marcus, AAAI, p984-989, 1990.

ML and Time Series (Temporal Data)

Estimating the number of segments in time series data using permutation tests, K. Vasko & H. Toivonen, ICDM, 2002.
An Online Algorithm for Segmenting Time Series, E. Keogh, S. Chu, D. Hart, & M. Pazzani. IEEE Intl. Conf. Data Mining, p289-296, 2001.
Event Detection from Time Series Data, V. Guralnik & J. Srivastava, KDD, 1999.
Clustering Time Series with Hidden Markov Models and Dynamic Time Warping, T. Oates, L. Firoiu & P. Cohen. IJCAI Workshop on Sequence Learning, 1999.
Minimum Message Length Segmentation, J. Oliver. PAKDD, p222-233, 1998.
Learning to Classify Sensor Data, S. Manganaris. IJCAI Workshop on Machine Learning in Engineering, 1995.
Learning Time Series for Intelligent Monitoring S. Manganaris & D. Fisher. Third Intl. Symp. on AI, Robotics, and Automation for Space p71-74, 1994.

Process models

Discovering ecosystem models from time-series data, George, D., Saito, K., Langley, P., Bay, S., & Arrigo, K., Proceedings of the Sixth Intl. Conf. on Discovery Science.
Robust induction of process models from time-series data, Langley, P., George, D., Bay, S., & Saito, K., Proceedings of the Twentieth Intl. Conf. on Machine Learning (pp. 432-439), 2003.

Statistics

Selecting the Right Interestingness Measure for Association Patterns, P. Tan, V. Kumar & J. Srivastava, Proc of the Eighth ACM SIGKDD (KDD-2002)
Zipf, Power-laws, and Pareto - a ranking tutorial, L. Adamic, Xerox PARC, 2000.
Interestingness Measures for Association Patterns, P. Tan & V. Kumar, KDD 2000 Workshop on Postprocessing in Machine Learning and Data Mining.
Efficient Bayesian Parameter Estimation in Large Discrete Domains, N. Friedman & Y. Singer, NIPS, 1999.
An empirical study of smoothing techniques for language modeling S. Chen & J. Goodman, Proceedings of the 34th Meeting of the Association for Computational Linguistics, pp 310--31, 1996. TR version: TR-10-98, Computer Science Group, Harvard University, 1998.
Good-Turing Smoothing Without Tears W. Gale, 1994. [W. Gale & G. Sampson, Good-Turing frequency estimation without tears, Journal of Quantitative Linguistics 2:217-37, 1995]
What's Wrong with Adding One?, W. Gale and K. Church, In N. Oostdijk and P. de Haan (eds.), Corpus-Based Research into Language: In honour of Jan Aarts, Rodopi, Amsterdam, pp. 189-200, 1994.
The zero frequency problem: estimating the probabilities of novel events in adaptive text compression I. Witten and T. Bell, CS Tech Report: 1989-347-09, Univ. of Calgary, 1989. [also in IEEE Trans. on Info. Theory, 37(4):1085-1094, 1991]

Information Theory

A mathematical theory of communication, C. Shannon, Bell System Technical Journal, vol. 27, pp. 379-423 and 623-656, July and October, 1948.

Philip Chan, pkc@cs.fit.edu

Last modified: Thu Jun 5 15:31:22 EDT 2008
