Course Archives Documentation Research and Training Centre Unit
Course: Data and Text Mining
Time: Currently not offered

Syllabus:
Module 1: Basic Introduction to Data Mining
Unit 01: Data Mining: Introduction, Definitions, Issues and Challenges, Real World Applications
Unit 02: KDD vs DM, DBMS vs DM, DM techniques
Unit 03: Data warehousing and OLAP: Data warehousing: Introduction, Definitions, Multidimensional data model, OLAP and OLAP Engine
Unit 04: Location, Spread, Shape and Dependency
Unit 05: Graphic display of basic statistical description: Boxplot, Histogram, Quantile plot, Quantile-quantile (q-q) plot, Scatter plot
Unit 06: Probability Density Function and Variance-Covariance Matrix: Probability Density Function, Variance-Covariance Matrix
Unit 07: Various Distances, Standardization and Normalization: Metric Space, Similarity and Dissimilarity measures, Minkowski Distance, Euclidean Distance, Mahalanobis Distance, Standardization and Normalization
Unit 08: Association rules: Introduction, Methods to discover association rules, Related Algorithms
Unit 09: Decision trees: Tree construction principle, Decision tree construction algorithm, Presorting
Unit 10: Principal Component Analysis, Cumulative Distribution Function and Confusion Matrix
Module 2: Classification and Clustering Methods for Data Mining and Fuzzy Logic
Unit 11: Classification and Classification Algorithms: Introduction to classification, Bayes Decision Rules, KNN, Other Classification Algorithms
Unit 12: Clustering and Clustering Algorithms: Introduction to Clustering, K-means, DBSCAN, Other Clustering Algorithms
Unit 13: Fuzzy Logic
Module 3: Data Mining Application
Unit 14: Web mining: Content, structure and usage mining, Text mining, Image and multimedia mining

Reference Texts:
1. James, G., D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning with Applications in R, Springer, New York.
2. Witten, I. H., E. Frank, and M. A. Hall, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann.
3. Montgomery, D. C., and G. C. Runger, Applied Statistics and Probability for Engineers, John Wiley & Sons.
4. Shmueli, G., N. R. Patel, and P. C. Bruce, Data Mining for Business Intelligence, John Wiley & Sons, New York.
5. Hastie, T., R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer.
6. McCue, C., Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis, Elsevier.
7. Han, J., and M. Kamber, Data Mining: Concepts and Techniques, Second Edition, Elsevier.

Past Exams
Midterm
 23.pdf 24.pdf
Semestral
Supplementary and Back Paper

[Indian Statistical Institute]