## Teaching - Postgraduate

Draft handbook for MSc Dissertation Projects, summer of 2016. Project proposals may or may not appear on the School blog.

Course material for ID5059 Knowledge Discovery and Data Mining, Semester 2 of 2014-15. The coursework was co-developed with Carl Donovan.

Lecture notes

- Lecture 01 - Introduction - Slides - Notes - Breiman paper
- Lecture 02 - Basis Functions - Slides - Notes - Kondor paper
- Lecture 03 - Model Fit Measures - Slides - Notes - Stokes paper
- Lecture 04 - Model Selection - Slides - Notes - Efron paper
- Lecture 05 - Tree-based Methods - Slides - Notes - Wang et al. paper
- Lecture 06 - Regression and Decision Trees - Slides - Notes - Anderson et al. paper
- Lecture 08 - Regression Trees - Slides
- Lecture 09 - GLM & GAM Regression - Slides
- Lecture 10 - Classification Trees - Slides - Notes
- Lecture 11 - Decision Tree Worked Example - Slides - Notes
- Lecture 12 - Complexity and Numerics - Slides - Notes
- Lecture 13 - Neural Nets I - Slides - Notes
- Lecture 14 - Neural Nets II - Slides - Notes
- Lecture 15 - Bayesian Classification - Slides - Notes - Hsu et al. paper - Leung slides
- Lecture 16 - Classification Evaluation & ROC - Slides - Notes
- Lecture 17 - ROC, AUC & Lift - Slides - Notes
- Lecture 18 - Bootstrapping - Slides - Notes
- Lecture 19 - Bagging - Slides - Notes
- Lecture 20 - Boosting - Slides - Notes - Freund & Schapire paper
- Lecture 21 - Support Vector Machines I - Slides - Notes
- Lecture 22 - Support Vector Machines II - Slides - Notes

Practicals

- Practical 01 - Auto MPG - Spec. - Resources - Data
- Practical 02 - Purchase Probability - Specification, due dates and tips

Tutorials

- Tutorial 01 - Titanic Survival - zip file
- Tutorial 02 - SVM with linear kernel - R code
- Tutorial 02 - another SVM with linear kernel - R code
- Tutorial 02 - SVM with radial kernel - R code

The Elements of Statistical Learning by Hastie, Tibshirani & Friedman is available from Stanford University. There are other useful resources at the same location, including sample R files.

R and Data Mining: Examples and Case Studies by Yanchang Zhao. This is the PDF of a textbook containing many worked examples (in R) together with detailed explanations of the theory and the technicalities involved.

R material for non-geeks is available from the University of California at Davis.

As this appears not to be in the old exam paper repository, I've made available the 2010-20 MT5759 exam. **Warning!** Both the structure and content are likely to change this year; this document gives you some insight into the type of question set for Maths masters students.

Online R tutorials are available from DataCamp, Code School, and RStudio.