Machine Learning Using Python

MEAFA Professional Development Workshop

Welcome to the material page for the Machine Learning section of the MEAFA professional development workshop on Machine Learning Using Python.


Setting up Python for the workshop

Instructions for setting up a Python environment. Although computers will be provided, you are strongly encouraged to use your own laptop so that you can continue working with these tools immediately after the workshop. If you need help with the installation, we will provide it on the first day of the workshop.

Installing additional Python packages. The workshop will rely on the additional machine learning and data visualisation packages listed here.
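
Once the packages are installed, a quick check such as the sketch below can confirm that they are importable before the first session. The package names in it are placeholders chosen for illustration, not the workshop's actual list, and the helper name missing_packages is hypothetical; substitute the import names from the installation instructions.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if find_spec(name) is None]


# Placeholder list for illustration only (an assumption, not the workshop's list).
# Note that the import name can differ from the install name,
# e.g. scikit-learn is imported as "sklearn".
EXAMPLE_PACKAGES = ["numpy", "pandas", "sklearn"]

if __name__ == "__main__":
    missing = missing_packages(EXAMPLE_PACKAGES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only looks the modules up without importing them, so it runs quickly even when the packages themselves are slow to load.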


Recommended reading

A Few Useful Things to Know About Machine Learning (Pedro Domingos). An overview of the essential lessons from applied machine learning. We will explore these concepts extensively in the workshop.

Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (Aurélien Géron). My recommendation for those who would like to have a book reference for the topics covered in the workshop.


Workshop Resources

Datasets for the Python section of the workshop

GitHub repository for Machine Learning Using Python

Datasets for the machine learning section of the workshop

Statistical learning module


Lessons

Lesson 1: Introduction to Machine Learning.

Lesson Notes

Notebook Viewer

Lesson 2: Regularised Linear Models.

Lesson Notes

Notebook Viewer

Lesson 3: Naive Bayes.

Lesson Notes

Notebook Viewer

Lesson 4: Logistic Regression and Optimal Decisions.

Lesson Notes

Notebook Viewer

Lesson 5: Decision Trees and Random Forests.

Lesson Notes

Notebook Viewer

Lesson 6: Boosting.

Lesson Notes

Notebook Viewer

Suggested reading: Introduction to Boosted Trees (from the XGBoost documentation).

Lesson 7: Ensemble Learning and Model Stacking.

Lesson Notes

Notebook Viewer

Lesson 8: Application to a Kaggle Regression Competition.

Notebook Viewer

Lesson 9: Support Vector Machines.

Lesson Notes

Notebook Viewer

Lesson 10: Neural Networks.

Lesson Notes

Notebook Viewer (Regression)

Notebook Viewer (Classification)


References

The lesson notes draw material from the following references, including some figures.

The Elements of Statistical Learning by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.

An Introduction to Statistical Learning by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.

Machine Learning: A Probabilistic Perspective by Kevin P. Murphy.


Directions for further study

Deep Learning by Google

Introduction to Apache Spark and AWS, Big Data Analysis with Apache Spark, or another similar course. For those who want to apply machine learning to massive datasets.

Unsupervised Learning