Practical Machine Learning in Finance is a hands-on course that teaches you how to apply machine learning to real investment problems. You’ll learn by doing—building classification models, engineering features, handling class imbalance, evaluating performance, and avoiding common pitfalls. Designed for finance professionals and students, the course focuses on practical implementation, not just theory. By the end, you’ll know how to take a real-world dataset and turn it into a working ML model you can trust.
In this section, we will cover one of the most important parts of machine learning (if not THE most important): creating good features. Creating features is not just a science; it is also an art. This section teaches some basic concepts about how to create features, and then walks through the process of creating the features used throughout the course.
In this section, we will apply what we learned in the Introduction to Feature Engineering section and design the features we will use in our case study: predicting whether the S&P 500 will go up or down next month.
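As a minimal sketch of what such a feature-and-target setup might look like, the snippet below builds a one-feature dataset from hypothetical monthly S&P 500 closing prices. The price values, column names, and one-month lookback are placeholders for illustration, not the course's actual data:

```python
import pandas as pd

# Hypothetical monthly S&P 500 closing prices (placeholder values).
prices = pd.Series(
    [4100.0, 4250.0, 4180.0, 4300.0, 4280.0, 4400.0],
    index=pd.period_range("2023-01", periods=6, freq="M"),
)

monthly_return = prices.pct_change()    # feature: last month's return
next_return = monthly_return.shift(-1)  # next month's return (unknown at prediction time)

# Drop rows where either the feature or the label is undefined,
# then form the binary up/down target for the following month.
df = pd.DataFrame({"prev_return": monthly_return, "next_return": next_return}).dropna()
df["up_next_month"] = (df["next_return"] > 0).astype(int)
print(df)
```

Note the `shift(-1)`: the label for each month is built from the *following* month's return, which is exactly the kind of alignment detail that prevents look-ahead leakage.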
This section will present one of the most important concepts in quantitative finance and machine learning: linear regression. Supervised learning tasks can either predict a quantity (we call these regression tasks) or predict a category (think up or down; we call these classification tasks). Understanding the key aspects of regression gives you the foundation for all of machine learning.
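To make the regression idea concrete, here is a minimal sketch (with toy data, not anything from the course) that fits a line y ≈ a·x + b by ordinary least squares using NumPy:

```python
import numpy as np

# Toy data that is exactly linear, so least squares should recover the line.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0  # true slope 2, intercept 1

# Design matrix with a column of ones for the intercept term.
A = np.column_stack([x, np.ones_like(x)])
solution, *_ = np.linalg.lstsq(A, y, rcond=None)
a, b = solution  # fitted slope and intercept
```

With noise-free data the fitted coefficients match the true slope and intercept; with real financial data the fit is only a best approximation, which is where evaluation metrics come in.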
The power of classification tasks comes from understanding the confusion matrix. This section will introduce you to evaluation metrics such as accuracy, precision, recall, and the F1 score. Understanding what these metrics measure and when to apply them is your key to success!
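As an illustration (with made-up labels and predictions, not course results), each of these metrics can be computed directly from the four confusion-matrix counts:

```python
import numpy as np

# Hypothetical true labels and model predictions (1 = up, 0 = down).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# The four cells of the confusion matrix.
tp = np.sum((y_true == 1) & (y_pred == 1))  # true positives
tn = np.sum((y_true == 0) & (y_pred == 0))  # true negatives
fp = np.sum((y_true == 0) & (y_pred == 1))  # false positives
fn = np.sum((y_true == 1) & (y_pred == 0))  # false negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```

Precision asks "of the ups I predicted, how many were right?", recall asks "of the actual ups, how many did I catch?", and F1 is their harmonic mean.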
In this section, we will address our case study with our first modelling algorithm: logistic regression.
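A minimal sketch of fitting a logistic regression with scikit-learn follows; the single synthetic feature and the data-generating rule are invented for illustration and are not the course's actual setup:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic toy data: one feature (think "last month's return"),
# binary up/down target that depends noisily on the feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression()
model.fit(X, y)

# predict_proba returns [P(down), P(up)] for each sample.
proba = model.predict_proba([[0.02]])
```

Unlike linear regression, the output is a probability between 0 and 1, which is then thresholded (by default at 0.5) to produce the up/down call.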
In this section, we will show how to address one of the main weaknesses of linear models: their inability to capture non-linearity. Generalized Additive Models (GAMs) take the first step by using splines to better fit the data, without adding too much complexity.
While GAMs can do a great job addressing non-linearity, they are still limited when dealing with data that has complex relationships. Decision trees are a very powerful next step.
While decision trees are a huge step up in addressing non-linearity, they do have many weaknesses. Random Forests address many of them.
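A minimal Random Forest sketch with scikit-learn on synthetic data follows; the sample size, number of trees, and other parameters are illustrative defaults, not tuned values from the course:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (placeholder for real features).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An ensemble of decorrelated trees: each tree sees a bootstrap sample
# of the rows and a random subset of features at every split.
forest = RandomForestClassifier(n_estimators=200, max_features="sqrt", random_state=0)
forest.fit(X_tr, y_tr)
acc = forest.score(X_te, y_te)  # held-out accuracy
```

Averaging many decorrelated trees reduces the variance (overfitting) of a single deep tree, which is the main weakness the forest is designed to fix.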