In this series of liveProjects, you’ll expand your understanding of machine learning feature engineering by building an ML application that predicts and diagnoses diabetes. You’ll process ML features, identify the features most useful for training your model, and build a scalable process to score new data. Once you’ve developed these features, you will redesign your solution to make them reusable with an ML feature store. Each liveProject in this series focuses on a different aspect of machine learning feature engineering, so you can pick and choose what’s most relevant to your work.
Project: Creating Features
In this liveProject, you’ll process raw data to make it ready for a machine learning model that diagnoses diabetes. You’ll use feature engineering techniques to generate ML features from raw data. To make sense of your data, you will profile it, perform exploratory data analysis, analyze the independent and dependent variables, and visualize data patterns. You’ll evaluate the correlation between dependent and independent variables to identify relevant features, and generate additional features as needed. Finally, you will apply feature engineering techniques such as treating missing values and outliers to make your features ready for model training.
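The cleaning and correlation steps described above might look roughly like the sketch below. It is not the project's own solution: the column names (`glucose`, `bmi`, `diabetic`), the toy values, and the helper functions `impute_median` and `clip_outliers_iqr` are all hypothetical, and median imputation with IQR clipping is only one of several reasonable ways to treat missing values and outliers.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names and values are invented for illustration.
raw = pd.DataFrame({
    "glucose": [85.0, 168.0, np.nan, 130.0, 99.0, 155.0, 90.0, 600.0],
    "bmi": [22.1, 33.6, 28.0, 31.2, np.nan, 35.4, 24.3, 30.1],
    "diabetic": [0, 1, 0, 1, 0, 1, 0, 1],
})


def impute_median(df, columns):
    """Fill missing values in each listed column with that column's median."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].fillna(out[col].median())
    return out


def clip_outliers_iqr(df, columns, k=1.5):
    """Clip values lying more than k * IQR outside the first/third quartiles."""
    out = df.copy()
    for col in columns:
        q1 = out[col].quantile(0.25)
        q3 = out[col].quantile(0.75)
        iqr = q3 - q1
        out[col] = out[col].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return out


feature_cols = ["glucose", "bmi"]
features = clip_outliers_iqr(impute_median(raw, feature_cols), feature_cols)

# Pearson correlation of each independent variable with the dependent variable.
correlations = features[feature_cols].corrwith(features["diabetic"])
print(correlations.sort_values(ascending=False))
```

Clipping keeps the extreme row in the dataset rather than dropping it, which matters when the data is small; dropping or flagging outliers are equally valid choices depending on what the profiling step reveals.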
Project: Train and Score with Raw Data
In this liveProject, you’ll train and evaluate a machine learning model for diagnosing diabetes, and set up a pipeline for your model to run effectively. You’ll start by exploring sample data, processing features, and performing common feature engineering techniques for treating outliers or missing data. After dividing your dataset into training and testing data, you’ll train a logistic regression model using scikit-learn. You will then retrain the model with a different set of features. Finally, you’ll pick a model for scoring and build a scoring pipeline. You will test your scoring process on a scoring dataset.
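As a rough illustration of the workflow above, the sketch below splits data, trains a scikit-learn logistic regression on two different feature sets, picks one, and scores a new batch. The data is synthetic and generated in the script itself; the feature names, the `fit_and_score` and `score_new_data` helpers, and the choice to standardize inputs are assumptions made for this example, not details of the project.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): glucose and BMI drive the label,
# while the third column is pure noise.
rng = np.random.default_rng(0)
n = 400
glucose = rng.normal(120, 30, n)
bmi = rng.normal(30, 6, n)
noise = rng.normal(0, 1, n)
logit = 0.05 * (glucose - 120) + 0.1 * (bmi - 30)
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)
X = np.column_stack([glucose, bmi, noise])

# Hold out a test set before any model fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)


def fit_and_score(cols):
    """Train on the given column indices and return (model, test accuracy)."""
    model = make_pipeline(StandardScaler(), LogisticRegression())
    model.fit(X_train[:, cols], y_train)
    return model, model.score(X_test[:, cols], y_test)


# Train once with all features, then retrain with a reduced feature set.
model_all, acc_all = fit_and_score([0, 1, 2])
model_sub, acc_sub = fit_and_score([0, 1])

# Pick the model for scoring; ties go to the simpler feature set.
if acc_sub >= acc_all:
    best_model, best_cols = model_sub, [0, 1]
else:
    best_model, best_cols = model_all, [0, 1, 2]


def score_new_data(new_rows):
    """Return the predicted probability of the positive class for each row."""
    rows = np.asarray(new_rows, dtype=float)
    return best_model.predict_proba(rows[:, best_cols])[:, 1]


scoring_batch = np.array([[90.0, 22.0, 0.0], [190.0, 38.0, 0.0]])
probabilities = score_new_data(scoring_batch)
print(f"accuracy (all features): {acc_all:.2f}, (subset): {acc_sub:.2f}")
print(probabilities)
```

Wrapping the scaler and the classifier in one pipeline means the scoring step applies exactly the preprocessing that was fitted on the training data, which is the main thing a scoring pipeline has to guarantee.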
Project: Train and Score with Feature Store
In this liveProject, you’ll train your model and build a scoring pipeline using an ML feature store. You’ll explore a sample dataset for diagnosing diabetes, generate new features and store them in a feature store, train and retrain ML models, and build a scoring process. You’ll employ common feature engineering techniques to train the model, then test and retrain it as needed. You’ll also set up a scoring pipeline and brainstorm ML development using a feature store. In this project, you will learn how to store the features for a machine learning model so they can be reused in other machine learning projects.
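To make the idea of reusable features concrete, here is a deliberately tiny in-memory sketch. It is not the API of any real feature store product: the `InMemoryFeatureStore` class, its `register`, `ingest`, and `get_features` methods, the raw field names, and the glucose cutoff are all hypothetical. It only shows the core pattern of defining a feature once and reading it back for both training and scoring.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]


class InMemoryFeatureStore:
    """Toy feature store: named feature definitions plus stored values per entity."""

    def __init__(self) -> None:
        self._definitions: Dict[str, Callable[[Row], float]] = {}
        self._values: Dict[str, Dict[str, float]] = {}

    def register(self, name: str, fn: Callable[[Row], float]) -> None:
        """Register a feature definition under a reusable name."""
        self._definitions[name] = fn

    def ingest(self, entity_id: str, raw_row: Row) -> None:
        """Compute every registered feature for one entity and store the results."""
        self._values[entity_id] = {
            name: fn(raw_row) for name, fn in self._definitions.items()
        }

    def get_features(self, entity_ids: List[str], names: List[str]) -> List[List[float]]:
        """Return stored feature values, one row per entity, in the requested order."""
        return [[self._values[e][n] for n in names] for e in entity_ids]


store = InMemoryFeatureStore()
store.register("bmi", lambda r: r["weight_kg"] / r["height_m"] ** 2)
# Illustrative cutoff only; not a clinical recommendation.
store.register("glucose_high", lambda r: float(r["glucose"] >= 126))

store.ingest("patient-1", {"weight_kg": 80.0, "height_m": 2.0, "glucose": 140.0})
store.ingest("patient-2", {"weight_kg": 90.0, "height_m": 1.5, "glucose": 100.0})

# The same stored features serve a training job and a separate scoring job.
training_matrix = store.get_features(["patient-1", "patient-2"], ["bmi", "glucose_high"])
scoring_vector = store.get_features(["patient-2"], ["bmi"])
print(training_matrix)
print(scoring_vector)
```

Because feature logic lives in the store rather than in each training script, a second project can request `bmi` by name without reimplementing or recomputing it.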