A photograph of a pasta machine making spaghetti symbolizing how SageMaker unsupervised learning algorithms process tabular data

SageMaker supervised algorithms

There are five SageMaker supervised algorithms for tabular data. DeepAR Forecasting uses deep learning to forecast time series, for example in financial forecasting. Linear Learner is good for regression problems and can also be used for classification. Factorization Machines can be used for the same purposes, but handle sparse data, that is data with gaps and holes, better. K-Nearest Neighbors is good at categorising data. XGBoost can predict whether an item belongs to a category.

These revision notes are part of subdomain 3.2 Select the appropriate model(s) for a given machine learning problem of the exam syllabus.

A photograph of boys playing Rugby and being lifted up in the air to symbolize the SageMaker XGBoost algorithm
Modeling (Domain 3)

XGBoost Algorithm

XGBoost stands for eXtreme Gradient Boosting. XGBoost uses boosting, which is a form of ensemble learning. The results of multiple models are combined to produce a better fit to the training data. Each decision tree model is added using the prediction errors of the previous models to improve the fit to the training data. XGBoost…
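The idea of adding each tree using the errors of the previous ones can be sketched in a few lines of plain Python. This is a minimal illustration of gradient boosting with squared-error loss on a single feature, not XGBoost itself, which adds regularisation and other refinements. The function names `fit_stump`, `boost` and `predict`, and the default parameter values, are assumptions made for this example.

```python
def fit_stump(xs, residuals):
    """Fit a one-split decision tree (a stump) to the residuals.

    Assumes xs is a 1-D feature with at least two distinct values.
    Returns (threshold, left_mean, right_mean).
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Build an ensemble where each stump is fitted to the current errors."""
    base = sum(ys) / len(ys)          # start from the mean of the labels
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # Prediction errors of the ensemble built so far.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mean_squared_error(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
```

With more rounds the training error falls, because every new stump corrects part of what the earlier ones got wrong.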

A photograph of a washing line with pegs to symbolize the SageMaker Linear Learner algorithm

Linear Learner Algorithm

The Linear Learner Algorithm is a Supervised Learning algorithm that can be used to solve three types of problems: binary classification, multi-class classification, and regression. The algorithm is trained on examples, each comprising a high-dimensional vector x and a label y, to learn a linear function that maps x to y. The Linear Learner Algorithm uses Stochastic…
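As a rough illustration of learning a linear function from pairs of a vector x and a label y, here is a sketch of regression trained with plain stochastic gradient descent. It assumes squared-error loss and a fixed learning rate, and it is not a description of what SageMaker does internally. The function name `sgd_linear_regression` and its parameters are hypothetical.

```python
import random


def sgd_linear_regression(samples, epochs=200, learning_rate=0.01, seed=0):
    """Learn weights w and bias b so that w.x + b approximates y.

    samples is a list of (x, y) pairs, where x is a list of floats.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    data = list(samples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a new random order
        for x, y in data:
            prediction = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = prediction - y
            # Gradient step for squared-error loss on this one example.
            weights = [w - learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias
```

Each update uses a single example, which is what makes the descent stochastic. For classification the same linear function would be passed through a threshold or a logistic function.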

A photograph of hands on a table to symbolize the SageMaker K-Nearest Neighbor algorithm

K-Nearest Neighbors Algorithm

The K-Nearest Neighbors Algorithm is used to place data into a category, for example in recommendation applications that recommend products on Amazon, articles on Medium, movies on Netflix, or videos on YouTube. It returns results based on the training data points nearest to the sample data point, also called the nearest neighbors. The K-Nearest Neighbors algorithm…
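A minimal sketch of classification by nearest neighbors, assuming Euclidean distance and a simple majority vote among the k closest training points. The function name `knn_classify` and the data layout are assumptions for this example, and the SageMaker implementation may differ.

```python
import math
from collections import Counter


def knn_classify(training, query, k=3):
    """Return the most common label among the k training points nearest to query.

    training is a list of (point, label) pairs; point and query are
    sequences of numbers with the same length.
    """
    neighbours = sorted(training,
                        key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

For example, with three training points labelled "a" clustered near (1, 1) and three labelled "b" near (5, 5), a query at (1.1, 1.0) is classified as "a" because its three nearest neighbors all carry that label.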

A photograph of cheese with holes to symbolize data with gaps and holes that can be processed by the SageMaker Factorization Machines algorithm

Factorization Machines Algorithm

The Factorization Machines Algorithm has two modes: Classification and Regression. Classification is binary and returns a predicted label, which is a number that is either one or zero. The Regression mode returns the predicted value. Factorization Machines are a good choice for high-dimensional, sparse datasets. Common uses are web page click prediction and item…
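To show why sparse data suits this model, here is a sketch of the standard second-order factorization machine prediction in regression mode: a bias, a linear term, and pairwise feature interactions modelled through low-dimensional factor vectors. Only the non-zero features are visited. The function name `fm_predict` and the sparse dictionary input are assumptions for this example, and the parameters would normally come from training, which is not shown.

```python
def fm_predict(x, w0, w, v):
    """Second-order factorization machine prediction (regression mode).

    x  : dict mapping feature index -> value, holding only non-zero features
    w0 : global bias
    w  : list of linear weights, one per feature
    v  : list of factor vectors, one k-dimensional list per feature
    """
    linear = sum(w[i] * xi for i, xi in x.items())
    pairwise = 0.0
    for f in range(len(v[0])):
        # Identity: sum over i<j of v_if * v_jf * x_i * x_j
        #           = 0.5 * ((sum_i v_if * x_i)^2 - sum_i (v_if * x_i)^2)
        total = sum(v[i][f] * xi for i, xi in x.items())
        total_of_squares = sum((v[i][f] * xi) ** 2 for i, xi in x.items())
        pairwise += 0.5 * (total * total - total_of_squares)
    return w0 + linear + pairwise
```

Because the sums run over the non-zero entries only, the cost depends on how many features are present in a record rather than on the full dimensionality of the dataset.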

Summary

SageMaker has five built-in algorithms for tabular data that use Supervised Learning. The use cases overlap, but each algorithm has its own features that may or may not make it an appropriate choice for a given problem.
