SageMaker unsupervised algorithms

There are five SageMaker unsupervised algorithms that process tabular data. Unsupervised Learning algorithms process data that has not been labeled. IP Insights is an anomaly detection algorithm that detects problems and threats in an IP network. K-Means is a clustering algorithm. Object2Vec translates input data to vectors. The Principal Component Analysis (PCA) algorithm is used in Feature Engineering to reduce the number of features in data. Random Cut Forest (RCF) is a general-purpose anomaly detection algorithm.

Modeling (Domain 3)

K-Means Algorithm

The K-Means Algorithm is an Unsupervised Learning algorithm used to find clusters. Clusters are formed by grouping data points that are as similar as possible to each other and as different as possible from the points in other clusters. The distance from each data point to each cluster center is calculated, and the points assigned to a cluster are averaged to update its center. K-Means is used for market segmentation, computer vision,…
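The grouping described above can be sketched in a few lines of plain Python. This is a minimal illustration of the general K-Means idea (assign each point to its nearest center, then average), not the SageMaker implementation. The function names, the sample data and the choice to start from the first k points are all assumptions made to keep the example small and repeatable.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, iterations=20):
    """Return (centroids, labels) after a fixed number of iterations."""
    # Assumption: start from the first k points so the run is repeatable.
    # Real implementations normally use a randomized initialization.
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean_point(members)
    return centroids, labels


# Hypothetical 2-D data: two well-separated groups.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = k_means(data, k=2)
print(labels)
```

On this toy data the first three points end up sharing one label and the last three share the other, even though both starting centers come from the same group.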

Object2Vec Algorithm

The Object2Vec Algorithm is an Unsupervised Learning algorithm. It compares pairs of data points and preserves the semantics of the relationship between each pair. The algorithm creates low-dimensional, dense embeddings of high-dimensional objects, and these embeddings can be used by other algorithms downstream. Object2Vec can be used for product search, item matching and…
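To give a feel for how embeddings are used downstream, here is a small sketch of item matching by cosine similarity. It does not train anything: the three-dimensional vectors and the item names are made up for illustration, whereas a trained model such as Object2Vec would produce the real embeddings. Cosine similarity is one common way to compare embeddings, chosen here as an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; a trained model would supply these values.
embeddings = {
    "running shoes": (0.9, 0.1, 0.0),
    "trail sneakers": (0.8, 0.2, 0.1),
    "coffee maker": (0.0, 0.1, 0.9),
}


def most_similar(query, table):
    """Return the other item whose embedding is closest to the query's."""
    candidates = [name for name in table if name != query]
    return max(
        candidates,
        key=lambda name: cosine_similarity(table[query], table[name]),
    )


print(most_similar("running shoes", embeddings))
```

Because similar objects sit close together in the embedding space, a nearest-neighbour lookup like this is enough for a basic product search or item matching step.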

IP Insights Algorithm

The SageMaker IP Insights Algorithm is used for detecting anomalies in network traffic. It is an unsupervised learning algorithm that is trained on historical data to learn the patterns of normal network usage. In production it can detect anomalies in network usage that may indicate changes in user behaviour, network performance or malicious activity. The IP…
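The idea of learning normal usage from history and then flagging departures from it can be shown with a deliberately simple stand-in. The sketch below is a plain seen-before lookup, not the model IP Insights trains. It assumes the historical data is a list of (user, IP address) pairs; the user names, addresses and function names are hypothetical.

```python
from collections import defaultdict


def train(history):
    """Map each user to the set of IP addresses seen in historical events."""
    known = defaultdict(set)
    for user, ip in history:
        known[user].add(ip)
    return known


def is_anomalous(known, user, ip):
    """Flag an event when the user has never been seen from this address."""
    return ip not in known.get(user, set())


# Hypothetical historical log of (user, IP address) pairs.
history = [
    ("alice", "203.0.113.5"),
    ("alice", "203.0.113.6"),
    ("bob", "198.51.100.7"),
]
model = train(history)
print(is_anomalous(model, "alice", "203.0.113.5"))   # pairing seen before
print(is_anomalous(model, "alice", "198.51.100.7"))  # new address for alice
```

A lookup like this only answers yes or no and cannot generalise to addresses it has never seen, which is the gap a learned model is meant to fill.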

Summary

These SageMaker built-in algorithms all use Unsupervised Learning and so process unlabelled data. Both IP Insights and Random Cut Forest (RCF) are used for anomaly detection. Object2Vec translates data into vectors to be used by downstream processing. K-Means identifies clusters in data. PCA reduces the number of features in high-dimensional data as part of Feature Engineering.
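To make the feature-reduction point concrete, here is a minimal PCA sketch in plain Python that reduces two correlated features to one. It approximates the first principal component with power iteration on the covariance matrix, which is one simple way to do it and not necessarily what SageMaker does. The data and function names are invented for the example.

```python
import math


def first_principal_component(rows, iterations=100):
    """Approximate the top principal direction with power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Reduce each row to one feature: its coordinate along the direction."""
    return [
        sum((r[j] - means[j]) * direction[j] for j in range(len(r)))
        for r in rows
    ]


# Hypothetical data: the second feature is roughly twice the first.
rows = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.0)]
means, direction = first_principal_component(rows)
reduced = project(rows, means, direction)
print(direction)
print(reduced)
```

Each row is now described by a single number that keeps most of the variation in the original two features.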

Credits

AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam

Contains affiliate links. If you go to Amazon’s website and make a purchase I may receive a small payment. The purchase price to you will be unchanged. Thank you for your support.

This study guide provides the domain-by-domain specific knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. The online resources that accompany this Study Guide include practice exams and assessments, electronic flashcards, and supplementary online resources.
