
SageMaker unsupervised algorithms

There are five SageMaker unsupervised algorithms that process tabular data. Unsupervised Learning algorithms process data that has not been labeled. IP Insights is an anomaly detection algorithm that detects problems and threats in an IT network. K-Means is a clustering algorithm. Object2Vec translates input data into vectors. The Principal Component Analysis (PCA) algorithm is used in Feature Engineering to reduce the number of features in data. Random Cut Forest (RCF) is a general-purpose anomaly detection algorithm.

Modeling (Domain 3)

K-Means Algorithm

The K-Means Algorithm is an Unsupervised Learning algorithm used to find clusters. The clusters are formed by grouping data points that are as similar as possible to each other and as different as possible from the data points in other clusters. The distances between data points are calculated and averaged to form groups. K-Means is used for market segmentation, computer vision,…
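
The grouping-by-distance idea can be sketched in a few lines of plain Python. This is a minimal illustration of the classic Lloyd's iteration, not the SageMaker implementation; the `kmeans` function, its parameters and the toy customer data are all assumptions made up for this example.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points with Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the average of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return centroids, labels


# Hypothetical toy data: two well-separated groups of customers,
# described by (visits per month, average spend).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(labels)      # one cluster label per point
print(centroids)   # the averaged centre of each cluster
```

The first three points should share one label and the last three the other. Which group is numbered 0 or 1 depends on the random starting centroids.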


Principal Component Analysis Algorithm

Sometimes data has so many features that further processing or inference is hampered. When this occurs, the Principal Component Analysis (PCA) algorithm, an Unsupervised Learning algorithm, is used to reduce the number of features whilst retaining as much information as possible. This is Feature Engineering. PCA has two modes: Regular and…
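
To make the idea of reducing features while retaining information concrete, here is a minimal sketch in plain Python that collapses two correlated features into one. It finds the first principal component by power iteration on the covariance matrix, which is one textbook way to do it and not necessarily how SageMaker computes PCA. The function name and the sample data are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (direction, projected values) for the top principal component."""
    n, d = len(rows), len(rows[0])
    # Centre each feature on zero mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration converges to the eigenvector with the largest
    # eigenvalue, which is the direction of greatest variance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project every centred row onto that direction: d features become 1.
    projected = [sum(r[j] * v[j] for j in range(d)) for r in centred]
    return v, projected


# Hypothetical data: the second feature is roughly twice the first,
# so a single component captures almost all of the variance.
rows = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
direction, reduced = first_principal_component(rows)
print(direction)  # unit vector pointing along the dominant trend
print(reduced)    # one value per row instead of two
```

Each row is now described by a single number, its position along the dominant trend, and very little information is lost because the two original features were almost perfectly correlated.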


IP Insights Algorithm

The SageMaker IP Insights Algorithm is used for detecting anomalies in network traffic. It is an Unsupervised Learning algorithm that is trained on historical data to learn the patterns of normal network usage. In production it can detect anomalies in network usage that may indicate changes in user behaviour, network performance or malicious activity. The IP…

Summary

These SageMaker built-in algorithms all use Unsupervised Learning and so process unlabelled data. Both IP Insights and Random Cut Forest (RCF) are used for anomaly detection. Object2Vec translates data into vectors for use in downstream processing. K-Means identifies clusters in data. PCA reduces the number of features in high-dimensional data as part of Feature Engineering.

Credits

Whizlabs’ AWS Certified Machine Learning Specialty practice exams

Practice Exams with 271 questions, Video Lectures and Hands-on Labs from Whizlabs

Whizlabs’ AWS Certified Machine Learning Specialty practice tests are designed by experts to simulate the real exam scenario. The questions are based on the exam syllabus outlined in the official documentation. The practice tests help candidates gain confidence in their exam preparation and evaluate themselves against the exam content.

Practice test content

  • Free Practice test – 15 questions
  • Practice test 1 – 65 questions
  • Practice test 2 – 65 questions
  • Practice test 3 – 65 questions

Section test content

  • Core ML Concepts – 10 questions
  • Data Engineering – 11 questions
  • Exploratory Data Analysis – 13 questions
  • Modeling – 15 questions
  • Machine Learning Implementation and Operations – 12 questions
