Photograph of a senior lady reading a book to two young boys to symbolize Supervised Learning for Machine Learning

Supervised Learning for Machine Learning

What is Supervised Learning?

Supervised Learning requires labeled training data: data whose correct answer has already been identified and attached as a label. Once the Machine Learning model has been trained on these examples, it can be presented with real, previously unseen data and asked to select the correct label for it.

These revision notes are part of sub-domain 3.1, Frame business problems as machine learning problems, of the Modeling domain of the AWS Machine Learning Specialty exam. A description of all the knowledge domains in the exam is in these revision notes: AWS Machine Learning exam syllabus

Questions

To confirm your understanding scroll to the bottom of the page for questions and answers.

How is Supervised Learning used?

You could, for example, train an algorithm to recognise what healthy fruit looks like and what diseased fruit looks like. This could be achieved by using training data made up of pictures of healthy and diseased fruit that have been labelled manually by people. Those people first have to be trained to identify diseased fruit so that they apply the correct labels. To help with this task Amazon provides two services:

Once the model is trained it can be given images of fruit to determine if they are healthy or diseased.
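The train-then-predict cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two numeric features (for example the fraction of the fruit surface that is brown or spotted), the data values, and the function names `train` and `predict` are all invented here, and a simple nearest-centroid rule stands in for whatever algorithm a real image model would use.

```python
from statistics import mean

# Hypothetical labeled training data: each example is (features, label).
# The feature values are invented for illustration only.
TRAINING_DATA = [
    ((0.05, 0.02), "healthy"),
    ((0.10, 0.05), "healthy"),
    ((0.08, 0.01), "healthy"),
    ((0.60, 0.40), "diseased"),
    ((0.75, 0.55), "diseased"),
    ((0.55, 0.35), "diseased"),
]


def train(examples):
    """Learn one centroid (average feature vector) per label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in by_label.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the new example."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


model = train(TRAINING_DATA)
print(predict(model, (0.07, 0.03)))  # an unseen, mostly unblemished fruit
print(predict(model, (0.70, 0.50)))  # an unseen, heavily blemished fruit
```

The labels are only needed by `train`. Once the model exists, `predict` works on data that has never been labeled.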

The advantages and disadvantages of Supervised Learning?

Supervised learning allows you to predict future events from accumulated historical data, and performance can be measured and optimised against the known labels. Once trained, the models are often simple, so making predictions is usually not computationally intensive. Training, however, often requires significant resources because of the quantity of data involved. The selection of good quality training data is important for performance: poorly chosen or unrepresentative data can introduce bias, and there is a risk of overfitting. Despite these drawbacks it is a popular method of solving real-world problems.

The applications of Supervised Learning?

Here are some examples of the applications of Supervised Learning:

  • Image recognition
  • Predictive analytics. This helps businesses to forecast future events based on the accumulated historical data.
  • Customer sentiment analysis. This is the study of text communication from customers to determine their sentiment from the words and phrases they use. Amazon Comprehend is an AI service that also provides sentiment analysis.
  • Spam detection. Spam emails are detected and directed to a spam directory based on actions performed on similar emails in the past.

There are two main types of Supervised Learning:

  • Classification
  • Regression
Infographic showing Supervised learning, Classification and regression for the AWS Machine Learning Specialty exam

Video: Classification and Regression in Machine Learning

This is a brief introduction to Supervised Learning by Max Margenot from Quantopian. It is 2.48 minutes long.

Classification Machine Learning

What is Classification?

Classification is a type of Supervised Learning that results in data receiving a specific label to identify it as being a member of a class. There are two types of classification:

  1. Binary classification – two choices or labels
  2. Multiclass classification – more than two labels

In Binary Classification there are two choices or labels. An example of binary classification is a model that can detect the presence or absence of disease in a type of fruit. In Multiclass Classification there are more than two choices or labels. An example of a multiclass classification model is one that can identify the specific disease, or disease type.
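The difference between the two types can be shown with the same data labeled two ways. The following is a minimal sketch, not a production method: the feature values, the class names `disease_a` and `disease_b`, and the helper names are hypothetical, and a k-nearest-neighbours vote is used simply because it handles any number of labels without modification.

```python
from collections import Counter
from math import dist

# Hypothetical labeled examples with three classes (multiclass labels).
LABELED = [
    ((0.05, 0.02), "healthy"),
    ((0.08, 0.04), "healthy"),
    ((0.10, 0.03), "healthy"),
    ((0.70, 0.10), "disease_a"),
    ((0.80, 0.15), "disease_a"),
    ((0.75, 0.05), "disease_a"),
    ((0.15, 0.70), "disease_b"),
    ((0.10, 0.80), "disease_b"),
    ((0.20, 0.75), "disease_b"),
]


def knn_predict(examples, features, k=3):
    """Majority vote among the k labeled examples closest to `features`."""
    nearest = sorted(examples, key=lambda ex: dist(ex[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def to_binary(label):
    """Collapse the specific disease types into a two-label scheme."""
    return "healthy" if label == "healthy" else "diseased"


BINARY = [(features, to_binary(label)) for features, label in LABELED]

sample = (0.72, 0.12)
print(knn_predict(BINARY, sample))   # binary: presence or absence of disease
print(knn_predict(LABELED, sample))  # multiclass: which disease type
```

The algorithm is unchanged between the two calls. Only the set of labels in the training data decides whether the problem is binary or multiclass.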

Examples of classes used in Classification

These are some examples of classes that can be used with a Classification model. This group is for binary classification, where the data is classified into one of two classes:

  • identification of male or female responders to comments on an e-commerce website
  • classification of spam email and non-spam email
  • positive and negative sentiment in text messages or Twitter feeds

This group is for multiclass classification, where the data is classified into more than two classes:

  • classification of types of soil
  • classification of types of crops
  • classification of mood/feelings in songs/music

Applications of Classification Machine Learning

Some applications of Classification are:

  • speech recognition
  • handwriting recognition
  • biometric identification
  • document classification

What SageMaker algorithms use Classification?

Recommendation systems

Recommendation is a type of Supervised Learning built on top of classification. Recommendations are provided by many of the web based services we interact with, for example:

  • Google Play
  • Pinterest
  • YouTube
  • Netflix

Netflix analyzes your previous viewing history to predict films that you are likely to want to watch. This is a Supervised Machine Learning process. All the films have associated metadata, information that describes the film. This may include genre, age rating, starring actors and so on. These are multiclass classifications that are used by the ML algorithms to match, score and rank films to suggest to their customers. Amazon Personalize is an AI service that provides personalized recommendations.
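The match, score and rank steps can be illustrated with a toy example. This is a deliberately simplified sketch and not how Netflix or Amazon Personalize actually work: the catalogue, the metadata tags and the `recommend` function are all hypothetical, and a plain tag-overlap count stands in for a trained model.

```python
from collections import Counter

# Hypothetical catalogue: each film has a set of metadata tags.
CATALOGUE = {
    "Film A": {"genre:sci-fi", "rating:12", "actor:X"},
    "Film B": {"genre:sci-fi", "rating:15", "actor:Y"},
    "Film C": {"genre:comedy", "rating:12", "actor:X"},
    "Film D": {"genre:drama", "rating:18", "actor:Z"},
    "Film E": {"genre:sci-fi", "rating:12", "actor:Z"},
}


def recommend(history, catalogue, top_n=2):
    """Score unseen films by how well their tags match the viewing history."""
    # Match: count how often each tag appears in films already watched.
    profile = Counter(tag for title in history for tag in catalogue[title])
    # Score: sum the profile counts of each unseen film's tags.
    scores = {
        title: sum(profile[tag] for tag in tags)
        for title, tags in catalogue.items()
        if title not in history
    }
    # Rank: highest score first, title as a tie-breaker.
    return sorted(scores, key=lambda title: (-scores[title], title))[:top_n]


print(recommend(["Film A", "Film B"], CATALOGUE))
```

A viewer who has watched two science fiction films is offered the remaining science fiction film first, then the film that shares an age rating and an actor with the history.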

What SageMaker algorithms support recommendations?

Regression for Machine Learning

Regression is a supervised learning technique that finds the relationship between variables and enables us to predict a continuous output variable from one or more predictor variables.

A regression problem is one where the output variable is a real value such as dollars or weight. The model attempts to find a relationship between dependent and independent variables. The dependent variable is the output and the independent variables are the inputs, so the output depends on the inputs. These types of values are described as continuous variables because they can assume any value within a range rather than one of a fixed set of categories. When a negative prediction would make no sense, the values are sometimes transformed onto a scale that has no negative values. For example temperature in Fahrenheit, which can contain negative values, could be converted to Kelvin, which only contains non-negative values.
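The Fahrenheit to Kelvin conversion mentioned above is a short piece of arithmetic, sketched here in Python. The function name is an assumption made for this example; the formula itself is the standard conversion.

```python
def fahrenheit_to_kelvin(temp_f):
    """Convert a temperature from degrees Fahrenheit to Kelvin."""
    # First convert to Celsius, then shift by 273.15 to reach Kelvin.
    return (temp_f - 32.0) * 5.0 / 9.0 + 273.15


# Even sub-zero Fahrenheit readings map to positive Kelvin values.
for temp_f in (-40.0, 0.0, 32.0, 100.0):
    print(temp_f, "F =", round(fahrenheit_to_kelvin(temp_f), 2), "K")
```

Every physically possible temperature is zero or higher on the Kelvin scale, so a model trained on Kelvin values never sees a negative target.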

There are many types of regression, for example:

  • Linear
  • Polynomial
  • Multiple
  • Logistic (despite the name, this is normally used for classification)

Video: Making Friends with Machine Learning: Regression

This is a 19 minute video by Cassie Kozyrkov from Google. If you need a gentle high level introduction to Regression this is a good place to start.

Linear Regression

Linear regression is a common type of regression used in Machine Learning. The Linear Regression algorithm works by finding the straight line that is closest to the data points on average. Once this line is known, predictions can be made by reading off the output on the line for a supplied input. More complex relationships, or graph shapes, can be described by using other Regression techniques.
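As a rough illustration of finding that line, here is a minimal sketch that assumes "closest on average" means minimising the squared vertical distances between the line and the points (ordinary least squares). The function name `fit_line` and the sample data are invented for this example.

```python
def fit_line(xs, ys):
    """Return slope a and intercept b of the least-squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor,
    # which cancels in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Invented data that roughly follows a straight line.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

a, b = fit_line(xs, ys)
print("slope:", a, "intercept:", b)
print("prediction for x = 6:", a * 6 + b)
```

Once `a` and `b` are known, a prediction for any new input is a single multiplication and addition.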

What is Time Series Forecasting?

Time series analysis is built on top of Regression. Time series data uses time to add order and structure to the data. So the observations are dependent on the time. Time series analysis uses historical data to predict future events. For example:

  • Forecasting crop yields
  • Forecasting commodities such as gasoline prices
  • Stock Market forecasting
A strawberry spraying machine spraying strawberry plants in a poly tunnel table top system in the UK.

The graph below is an example of regression and time series data. In this case the time period is in years. The line was created by using the Trend Line feature of Microsoft Excel. The data came from a UK government website providing historical data on agriculture. The graph shows the increasing weight of pesticides used in strawberry cultivation over time.

An image of a graph showing pesticide use in strawberry cultivation. The x axis is time in years and the y axis is kg of pesticide per hectare. A line created from Regression shows the increase of pesticides over time.

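The equation of the line on the graph has this form:

y = ax + b

where x is the year, y is the predicted weight of pesticide, a is the slope of the line and b is the point where the line crosses the y axis.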

This simplicity begins to explain why Supervised Learning may require large resources in the training phase, to create the equation, followed by smaller resources in production. This also explains how a Machine Learning model can be installed on edge devices that have constrained resources.
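The contrast between training cost and production cost can be sketched as follows. This is an illustration only: the yearly figures are invented and are not the UK government data behind the graph, the function names are hypothetical, and gradient descent is used here simply as one example of an iterative training method.

```python
def train(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = a*x + b by gradient descent on the mean squared error."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Every epoch revisits the whole training set: this is the costly part.
        errors = [a * x + b - y for x, y in zip(xs, ys)]
        grad_a = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


def predict(a, b, x):
    """Inference needs only the two stored numbers and one multiply-add."""
    return a * x + b


# Invented yearly observations; x is years since the first observation.
years_since_start = [0, 1, 2, 3, 4, 5]
kg_per_hectare = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]

a, b = train(years_since_start, kg_per_hectare)
print("forecast for year 6:", predict(a, b, 6))
```

Training loops over the data thousands of times, while the deployed model is just the pair `(a, b)`, which is why it can fit on a device with very limited resources.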

What SageMaker algorithms support Regression?

Summary

Supervised Learning is Machine Learning with labeled data. The supervision is the labels and the assessment in training that measures inferences against the known outcome. There are two main types of Supervised Learning: Classification and Regression.

If you are reading these notes in order the previous revision notes were Problem Framing for Machine Learning. The next revision notes, when they are published, will be about Unsupervised Learning.

Credits


AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam

This study guide provides the domain-by-domain specific knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. The online resources that accompany this Study Guide include practice exams and assessments, electronic flashcards, and supplementary online resources. It is available in both paper and Kindle versions for immediate access. (Visit Amazon books)


Created on By Michael Stainsbury

3.1 Supervised Learning for Machine Learning (full)

This quiz is for sub-domain 3.1, Frame business problems as machine learning problems, of the Modeling domain.

1 / 5

2 / 5

In Supervised Learning, what types of Classification are there?

3 / 5

Does Supervised Learning require labeled or unlabeled data?

4 / 5

What AWS services can help with labelling?

5 / 5

In Supervised Learning what types of Classification () are there?
