Exploratory Data Analysis
Domain 2 of the exam is Exploratory Data Analysis, in which data is analysed so that it can be understood and cleaned up. It accounts for 24% of the exam marks and has three subdomains:
- 2.1 Sanitize and prepare data for modeling
- 2.2 Perform feature engineering
- 2.3 Analyze and visualize data for machine learning
Analysing and visualising data (subdomain 2.3) overlaps with the other two subdomains, both of which use its techniques: graphs, charts and matrices. Before data can be sanitized and prepared (subdomain 2.1) it has to be understood. This is done with statistics that summarise specific aspects of the data, and with graphs and charts that reveal relationships and distributions. The data can then be cleaned using techniques that remove distortions and fill in gaps. Feature Engineering (subdomain 2.2) is about creating new features from existing ones to make the ML algorithms more effective. It also covers techniques that reduce the number of features and categorise the data.
When the data is understood and has been cleaned it is ready for the next stage, modeling.
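The understand-then-clean sequence described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ages` sample, the 0 to 120 valid range, and the choice of median imputation are all hypothetical and not taken from the exam guide.

```python
import statistics

# Hypothetical sample: ages with two gaps (None) and one implausible value (420).
ages = [34, 29, None, 41, 38, 420, 31, None, 36]


def summarise(values):
    """Return basic statistics for the non-missing values."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
    }


def clean(values, lower, upper):
    """Treat out-of-range values as missing, then fill every gap with the median."""
    in_range = [
        v if v is not None and lower <= v <= upper else None for v in values
    ]
    median = statistics.median([v for v in in_range if v is not None])
    return [median if v is None else v for v in in_range]


before = summarise(ages)                   # understand: the mean is distorted by 420
cleaned = clean(ages, lower=0, upper=120)  # clean: remove the distortion, fill gaps
after = summarise(cleaned)

print(before)
print(after)
```

Comparing `before` and `after` shows why the statistics come first: the gap between the mean and the median in `before` is what flags the distortion that the cleaning step then removes.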
- For a description of the exam structure, see this article: AWS Machine Learning exam syllabus.
- The AWS exam guide pdf can be downloaded from: https://d1.awsstatic.com/training-and-certification/docs-ml/AWS-Certified-Machine-Learning-Specialty_Exam-Guide.pdf
Contains affiliate links. If you go to the Whizlabs website and make a purchase I may receive a small payment. The purchase price to you will be unchanged. Thank you for your support.
Whizlabs AWS Certified Machine Learning Specialty
Practice Exams with 271 questions, Video Lectures and Hands-on Labs from Whizlabs
Whizlabs' AWS Certified Machine Learning Specialty practice tests are designed by experts to simulate the real exam. The questions are based on the exam syllabus set out in the official documentation. The practice tests help candidates gain confidence in their exam preparation and evaluate themselves against the exam content.
Practice test content
- Free Practice test – 15 questions
- Practice test 1 – 65 questions
- Practice test 2 – 65 questions
- Practice test 3 – 65 questions
Sample Exploratory Data Analysis questions
This test consists of five questions drawn at random from the tests for the three subdomains.
Study guides for exploratory data analysis

Data cleansing and preparation for modeling
Understanding data, cleansing data and dataset generation are important first steps in exploratory data analysis. Every other phase in the Machine Learning process relies on the data being cleaned and prepared. This Study Guide starts with statistical techniques used to help understand the data. Once data is understood it has to be cleaned up so…

Feature Engineering for Machine Learning
Feature Engineering is the process of creating new features from the original ones to increase the predictive power of the chosen algorithm. This article explains the concepts of Feature Engineering and the techniques to use for Machine Learning.
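As a rough illustration of creating new features from the original ones, the sketch below derives a ratio feature, bins a numeric value into bands, and one-hot encodes a category. The `houses` records, the field names and the band thresholds are hypothetical, chosen only to keep the example small.

```python
# Hypothetical records; field names and values are illustrative only.
houses = [
    {"price": 300000, "area_m2": 100, "type": "flat"},
    {"price": 450000, "area_m2": 180, "type": "house"},
    {"price": 210000, "area_m2": 60, "type": "flat"},
]


def size_band(area_m2):
    """Bin a numeric value into a small set of categories (assumed thresholds)."""
    if area_m2 < 80:
        return "small"
    if area_m2 < 150:
        return "medium"
    return "large"


def engineer(rows):
    """Return new rows with derived, binned and one-hot encoded features."""
    categories = sorted({row["type"] for row in rows})
    engineered = []
    for row in rows:
        new = {"price": row["price"], "area_m2": row["area_m2"]}
        # New feature built from two existing ones.
        new["price_per_m2"] = row["price"] / row["area_m2"]
        # Binning: replace a continuous value with a category.
        new["size_band"] = size_band(row["area_m2"])
        # One-hot encoding: one 0/1 column per category value.
        for category in categories:
            new[f"type_{category}"] = 1 if row["type"] == category else 0
        engineered.append(new)
    return engineered


features = engineer(houses)
for row in features:
    print(row)
```

The original `type` text column is dropped in favour of the 0/1 columns because many algorithms expect numeric input.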

Data visualization for Machine Learning
These Revision Notes describe the graphs used for data visualization in Machine Learning.

Whizlabs review – AWS Certified Machine Learning Specialty
Need more practice with the exams? Check out Whizlabs' free test with 15 questions. They also have three practice tests (65 questions each) and five section tests (10-15 questions each). Money-off promo codes are below. For the AWS Certified Machine Learning Specialty, Whizlabs provides practice tests, a video course and hands-on labs. These…
Credits: Photo by Jamie Street on Unsplash