Exploratory Data Analysis
Domain 2 is Exploratory Data Analysis. In this domain the data is analysed so it can be understood and cleaned up. It comprises 24% of the exam marks and has three subdomains:
- 2.1 Sanitize and prepare data for modeling
- 2.2 Perform feature engineering
- 2.3 Analyze and visualize data for machine learning
Analysing and visualising the data (subdomain 2.3) overlaps with the other two subdomains, which both use its techniques: graphs, charts and matrices. Before data can be sanitized and prepared (subdomain 2.1) it has to be understood. This is done using statistics that focus on specific aspects of the data, and graphs and charts that allow relationships and distributions to be seen. The data can then be cleaned using techniques to remove distortions and fill in gaps. Feature Engineering (subdomain 2.2) is about creating new features from existing ones to make the ML algorithms more powerful. It also covers techniques to reduce the number of features and categorise the data.
When the data is understood and has been cleaned it is ready for the next stage, modeling.
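The understand, clean and engineer sequence described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a method prescribed by the exam guide. The toy dataset and the helper names `summarise`, `impute_median` and `one_hot` are hypothetical and exist only for this example. Median imputation and one-hot encoding are just one choice each for "filling in gaps" and "categorising the data".

```python
import statistics

# Hypothetical toy dataset: None marks a missing value.
rows = [
    {"age": 25, "income": 30000.0, "colour": "red"},
    {"age": 32, "income": None, "colour": "blue"},
    {"age": 47, "income": 52000.0, "colour": "red"},
    {"age": 51, "income": 61000.0, "colour": "green"},
]


def summarise(values):
    """Understand: basic descriptive statistics, ignoring missing values."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "stdev": statistics.stdev(present),
    }


def impute_median(values):
    """Clean: fill gaps with the median of the observed values."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    return [median if v is None else v for v in values]


def one_hot(values):
    """Engineer: turn one categorical column into 0/1 indicator features."""
    categories = sorted(set(values))
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]


incomes = [r["income"] for r in rows]
income_stats = summarise(incomes)          # step 1: understand
clean_incomes = impute_median(incomes)     # step 2: clean
colour_features = one_hot([r["colour"] for r in rows])  # step 3: engineer

print(income_stats)
print(clean_incomes)
print(colour_features[0])
```

In practice a library such as pandas would normally be used for these steps. The standard-library version is used here only to keep the sketch self-contained.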
- For a description of the exam structure see this article: AWS Machine Learning exam syllabus.
- The AWS exam guide pdf can be downloaded from: https://d1.awsstatic.com/training-and-certification/docs-ml/AWS-Certified-Machine-Learning-Specialty_Exam-Guide.pdf
AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam
This study guide provides the domain-by-domain specific knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. The online resources that accompany this Study Guide include practice exams and assessments, electronic flashcards, and supplementary online resources. It is available in both paper and Kindle versions for immediate access. (Visit Amazon books)
Sample Exploratory Data Analysis questions
This test has five questions taken at random from the 15 questions in the three subdomain tests.
Study guides for exploratory data analysis
Feature Engineering is the process of creating new features from the original ones to increase the predictive power of the chosen algorithm. This article explains the concepts of Feature Engineering and the techniques to use for Machine Learning.
This is a review of the official Amazon study guide that accompanies the exam. The study guide provides the domain-by-domain specific knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. The online resources that accompany this Study Guide include practice exams and assessments, electronic…
Contains affiliate links. If you go to Whizlabs’ website and make a purchase I may receive a small payment. The purchase price to you will be unchanged. Thank you for your support. The AWS Certified Machine Learning Specialty learning path from Pluralsight has six high-quality video courses taught by expert instructors. Two are introductory…
Understanding data, cleansing data and dataset generation are important first steps in exploratory data analysis. Every other phase in the Machine Learning process relies on the data being cleaned and prepared. This Study Guide starts with statistical techniques used to help understand the data. Once data is understood it has to be cleaned up so…
Need more practice with the exams? Check out Whizlabs’ free test with 15 questions. They also have three practice tests (65 questions each) and five section tests (10-15 questions each). Money-off promo codes are below. For the AWS Certified Machine Learning Specialty, Whizlabs provides practice tests, a video course and hands-on labs. These…
These Revision Notes describe graphs used for data visualization in Machine Learning.