Problem Framing for Machine Learning
Problem Framing is a method used to understand, define and prioritize business problems. It is one of the most important phases of a Machine Learning project because it determines whether all the subsequent work is perceived to be useful and provides business value. Framing determines what will be observed and what will be predicted. It also identifies the metrics that need to be optimised to monitor performance and errors.
This Study Guide is for sub-domain 3.1, Frame business problems as machine learning problems, of the Modeling domain of the AWS Machine Learning Specialty exam. A description of all the knowledge domains in the exam is in: AWS Machine Learning exam syllabus
Questions
Scroll to the bottom of the page for questions and answers.
Overview
Problem Framing is preceded by Business Goal Identification, which provides a clear idea of the problem to be solved and the business value to be derived. Problem Framing is not unique to ML. It has been adopted by ML so that business problems can be phrased in a way that suits Machine Learning solutions. Framing presents a problem in a way that wins support from stakeholders, whilst maximising the chances of an ML solution being successful.
In How To Frame A Problem To Find The Right Solution, Paloma Cantero-Gomez offers techniques to help in Problem Framing:
Forbes: How To Frame A Problem To Find The Right Solution
- The 40-20-10-5 rule to sequentially cut out words to reach a concise description.
- Rephrase and focus: restate the problem in a more strategic way.
- Challenge assumptions: what do we know to be true, and what have we only assumed to be true?
- Change the perspective of the problem by using people from outside of the business area to comment on the problem and so gain a new perspective.
The Google course Introduction to Machine Learning Problem Framing provides a suggested approach to Problem Framing:
Formulate Your Problem as an ML Problem
- Articulate your problem.
- Start simple.
- Identify Your Data Sources.
- Design your data for the model.
- Determine where data comes from.
- Determine easily obtained inputs.
- Ability to Learn.
- Think About Potential Bias.
Video: Coursera: Lesson 3: Framing Design Problems:
Lesson 3: Framing Design Problems – Introduction to the Design Process
You may have to register with Coursera to view this video. Registering is free and you are not asked for payment information.
Problem framing best practice

- AWS White Paper in web page form: ML Problem Framing
- AWS White Paper, PDF version: wellarchitected-machine-learning-lens.pdf
The Machine Learning Lens AWS Well-Architected Framework white paper lists six points of best practice:
- Define success criteria
- Establish a performance metric
- Define inputs, outputs and metrics
- Is Machine Learning suitable?
- Data sourcing and annotation objectives
- Select a simple model
Define success criteria
Define criteria for a successful outcome of the project. This comes from defining the Business goal.
Establish a performance metric
Establish an observable and quantifiable performance metric for the project, for example:
- accuracy
- prediction latency
- minimizing inventory value
This will be used to assess if the training was successful and, when in production, allow the model to be monitored.
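As a rough illustration of what "observable and quantifiable" means in practice, the sketch below computes two of the example metrics, accuracy and prediction latency, in plain Python. The `toy_predict` function and the sample data are hypothetical stand-ins, not part of the AWS white paper.

```python
import time


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def mean_latency_seconds(predict, inputs):
    """Average wall-clock time of one call to predict, in seconds."""
    if not inputs:
        raise ValueError("inputs must not be empty")
    start = time.perf_counter()
    for x in inputs:
        predict(x)
    return (time.perf_counter() - start) / len(inputs)


# Hypothetical stand-in for a trained model: flags values above a threshold.
def toy_predict(x):
    return 1 if x > 0.5 else 0


inputs = [0.1, 0.7, 0.4, 0.9]   # made-up observations
labels = [0, 1, 1, 1]           # made-up ground truth
predictions = [toy_predict(x) for x in inputs]

acc = accuracy(labels, predictions)
latency = mean_latency_seconds(toy_predict, inputs)
print(f"accuracy={acc:.2f}, mean latency={latency:.6f}s")
```

The same two functions could be run after training and again on a sample of production traffic, which is the sense in which one metric serves both purposes.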
Inputs, outputs, and performance metrics
Formulate the ML question in terms of inputs, desired outputs, and the performance metric to be optimized.
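One way to keep this formulation honest is to write it down as a small record before any modeling starts. The sketch below is only an illustration: the `ProblemFrame` class, its field names and the churn and fraud examples are all hypothetical and do not come from the AWS white paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProblemFrame:
    """Hypothetical record of a framed ML problem."""
    inputs: tuple                  # what will be observed
    output: str                    # what will be predicted
    metric: str                    # the performance metric to optimise
    higher_is_better: bool = True  # direction in which the metric improves

    def is_complete(self):
        """True when inputs, output and metric have all been stated."""
        return (bool(self.inputs)
                and bool(self.output.strip())
                and bool(self.metric.strip()))

    def better(self, candidate, current):
        """True when the candidate metric value improves on the current one."""
        if self.higher_is_better:
            return candidate > current
        return candidate < current


# Made-up example: a customer churn question framed as classification.
churn_frame = ProblemFrame(
    inputs=("months_as_customer", "support_tickets_last_90_days"),
    output="customer cancels within 30 days (yes/no)",
    metric="accuracy",
)

# Made-up example where a lower metric value is the goal.
latency_frame = ProblemFrame(
    inputs=("request_payload",),
    output="fraud score",
    metric="prediction latency (ms)",
    higher_is_better=False,
)

print(churn_frame.is_complete())
print(latency_frame.better(20, 35))
```

Recording the direction of the metric alongside its name avoids a common confusion when metrics such as latency or error, where lower is better, sit next to accuracy.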
Is Machine Learning suitable?
Evaluate whether ML is a feasible and appropriate approach. Machine Learning works best when there are large amounts of data that cannot easily be understood by other means, or when valuable data points need to be identified within that data. The analysis performed so far may indicate that the objectives can be achieved without a Machine Learning process.
Create a data sourcing and annotation objective
Create a data sourcing and data annotation objective, and a strategy to achieve it. Identify where the data will come from and how clean it will be. This will allow you to estimate how much effort will be required to set up an ML pipeline. You may be able to identify time and cost savings by leaving data out. Issues that increase the effort needed to process data are:
- Data cleansing
- Data conversion
- Complex data ingestion, or transport
- Personally identifiable information (PII)
- Data requiring enhanced security
- Massive data volumes
- The cost to label
PII and enhanced security are expensive because of regulatory requirements. For labeling costs, some data may already be labeled, or may be easy to label. The most expensive labeling is manual labeling.
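The list of issues above can be used as a simple checklist when comparing candidate data sources. The sketch below counts how many of the issues apply to each source and ranks the sources from least to most flagged. The source names and their flags are invented for illustration, and giving every issue equal weight is an assumption: in a real project PII or manual labeling may dominate the cost.

```python
# The effort-increasing issues from the list above, as checklist keys.
EFFORT_ISSUES = frozenset({
    "data_cleansing",
    "data_conversion",
    "complex_ingestion",
    "pii",
    "enhanced_security",
    "massive_volume",
    "labeling_cost",
})


def effort_flags(source_issues):
    """Count how many known effort issues apply to one data source."""
    issues = set(source_issues)
    unknown = issues - EFFORT_ISSUES
    if unknown:
        raise ValueError(f"unknown issues: {sorted(unknown)}")
    return len(issues)


# Hypothetical candidate data sources and the issues flagged for each.
sources = {
    "orders_table": {"data_cleansing"},
    "support_emails": {"pii", "data_cleansing", "labeling_cost"},
    "clickstream": {"massive_volume", "complex_ingestion"},
}

# Sources with the fewest flags first: candidates to start with.
ranked = sorted(sources, key=lambda name: effort_flags(sources[name]))
for name in ranked:
    print(name, effort_flags(sources[name]))
```

A source that ranks last is a candidate for being left out of the first iteration, which is where the time and cost savings mentioned above would come from.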
Select a simple model
Start with a simple model that is easy to interpret and implement. More complex models are harder and slower to train than simple models. With a simpler model, training iterations are faster, which makes debugging more manageable. A more complex model can be introduced later if it is justified.
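To decide whether a more complex model is justified, it helps to have a reference point to compare against. The sketch below uses a majority-class baseline, which is even simpler than the kind of model the white paper describes and is my own illustration rather than an AWS recommendation. The class name, labels and rows are hypothetical.

```python
from collections import Counter


class MajorityClassBaseline:
    """Always predicts the most common label seen during fit."""

    def fit(self, labels):
        if not labels:
            raise ValueError("labels must not be empty")
        # most_common(1) returns [(label, count)] for the top label.
        self.prediction_ = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, rows):
        # The input features are ignored: that is the point of a baseline.
        return [self.prediction_ for _ in rows]


# Made-up training labels and test data.
train_labels = ["stay", "stay", "churn", "stay"]
test_rows = [{"tenure": 3}, {"tenure": 40}, {"tenure": 12}]
test_labels = ["stay", "churn", "stay"]

baseline = MajorityClassBaseline().fit(train_labels)
preds = baseline.predict(test_rows)
baseline_accuracy = sum(p == t for p, t in zip(preds, test_labels)) / len(test_labels)
print(f"baseline accuracy: {baseline_accuracy:.2f}")
```

Any later model, simple or complex, has to beat this number on the chosen performance metric before its extra training time and complexity can be considered justified.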
Summary
Problem framing is one of the most important phases of the Machine Learning process and one that you have to get right. Everything that comes after relies on the information that comes out of this phase. If you get Framing wrong you can end up doing a lot of work that nobody wants and does not deliver business value.
I recommend you study the Google course Introduction to Machine Learning Problem Framing. It is very good, free and only takes about an hour to complete.
Introduction to Machine Learning Problem Framing
Credits
Photo by Pineapple Supply Co. on Unsplash
Contains affiliate links. If you go to the Whizlabs website and make a purchase I may receive a small payment. The purchase price to you will be unchanged. Thank you for your support.
Whizlabs AWS Certified Machine Learning Specialty
Practice Exams with 271 questions, Video Lectures and Hands-on Labs from Whizlabs
The Whizlabs AWS Certified Machine Learning Specialty practice tests are designed by experts to simulate the real exam scenario. The questions are based on the exam syllabus outlined in the official documentation. These practice tests help candidates gain confidence in their exam preparation and evaluate themselves against the exam content.
Practice test content
- Free Practice test – 15 questions
- Practice test 1 – 65 questions
- Practice test 2 – 65 questions
- Practice test 3 – 65 questions
Questions and answers
Whizlabs AWS Certified Machine Learning Specialty course
- In the Whizlabs AWS Machine Learning certification course, you will learn and master how to build, train, tune, and deploy Machine Learning (ML) models on the AWS platform.
- The Whizlabs AWS Certified Machine Learning Specialty practice tests offer you a total of 200+ unique questions to get a complete idea about the real AWS Machine Learning exam.
- Also, you get access to hands-on labs in this course. There are about 10 lab sessions that are designed to take your practical skills on AWS Machine Learning to the next level.

Course content
The course has 3 resources which can be purchased separately, or together:
- 9 Practice tests with 271 questions
- Video course with 65 videos
- 9 hands on labs