Problem Framing for Machine Learning
Problem Framing is a method used to understand, define and prioritize business problems. It is one of the most important phases in Machine Learning because it determines whether the work that follows is seen as useful and delivers business value. Framing determines what will be observed and what will be predicted. It also identifies the metrics to be optimised, which are then used to monitor performance and errors.
This Study Guide covers sub-domain 3.1, Frame business problems as machine learning problems, of the Modeling domain of the AWS Machine Learning Specialty exam. All the knowledge domains in the exam are described in: AWS Machine Learning exam syllabus
Scroll to the bottom of the page for questions and answers.
Problem Framing is preceded by Business Goal Identification, which provides a clear idea of the problem to be solved and the business value to be derived. Problem Framing is not unique to ML. It has been adopted by ML so that business problems can be phrased in a way that suits Machine Learning solutions. Framing presents a problem in a way that wins support from stakeholders, whilst maximising the chances of an ML solution being successful.
In How To Frame A Problem To Find The Right Solution, Paloma Cantero-Gomez offers techniques to help with Problem Framing:
Forbes: How To Frame A Problem To Find The Right Solution
- The 40-20-10-5 rule to sequentially cut out words to reach a concise description.
- Rephrase and focus: this mostly means reframing the problem in a more strategic way.
- Challenge assumptions – what do we know to be true and what have we assumed to be true.
- Change the perspective of the problem by using people from outside of the business area to comment on the problem and so gain a new perspective.
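The first technique in the list above can be checked mechanically. The sketch below assumes that the numbers in the rule's name are word limits for four successive drafts of the problem statement. That reading, the helper name check_drafts and the example drafts are illustrative assumptions, not taken from the Forbes article.

```python
WORD_LIMITS = (40, 20, 10, 5)  # assumed reading of the rule's name


def check_drafts(drafts, limits=WORD_LIMITS):
    """Pair each draft with its word limit and report whether it fits."""
    if len(drafts) != len(limits):
        raise ValueError("expected one draft per word limit")
    results = []
    for draft, limit in zip(drafts, limits):
        count = len(draft.split())
        results.append((limit, count, count <= limit))
    return results


# Hypothetical drafts of one problem statement, each shorter than the last.
drafts = [
    "Customers cancel subscriptions without warning and we only find out "
    "after the renewal date has passed so the retention team has no chance "
    "to offer help or discounts before they leave",
    "Customers cancel without warning so the retention team cannot act "
    "before they leave",
    "We cannot see which customers will cancel",
    "Predict customer cancellation early",
]

results = check_drafts(drafts)
for limit, count, fits in results:
    print(f"limit {limit:>2}: {count:>2} words, fits={fits}")
```

The shortest draft is often close to the ML question itself, which is why the technique is useful before moving on to inputs and outputs.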
The Google course Introduction to Machine Learning Problem Framing provides a suggested approach to Problem Framing:
Formulate Your Problem as an ML Problem
- Articulate your problem.
- Start simple.
- Identify Your Data Sources.
- Design your data for the model.
- Determine where data comes from.
- Determine easily obtained inputs.
- Ability to Learn.
- Think About Potential Bias.
Video: Coursera: Lesson 3: Framing Design Problems:
Lesson 3: Framing Design Problems – Introduction to the Design Process
You may have to register with Coursera to view this video. Registering is free and you are not asked for payment information.
Problem framing best practice
- AWS White Paper in web page form: ML Problem Framing
- AWS White Paper, PDF version: wellarchitected-machine-learning-lens.pdf
The Machine Learning Lens AWS Well-Architected Framework white paper lists six points of best practice:
- Define success criteria
- Establish a performance metric
- Define inputs, outputs and metrics
- Is Machine Learning suitable?
- Data sourcing and annotation objectives
- Select a simple model
Define success criteria
Define criteria for a successful outcome of the project. These criteria come from the business goal defined earlier.
Establish a performance metric
Establish an observable and quantifiable performance metric for the project, for example:
- prediction latency
- minimizing inventory value
This will be used to assess if the training was successful and, when in production, allow the model to be monitored.
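As a minimal sketch of making one of these metrics observable, the Python below times calls to a prediction function and summarises the results. The function dummy_predict, the sample inputs and the 50 ms target are hypothetical placeholders, not values from the AWS white paper.

```python
import math
import statistics
import time


def measure_latency_ms(predict, samples, repeats=5):
    """Return one latency in milliseconds for every predict(sample) call."""
    latencies = []
    for _ in range(repeats):
        for sample in samples:
            start = time.perf_counter()
            predict(sample)
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarise(latencies_ms):
    """Summarise latencies as the median and the nearest-rank 95th percentile."""
    ordered = sorted(latencies_ms)
    p95_index = math.ceil(0.95 * len(ordered)) - 1
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
    }


def dummy_predict(features):
    """Hypothetical stand-in for a trained model's predict function."""
    return sum(features) > 1.0


TARGET_P95_MS = 50.0  # assumed target, would come from the success criteria

latencies = measure_latency_ms(dummy_predict, [[0.2, 0.9], [0.1, 0.3]])
report = summarise(latencies)
meets_target = report["p95_ms"] <= TARGET_P95_MS
print(report, "meets target:", meets_target)
```

The same summary could be computed during training evaluation and again in production, so that the two can be compared against one target.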
Inputs, outputs, and performance metrics
Formulate the ML question in terms of inputs, desired outputs, and the performance metric to be optimized.
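One way to keep this formulation explicit is to write it down as a small data structure. The sketch below is illustrative only: the ProblemFrame class, its field names and the churn example are assumptions made for this guide, and the target check assumes a metric where higher values are better.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProblemFrame:
    """Hypothetical container for a framed ML problem."""
    business_goal: str
    inputs: List[str]
    output: str
    ml_task: str
    metric: str
    target: float  # assumed: higher metric values are better

    def summary(self) -> str:
        """State the ML question in one sentence."""
        return (
            f"Given {', '.join(self.inputs)}, predict {self.output} "
            f"({self.ml_task}), optimising {self.metric} (target {self.target})."
        )

    def meets_target(self, observed: float) -> bool:
        """Compare an observed metric value with the agreed target."""
        return observed >= self.target


# Hypothetical example: a churn problem framed as binary classification.
frame = ProblemFrame(
    business_goal="Reduce customer churn",
    inputs=["tenure_months", "monthly_spend", "support_tickets"],
    output="whether the customer cancels within 30 days",
    ml_task="binary classification",
    metric="recall",
    target=0.8,
)
print(frame.summary())
```

Writing the frame down in this form makes it easy to spot a missing piece, such as an output with no agreed metric.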
Is Machine Learning suitable?
Evaluate whether ML is a feasible and appropriate approach. Machine Learning works best when there are large amounts of data that cannot be easily understood, or when valuable data points need to be identified within that data. The analysis performed so far may indicate that the objectives can be achieved without a Machine Learning process.
Create a data sourcing and annotation objective
Create a data sourcing and data annotation objective, and a strategy to achieve it. Identify where the data will come from and how clean it will be. This will allow you to estimate how much effort will be required to set up an ML pipeline. You may be able to identify time and cost savings by leaving data out. Issues that increase the processing effort are:
- Data cleansing
- Data conversion
- Complex data ingestion, or transport
- Personally identifiable information (PII)
- Data requiring enhanced security
- Massive data volumes
- The cost to label
PII and enhanced security are expensive because of regulatory requirements. As for labeling costs, some data may already be labeled, or may be easy to label. Manual labeling is the most expensive.
Select a simple model
Start with a simple model that is easy to interpret and implement. Complex models are harder and slower to train than simple models. With a simpler model, training iterations are faster, which makes debugging more manageable. A more complex model can be introduced later if it is justified.
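A minimal sketch of the idea, using only the Python standard library, is a majority-class baseline. This is even simpler than the simple model the white paper has in mind, and is swapped in here only to show the kind of floor a later model would have to beat. The labels are hypothetical toy data.

```python
from collections import Counter


def fit_majority_baseline(labels):
    """Return the most common training label; the baseline always predicts it."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical toy labels (1 = churned, 0 = stayed).
train_labels = [0, 0, 0, 1, 0, 1, 0, 0]
test_labels = [0, 1, 0, 0]

majority = fit_majority_baseline(train_labels)
baseline_predictions = [majority] * len(test_labels)
baseline_accuracy = accuracy(baseline_predictions, test_labels)
print("baseline predicts", majority, "accuracy", baseline_accuracy)
```

If a more complex model cannot clearly beat a baseline like this on the chosen performance metric, the extra training time and loss of interpretability are hard to justify.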
Problem framing is one of the most important phases of the Machine Learning process and one that you have to get right. Everything that comes after relies on the information that comes out of this phase. If you get Framing wrong you can end up doing a lot of work that nobody wants and does not deliver business value.
I recommend you study the Google course Introduction to Machine Learning Problem Framing. It is very good, free and only takes about an hour to complete.
Introduction to Machine Learning Problem Framing
Photo by Pineapple Supply Co. on Unsplash
AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam
This study guide provides the domain-by-domain knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. The online resources that accompany this Study Guide include practice exams and assessments, electronic flashcards, and supplementary online resources. It is available in both paper and Kindle versions for immediate access. (Visit Amazon books)
Questions and answers