Canary deployment, Blue/Green deployment and A/B testing compared
Overview
Canary deployment, Blue/Green deployment and A/B testing are methods for controlling the risk of releasing a new model version into production. SageMaker Endpoints support all three.
Canary deployment
- AWS docs: Canary deployment
- Study Guide: Deploy and operationalize machine learning solutions
A Canary release is a very risk-averse deployment strategy. It involves directing a small proportion of the live traffic to the new production variant and checking that everything works as expected. The proportion of live traffic is then gradually increased until all of it is directed to the new production variant, at which point the previous version can be removed. If any issues are identified, live traffic can be switched back to the original production variant.
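The gradual traffic shift described above could be sketched with variant weights. This is a minimal illustration, not a definitive implementation: it assumes both variants are already hosted on the same endpoint, and the variant names and the 10/50/100 schedule are hypothetical. The payload builder is plain Python; only `apply_step` talks to AWS, via the boto3 `update_endpoint_weights_and_capacities` call.

```python
def canary_weight_steps(old_variant, new_variant, percentages):
    """Build one DesiredWeightsAndCapacities payload per canary stage.

    `percentages` is the share of live traffic (0-100) sent to the new
    variant at each stage, e.g. [10, 50, 100].
    """
    steps = []
    for pct in percentages:
        if not 0 <= pct <= 100:
            raise ValueError(f"percentage out of range: {pct}")
        steps.append([
            {"VariantName": old_variant, "DesiredWeight": float(100 - pct)},
            {"VariantName": new_variant, "DesiredWeight": float(pct)},
        ])
    return steps


def apply_step(endpoint_name, step):
    """Send one stage to SageMaker (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the payload builder works without AWS access

    client = boto3.client("sagemaker")
    client.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=step,
    )


# Hypothetical variant names; both must already exist on the endpoint.
stages = canary_weight_steps("current-model", "new-model", [10, 50, 100])
# Sending 0% to the new variant switches all traffic back to the original.
rollback = canary_weight_steps("current-model", "new-model", [0])[0]
```

Between stages, the monitoring metrics would be checked before calling `apply_step` with the next payload; if a problem shows up, applying `rollback` returns all traffic to the original variant.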
Blue/Green deployment
- AWS docs: Blue/Green Deployments
- Study Guide: Deploy and operationalize machine learning solutions
Blue/Green deployments have two phases. In the first phase, the new variant is deployed in an environment identical to that of the production variant. It is then fed synthetic data and its monitoring metrics are checked. In the second phase, live traffic is switched to the new variant and the metrics are compared with those produced by the previous production variant. If a problem is identified, all live traffic is switched back to the original production variant; otherwise the new variant becomes the production variant and the old one is removed.
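The switch-and-revert behaviour of the second phase could be sketched as a `DeploymentConfig` passed to the boto3 `update_endpoint` call. This is a sketch under assumptions: the alarm names, endpoint name and endpoint config name are hypothetical, the CloudWatch alarms are assumed to exist already, and the synthetic-data checks of the first phase are not covered here.

```python
def blue_green_deployment_config(alarm_names, termination_wait_seconds=600):
    """Build a DeploymentConfig that shifts all traffic at once.

    The old fleet is kept for `termination_wait_seconds` after the switch,
    so traffic can be moved back if one of the alarms fires.
    """
    if not alarm_names:
        raise ValueError("at least one CloudWatch alarm name is required")
    return {
        "BlueGreenUpdatePolicy": {
            "TrafficRoutingConfiguration": {
                "Type": "ALL_AT_ONCE",
                # No incremental steps, so no wait between steps.
                "WaitIntervalInSeconds": 0,
            },
            "TerminationWaitInSeconds": termination_wait_seconds,
        },
        "AutoRollbackConfiguration": {
            "Alarms": [{"AlarmName": name} for name in alarm_names],
        },
    }


def start_blue_green_update(endpoint_name, new_config_name, deployment_config):
    """Start the update (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the config builder works without AWS access

    client = boto3.client("sagemaker")
    client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=new_config_name,
        DeploymentConfig=deployment_config,
    )


# Hypothetical alarm names, for illustration only.
config = blue_green_deployment_config(["new-variant-5xx", "new-variant-latency"])
```

Keeping the old fleet alive for a while after the switch is what makes the rapid revert possible, and it is also why two environments are paid for during the deployment.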
A/B testing
- AWS docs: A/B Testing
- Study Guide: Deploy and operationalize machine learning solutions
In Machine Learning, A/B testing is used to compare the performance of a new model variant with the current one. In SageMaker, the proportion of traffic sent to each model variant is configured using the variant weight. Because weights are set per variant, more than two variants can be tested at the same time if desired.
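As a rough illustration of variant weights, the sketch below builds a `ProductionVariants` list for three variants and computes the traffic share each one would receive, which is its weight divided by the sum of all weights. The model names, variant naming scheme, instance type and 8/1/1 split are assumptions made for the example.

```python
def production_variants(weights_by_model, instance_type="ml.m5.large"):
    """Build a ProductionVariants list, one variant per model."""
    return [
        {
            "VariantName": f"variant-{model_name}",
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            "InitialVariantWeight": float(weight),
        }
        for model_name, weight in weights_by_model.items()
    ]


def traffic_shares(variants):
    """Return the fraction of traffic each variant receives."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    if total <= 0:
        raise ValueError("variant weights must sum to a positive number")
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


def create_ab_endpoint_config(config_name, variants):
    """Register the configuration (requires boto3 and AWS credentials)."""
    import boto3  # imported here so the helpers above work without AWS access

    client = boto3.client("sagemaker")
    client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=variants,
    )


# Hypothetical model names: the current model plus two challengers.
variants = production_variants({"model-a": 8, "model-b": 1, "model-c": 1})
shares = traffic_shares(variants)
```

With these weights the current model keeps most of the traffic while each challenger receives an equal, smaller share, so the comparison runs without exposing most users to an unproven model.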
Comparison
Description | Canary | Blue / Green | A / B |
---|---|---|---|
Strategy type | deployment | deployment | testing, experimentation |
Location | production | production (two environments) | production |
Cost | cheap (one environment) | expensive (two environments) | cheap (one environment) |
Risk | very low risk because only a small percentage of traffic reaches the new variant | low risk because the deployment can be rapidly reverted | small risk of breaking a production service |
System types | mission critical systems | mission critical systems | any system |
Implementation complexity | complex | simple | complex |
Credits
Photo by Kaikara Dharma on Unsplash
AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam
This study guide provides the domain-by-domain specific knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. The online resources that accompany this Study Guide include practice exams and assessments, electronic flashcards, and supplementary online resources.