AWS security for machine learning
Security is a vast subject, and AWS even has its own Specialty-level certification exam on it. Using the AWS course Exam Readiness: AWS Certified Machine Learning – Specialty as a guide, these revision notes give an overview of the main AWS security service, Identity and Access Management (IAM), and then highlight security features and requirements specific to SageMaker.
There are four curated videos in the revision notes. The last video is An Overview of Amazon SageMaker Security which is an AWS video by Tom Faulhaber. This is a great video that covers most of the content for this subdomain.
Curated videos
- Video – Secure your workloads with NAT Gateway
- IAM federation
- Video – What is an Interface VPC Endpoint and how can I create Interface Endpoint for my VPC?
- Video – An Overview of Amazon SageMaker Security (Level 100)
Security protects your Machine Learning processes and data. AWS provides security services and tools under the Shared Responsibility Model: AWS is responsible for security of the cloud, and you are responsible for security in the cloud.
- AWS docs: https://aws.amazon.com/compliance/shared-responsibility-model/
- AWS workshop: https://sagemaker-workshop.com/security_for_sysops.html
This Study Guide covers the content for sub-domain 4.3, Apply basic AWS security practices to machine learning solutions, of the Machine Learning Implementation and Operations knowledge domain. For more information about the exam structure see: AWS Machine Learning exam syllabus
IAM
What is IAM?
Identity and Access Management (IAM) is the AWS security service that protects our data, processes and communication in the cloud. IAM is part of the locked door through which we enter to work and by which bad people are kept out. IAM allows other people and systems to trust us with their data. IAM is also the evergreen task that is always at the top of our to-do list, no matter what we do in AWS. So whatever task you need to do, or service you wish to use, some knowledge of IAM is important.
What are IAM Users, Groups, Roles and Policies?
IAM has Users, Groups, Roles and Policies:
- Users – a User is … well … you and me, real people. We have names and email addresses and we can be issued with user names and passwords.
- Groups – Users can be grouped together with others that will be doing similar tasks in AWS.
- Roles – allow you to give access privileges to AWS services, that is, not people.
- Policies – contain permissions that allow you to do things or be prevented from doing things.
So it works like this:
- Develop a Policy
- Add the Policy to a Group. The Group will contain people with similar tasks, for example DevOps, Data Analysts or Data Scientists.
- Add a User to the Group. The User now has all the privileges that are in the Policy you wrote because it is attached to the Group.
This is great for managing large numbers of users. If people move between jobs, you can detach them from one Group and add them to another one. If you need to add a new permission, for example to access a service that your organisation has agreed to use and fund, the Policy can be updated. This will immediately allow any User in a Group that has the Policy attached to use the new service.
All this creating, attaching and detaching can be done through the AWS console, CloudFormation or AWS CLI.
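The three steps above can also be scripted. Here is a minimal Python sketch, assuming `iam` is a boto3 IAM client (for example `boto3.client("iam")`) and that the Users already exist. The function name and every policy, group and user name are hypothetical; only the four client calls are real IAM operations.

```python
import json


def grant_via_group(iam, policy_name, policy_document, group_name, user_names):
    """Create a Policy, attach it to a Group, then add Users to the Group.

    `iam` is assumed to behave like a boto3 IAM client. The Users named in
    `user_names` are assumed to exist already.
    """
    # Step 1: develop the Policy (IAM expects the document as a JSON string).
    created = iam.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy_document),
    )
    policy_arn = created["Policy"]["Arn"]

    # Step 2: add the Policy to a Group.
    iam.create_group(GroupName=group_name)
    iam.attach_group_policy(GroupName=group_name, PolicyArn=policy_arn)

    # Step 3: add each User to the Group so they inherit the Policy.
    for user_name in user_names:
        iam.add_user_to_group(GroupName=group_name, UserName=user_name)

    return policy_arn
```

Passing the client in as a parameter keeps the sketch easy to try against a stub before running it with real credentials.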
Roles are similar to Groups and allow privileges to be granted to AWS services. For example Amazon SageMaker Notebooks have a role that contains permissions that control what the Notebook can access. Note that in more sophisticated security architectures User permissions are kept to a minimum and the User assumes a Role on login which provides them with the privileges they need.
What does a Policy look like?
Policies are the core of IAM. They define in specific detail what can, or cannot, be done. IAM Policies are JSON documents. When you define a Policy inside a CloudFormation template you can write it in either YAML or JSON, and it is a matter of personal choice which one you use. There are two types of Policies:
- AWS managed
- Custom
AWS managed Policies are created and managed by AWS. They have useful names that describe what they do, for example:
- AmazonS3ReadOnlyAccess
- AmazonRedshiftQueryEditor
- AmazonSageMakerReadOnly
This means you do not have to look inside to guess what they do. You can add many Policies to build up the access profile you need.
The advantage of AWS managed Policies is that AWS automatically updates them when services are changed or new features are added. The disadvantage is that AWS may change a Policy to give access to new features without you knowing.
Anatomy of a policy
- AWS docs: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/iam-policy-structure.html
- Good explanation with examples: https://start.jcolemorrison.com/aws-iam-policies-in-a-nutshell/
This is the structure of a Policy:
{
  "Statement": [
    {
      "Sid": "statement ID and description",
      "Effect": "effect",
      "Action": [
        "action1",
        "action2"
      ],
      "Resource": "arn",
      "Condition": {
        "condition": {
          "key": "value"
        }
      }
    }
  ]
}
- Statement: a list of Policy statements.
- Sid: an optional ID for a Statement. This is very useful for large policies.
- Effect: what the Statement does. It can be Allow or Deny.
- Action: a list of the detailed actions the Statement is allowing or denying.
- Resource: the AWS resource that the Statement acts on, identified by its ARN, for example the ARN of an S3 bucket.
- Condition: any conditions that must be satisfied for the Statement to be effective.
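To make the anatomy concrete, here is a small Python sketch that fills in each element and prints the resulting JSON. The bucket name and Sid are hypothetical, and the `Version` element is added because AWS recommends including it in policies; treat the whole thing as an illustration rather than a policy to copy.

```python
import json

# Hypothetical bucket used only for illustration.
BUCKET_ARN = "arn:aws:s3:::example-ml-training-data"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadTrainingDataOverHttps",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            # ListBucket applies to the bucket, GetObject to the objects in it.
            "Resource": [BUCKET_ARN, BUCKET_ARN + "/*"],
            # Only effective when the request is made over HTTPS.
            "Condition": {"Bool": {"aws:SecureTransport": "true"}},
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
print(policy_json)
```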
Infrastructure security on Amazon SageMaker
Infrastructure is the set of services that make up the AWS cloud in which you work. Here are some services that you will need to be aware of, along with how they are secured:
- Virtual Private Cloud (VPC)
- Security Group
- Network Address Translation Gateway (NAT)
- Internet Gateway
- Simple Storage Service (S3)
What is a VPC?

- AWS docs: https://aws.amazon.com/vpc/
The VPC (Virtual Private Cloud) is your virtual network in the cloud. It contains all the features that you would expect to find on a network in physical premises. The VPC contains subnets, which can be public to the internet or private, and security controls to protect them. Security is provided at the subnet level by NACLs (Network Access Control Lists) and at the instance level by Security Groups. They do similar jobs in slightly different ways and allow you to restrict access to traffic coming from specific IP addresses, or ranges, and protocols. For example you could lock down access to a single PC via its IP address, and then only if it uses the HTTPS protocol.
How does a Security Group work?
A Security Group acts as a virtual firewall for instances. It controls incoming and outgoing traffic using separate rules for each direction. The rules enable you to filter traffic based on protocols and port numbers. Security Group rules can only allow access; they cannot deny it.
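The allow-only behaviour can be illustrated with a toy model. This is a simplified sketch, not how AWS implements Security Groups: the `IngressRule` class and `is_allowed` function are hypothetical names, and the IP range is a documentation range chosen for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    """One inbound allow rule: protocol, port range and source CIDR."""
    protocol: str
    from_port: int
    to_port: int
    cidr: str


def is_allowed(rules, protocol, port, source_ip):
    """Traffic is allowed if ANY rule matches. There is no deny rule,
    so anything not matched is dropped by default."""
    return any(
        rule.protocol == protocol
        and rule.from_port <= port <= rule.to_port
        and ip_address(source_ip) in ip_network(rule.cidr)
        for rule in rules
    )


# Allow HTTPS from one hypothetical office range only.
rules = [IngressRule("tcp", 443, 443, "203.0.113.0/24")]
print(is_allowed(rules, "tcp", 443, "203.0.113.10"))  # True
print(is_allowed(rules, "tcp", 22, "203.0.113.10"))   # False
```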
How does a NAT Gateway improve security?
A NAT (Network Address Translation) Gateway makes it easy to connect to the Internet from instances within a private subnet in an AWS Virtual Private Cloud (VPC). A NAT enables instances in a private subnet to connect to the internet, but prevents hosts on the internet from initiating connections with the instances.
Video – Secure your workloads with NAT Gateway
What is an Internet Gateway?
An Internet Gateway connects a VPC to the internet. A public subnet is one whose route table sends internet-bound traffic to an Internet Gateway.
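In practice the difference between a public subnet and a private subnet behind a NAT Gateway shows up in the subnet's route table. The sketch below classifies a subnet from its routes. The function name is hypothetical, and the route dictionaries are assumed to be shaped like the entries returned by the EC2 DescribeRouteTables API, with hypothetical gateway IDs.

```python
def classify_subnet(routes):
    """Classify a subnet by where its default route (0.0.0.0/0) points."""
    for route in routes:
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue  # Only the default route decides internet reachability.
        if route.get("GatewayId", "").startswith("igw-"):
            return "public"
        if route.get("NatGatewayId", "").startswith("nat-"):
            return "private with outbound internet via NAT"
    return "private, no internet route"


local_route = {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"}
print(classify_subnet([local_route, {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc"}]))
print(classify_subnet([local_route, {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-0abc"}]))
print(classify_subnet([local_route]))
```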
What security does S3 have?
The contents of the S3 bucket can be encrypted by KMS encryption. This encryption can be enforced on upload so that all the contents of the bucket are encrypted.
There are two types of security policies for S3:
- Resource based policies are features of the S3 bucket. They include Access Control Lists (ACL) and bucket policies.
- User based policies are IAM policies that can be attached to a User, Group or Role. Since access to S3 buckets is default deny, these policies usually explicitly allow access.
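As an illustration of a resource based policy, the sketch below builds a bucket policy that enforces encryption on upload by denying any `PutObject` request that does not ask for KMS encryption. The bucket name is hypothetical, and this is one common pattern rather than the only way to enforce encryption.

```python
import json

BUCKET = "example-ml-training-data"  # hypothetical bucket name

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUploadsWithoutKmsEncryption",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
            # Reject the upload unless the request asks for SSE-KMS.
            "Condition": {
                "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
            },
        }
    ],
}

print(json.dumps(bucket_policy, indent=2))
```

Because this is a Deny statement it overrides any Allow granted by a user based policy, which is what makes the encryption mandatory.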
AWS KMS

- AWS docs: https://aws.amazon.com/kms/
- AWS FAQs: https://aws.amazon.com/kms/faqs/
What types of encryption are there?
There are three types of encryption used with Machine Learning. They are all Server Side Encryption methods, differing on how the key is managed:
- Server-Side Encryption with S3 Managed Keys (SSE-S3)
- Server-Side Encryption with KMS Managed Keys (SSE-KMS)
- Server-Side Encryption with Customer Provided Keys (SSE-C)
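The three methods differ in what you send with an upload. The helper below is a hypothetical sketch that returns the extra keyword arguments a boto3 S3 `put_object` call would take for each method; the function name and the key values in the usage lines are assumptions made for illustration.

```python
def sse_put_object_args(mode, kms_key_id=None, customer_key=None):
    """Return the encryption arguments for an S3 upload for each SSE method."""
    if mode == "SSE-S3":
        # S3 creates and manages the key.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        # KMS manages the key; optionally name a specific KMS key.
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id is not None:
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        # You supply the key with every request; S3 does not store it.
        if customer_key is None:
            raise ValueError("SSE-C requires a customer provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode}")


print(sse_put_object_args("SSE-S3"))
print(sse_put_object_args("SSE-KMS", kms_key_id="alias/example-ml-key"))
```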
Security on Amazon SageMaker
Access control
SageMaker controls access to its Notebooks. There are two types of Notebooks:
- SageMaker Notebooks
- Studio Notebooks
In SageMaker Notebooks users have root access by default, so they have administrator privileges. This root access can be disabled. In SageMaker Studio access control and isolation is achieved by using filesystem and container permissions.
Data Protection
Data protection at rest
By default SageMaker uses AWS KMS with an AWS managed customer master key (CMK) for:
- Notebooks
- Training jobs
- Amazon S3 location to store models
- Endpoint
Data protection in motion
Communication between components inside the SageMaker managed environment is usually unencrypted, to prevent performance degradation due to the time spent encrypting and decrypting. Data transmitted outside the SageMaker managed environment is protected by using HTTPS with TLS certificates for:
- API/console
- Notebooks
- VPC-enabled
- Interface endpoint
- Limit by IP
- Training jobs/endpoints
IAM for SageMaker
Authentication
Authentication is signing in. Authorization refers to the permissions you have once signed in.
IAM federation
You can also use your company’s single sign-on authentication or even sign in using Google or Facebook. MFA (Multi Factor Authentication) can be set up. This involves entering a code from an app on a mobile phone to verify your identity.
SageMaker Roles
SageMaker may use Roles for different tasks. Depending on your security environment you may use a few very broad Roles to perform all SageMaker tasks. The IAM managed policy AmazonSageMakerFullAccess provides a convenient way to explore SageMaker features for investigation and personal training needs. However for use cases where security has a higher priority there will be Roles developed for specific tasks with specific privileges. These AWS docs have examples of the permissions you can use.
To provide isolation of each Notebook, each user can have a Notebook Role that they assume when they log in.
Because SageMaker has a managed environment with EC2 instances which it creates and scales, it needs the ability to pass Roles to other services such as EC2. The ability to pass a Role and assume a Role without human intervention is considered significant from a security perspective. For this reason, in organisations where security is a high priority, SageMaker may be set up in an AWS account of its own with cross account access to data in other AWS accounts. This will isolate the permissions that SageMaker needs to its own environment.
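Two small policy documents sit behind this behaviour: a trust policy that lets the SageMaker service assume a Role, and a permission that lets a User or Role pass that Role to SageMaker. The sketch below builds both in Python. The account number and Role name are hypothetical, and the narrow `Resource` and `iam:PassedToService` condition are one way to limit what can be passed, shown as an assumption rather than a required setup.

```python
import json

SAGEMAKER_SERVICE = "sagemaker.amazonaws.com"
# Hypothetical execution Role used only for illustration.
EXECUTION_ROLE_ARN = "arn:aws:iam::123456789012:role/ExampleSageMakerExecutionRole"

# Trust policy: attached to the Role, says WHO may assume it.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": SAGEMAKER_SERVICE},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permission policy: lets its holder pass only this Role, and only to SageMaker.
pass_role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PassOnlyTheSageMakerExecutionRole",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": EXECUTION_ROLE_ARN,
            "Condition": {
                "StringEquals": {"iam:PassedToService": SAGEMAKER_SERVICE}
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
print(json.dumps(pass_role_policy, indent=2))
```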
Logging and Monitoring
Amazon CloudWatch is used to monitor SageMaker processing. The CloudWatch Logs enable you to monitor, store, and access your log files from SageMaker events, AWS CloudTrail, and other sources.
Compliance Validation
AWS CloudTrail is used to provide an Audit Trail. CloudTrail logs a record of the actions performed in SageMaker and who, or what, performed them. CloudTrail captures all API calls for SageMaker as events, with the exception of InvokeEndpoint.
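A CloudTrail record is a JSON object, so answering "who did what" is a matter of reading a few fields. The sketch below counts SageMaker actions per caller. The function name and the sample records are hypothetical; the field names `eventSource`, `eventName` and `userIdentity` are standard CloudTrail record fields.

```python
from collections import Counter


def summarise_sagemaker_events(records):
    """Count (caller ARN, API action) pairs for SageMaker events only."""
    counts = Counter()
    for record in records:
        if record.get("eventSource") != "sagemaker.amazonaws.com":
            continue  # Ignore events from other services.
        actor = record.get("userIdentity", {}).get("arn", "unknown")
        counts[(actor, record.get("eventName", "unknown"))] += 1
    return counts


# Hypothetical, heavily trimmed records for illustration.
sample_records = [
    {"eventSource": "sagemaker.amazonaws.com", "eventName": "CreateTrainingJob",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/ann"}},
    {"eventSource": "sagemaker.amazonaws.com", "eventName": "CreateTrainingJob",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/ann"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "PutObject",
     "userIdentity": {"arn": "arn:aws:iam::123456789012:user/bob"}},
]

for (actor, action), count in summarise_sagemaker_events(sample_records).items():
    print(actor, action, count)
```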
Compliance programs
Amazon SageMaker has been assessed by third party auditors to confirm compliance with published standards. Below are three standards mentioned in the Exam Readiness course.
- PCI DSS – The Payment Card Industry Data Security Standard (PCI DSS) is an information security standard for organizations that handle branded credit cards from the major card schemes. https://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard
- HIPAA-eligible with BAA – Health Insurance Portability and Accountability Act, Business Associate Agreement. These are US standards for medical information privacy. https://en.wikipedia.org/wiki/Medical_privacy
- ISO – International Organization for Standardization https://www.iso.org/home.html
Resilience
Resilience is part of the AWS infrastructure of AWS Regions and Availability Zones (AZ). These provide isolation so that problems in one Region or AZ do not affect another. SageMaker uses multiple AZs for its managed environment. For example, when you specify more than one SageMaker managed EC2 instance, they are automatically created in separate AZs.
Infrastructure Security
VPCs and endpoints
Connecting to SageMaker through an interface VPC endpoint (interface endpoint) ensures that all data is transmitted within the AWS network without exposing the data to the internet. Exposing data to the internet occurs when you access a service with a URL address. With an endpoint this form of addressing is not used, which results in keeping the data transmission within the AWS network.
SageMaker Notebooks can access the internet to download the libraries needed to process data for Machine Learning. This access is enabled by default, as the Notebook is created in the SageMaker managed VPC. Your organisation may see this internet access as a vulnerability, so there are two actions that can be taken.
- Remove internet access when the Notebook is created. This can be done in CloudFormation.
- Create the Notebook in your own VPC. This enables you to control all the security features to give your Notebooks and data assets the protection you believe they need. Connecting to SageMaker Notebooks via a VPC interface endpoint means that communication between your VPC and the notebook instance is within the AWS network without being exposed to the internet. https://docs.aws.amazon.com/sagemaker/latest/dg/notebook-interface-endpoint.html
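Both actions map onto parameters of the SageMaker CreateNotebookInstance API. The helper below is a hypothetical sketch that collects the arguments for a locked-down Notebook: placed in your own subnet, with direct internet access and root access disabled. The function name, the instance type and all IDs in the usage lines are assumptions for illustration.

```python
def locked_down_notebook_args(name, role_arn, subnet_id, security_group_ids, kms_key_id=None):
    """Build CreateNotebookInstance arguments for a Notebook in your own VPC."""
    args = {
        "NotebookInstanceName": name,
        "InstanceType": "ml.t3.medium",  # assumed size for the example
        "RoleArn": role_arn,
        # Placing the Notebook in your subnet puts it under your VPC controls.
        "SubnetId": subnet_id,
        "SecurityGroupIds": list(security_group_ids),
        # Remove the default internet access and administrator privileges.
        "DirectInternetAccess": "Disabled",
        "RootAccess": "Disabled",
    }
    if kms_key_id is not None:
        args["KmsKeyId"] = kms_key_id  # encrypt the attached storage volume
    return args


args = locked_down_notebook_args(
    "example-notebook",
    "arn:aws:iam::123456789012:role/ExampleNotebookRole",
    "subnet-0abc",
    ["sg-0abc"],
)
print(args)
```

With a boto3 SageMaker client the dictionary could be passed as `client.create_notebook_instance(**args)`. Note that a Notebook without direct internet access needs a NAT Gateway or VPC endpoints to reach anything outside the VPC.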
SageMaker processing jobs, training jobs, hosted endpoints and batch transform jobs will access your resources, such as data in S3 buckets, over the internet. To improve security, AWS recommends hosting your data in a private VPC, that is, a VPC without access to the internet. SageMaker can access your private VPC via an endpoint, which means that all data transmission remains within the AWS environment without any exposure to the internet. https://docs.aws.amazon.com/sagemaker/latest/dg/process-vpc.html
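For a training job, the private VPC is specified through the `VpcConfig` parameter of the CreateTrainingJob API. The sketch below builds that fragment of the request, together with two related switches. The function name and the IDs are hypothetical, and whether you enable the switches depends on your own performance and security trade-offs.

```python
def private_training_network_settings(subnet_ids, security_group_ids,
                                      encrypt_inter_container=True, isolate=False):
    """Build the network-related arguments for a CreateTrainingJob request."""
    return {
        # Run the training containers inside your own private subnets.
        "VpcConfig": {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        # Encrypt traffic between containers in distributed training.
        "EnableInterContainerTrafficEncryption": encrypt_inter_container,
        # When True, the containers cannot make any outbound network calls.
        "EnableNetworkIsolation": isolate,
    }


settings = private_training_network_settings(["subnet-0abc", "subnet-0def"], ["sg-0abc"])
print(settings)
```

These keys would be merged with the rest of the training job definition (algorithm, data channels, Role and so on) before calling the API.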
Video – What is an Interface VPC Endpoint and how can I create Interface Endpoint for my VPC?
Scans
- AWS docs: https://docs.aws.amazon.com/sagemaker/latest/dg/infrastructure-security.html#mkt-container-scan
SageMaker automatically scans for Common Vulnerabilities and Exposures (CVE) identified in public vulnerability databases.
Gaining insight
- AWS docs: https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_iam-condition-keys.html
Restrict access by IAM policy and condition keys. An IAM policy can be used to restrict access to SageMaker in general as well as to specific SageMaker services. A Condition Key is logic within an IAM policy that further restricts access at a more granular level. For example using an Amazon Resource Name (ARN), or Service Name to restrict the action of a Role with the policy attached.
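As an example of a Condition Key at work, the sketch below builds a policy that only allows a Notebook to be opened from a given IP range, using the global `aws:SourceIp` key on the `sagemaker:CreatePresignedNotebookInstanceUrl` action. The IP range and Sid are hypothetical, and `source_ip_matches` is a toy check of the condition written for illustration, not the real IAM evaluation logic.

```python
import json
from ipaddress import ip_address, ip_network

ALLOWED_RANGES = ["203.0.113.0/24"]  # hypothetical office range

notebook_access_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "OpenNotebooksOnlyFromOffice",
            "Effect": "Allow",
            "Action": "sagemaker:CreatePresignedNotebookInstanceUrl",
            "Resource": "*",
            # The Condition Key narrows the Allow to requests from these ranges.
            "Condition": {"IpAddress": {"aws:SourceIp": ALLOWED_RANGES}},
        }
    ],
}


def source_ip_matches(statement, source_ip):
    """Toy check: is the caller's IP inside any range listed in the condition?"""
    ranges = statement["Condition"]["IpAddress"]["aws:SourceIp"]
    return any(ip_address(source_ip) in ip_network(cidr) for cidr in ranges)


statement = notebook_access_policy["Statement"][0]
print(json.dumps(notebook_access_policy, indent=2))
print(source_ip_matches(statement, "203.0.113.25"))  # True
print(source_ip_matches(statement, "198.51.100.9"))  # False
```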
Video – An Overview of Amazon SageMaker Security (Level 100)
Summary
This Study Guide has introduced IAM features relevant to securing a Machine Learning environment. SageMaker security requirements and features have been explained in more detail. These revision notes have used the AWS course Exam Readiness: AWS Certified Machine Learning – Specialty as a guide.
This Study Guide covers sub-domain 4.3 of the Machine Learning Implementation and Operations knowledge domain (domain 4). The four sub-domains are:
- 4.1 Build machine learning solutions for performance, availability, scalability, resiliency, and fault tolerance.
- 4.2 Recommend and implement the appropriate machine learning services and features for a given problem.
- 4.3 Apply basic AWS security practices to machine learning solutions.
- 4.4 Deploy and operationalize machine learning solutions.
If you are progressing through the exam structure in order, the next Study Guide to review is for sub-domain 4.4 which is deploying Machine Learning models in production.
Credits
Photo by Anthony Bressy on Unsplash