
SageMaker text processing algorithms

There are four SageMaker text processing algorithms: BlazingText, LDA, NTM and Sequence-to-Sequence. BlazingText converts text to numeric vectors. LDA and NTM identify topics in text documents, and Sequence-to-Sequence provides machine translation of languages. Each algorithm has its own section and embedded video.

These revision notes are part of subdomain 3.2, "Select the appropriate model(s) for a given machine learning problem", of the exam syllabus.

These revision notes describe the four SageMaker text processing algorithms. Each one processes text differently, although LDA and NTM have the same use case. BlazingText is a precursor to downstream Natural Language Processing. LDA and NTM both provide topic modeling of a large document corpus. Sequence-to-Sequence performs machine translation of languages.

Modeling (Domain 3)

Neural Topic Model Algorithm

The Neural Topic Model Algorithm (NTM) is used to identify topics in a corpus of documents. NTM uses statistics to group words. The groups are termed Latent Representations because they are identified via word distributions in the documents. The Latent Representations reveal the semantics of the documents and so outperform analysis using the word form…

Modeling (Domain 3)

Sequence-to-Sequence Algorithm

The SageMaker Sequence-to-Sequence algorithm is used for machine translation of languages. The algorithm takes an input sequence of tokens, for example French words, and outputs the translation as a sequence of English words. As well as translation, Sequence-to-Sequence can be used to summarize a document and convert speech to text. Sequence-to-Sequence is a Supervised Learning algorithm…
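
As a rough illustration of what "a sequence of tokens" means in practice, the sketch below turns a French sentence and its English translation into lists of integer IDs, which is the general shape of a supervised training pair. The vocabularies, the `<unk>` token and the helper names `encode` and `decode` are all hypothetical and chosen for this example; this is not the SageMaker input format or API.

```python
# Hypothetical toy vocabularies, invented for this example.
# Real vocabularies are built from the training corpus.
source_vocab = {"<unk>": 0, "le": 1, "chat": 2, "dort": 3}
target_vocab = {"<unk>": 0, "the": 1, "cat": 2, "sleeps": 3}


def encode(sentence, vocab):
    """Split a sentence on whitespace and map each token to its integer ID.

    Tokens missing from the vocabulary map to the <unk> ID.
    """
    return [vocab.get(token, vocab["<unk>"]) for token in sentence.lower().split()]


def decode(ids, vocab):
    """Map integer IDs back to tokens and join them into a sentence."""
    id_to_token = {index: token for token, index in vocab.items()}
    return " ".join(id_to_token[index] for index in ids)


# One supervised training pair: source sequence and its known translation.
source_ids = encode("Le chat dort", source_vocab)
target_ids = encode("the cat sleeps", target_vocab)
training_pair = (source_ids, target_ids)

print(training_pair)
print(decode(target_ids, target_vocab))
```

Because the algorithm is supervised, every source sequence in the training data needs a matching target sequence like the pair above.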

Modeling (Domain 3)

Latent Dirichlet Allocation Algorithm

The SageMaker Latent Dirichlet Allocation (LDA) algorithm is an Unsupervised Learning algorithm that groups words in a document into topics. The topics are found from the probability distribution of all the words in a document. LDA can be used to discover topics shared by documents within a text corpus. The number of topics is specified by…
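
Topic models of this kind work from how often each word appears in a document rather than from word order. The sketch below shows one way to turn documents into word-count vectors, the usual starting point for topic modeling. The two documents and the helper names `build_vocabulary` and `to_count_vector` are made up for illustration; this is a minimal sketch of the preprocessing idea, not the LDA algorithm itself.

```python
from collections import Counter

# Two tiny made-up documents, for illustration only.
documents = [
    "the cat sat on the mat",
    "the dog chased the cat",
]


def build_vocabulary(docs):
    """Map each distinct word to a column index, in alphabetical order."""
    words = sorted({word for doc in docs for word in doc.split()})
    return {word: index for index, word in enumerate(words)}


def to_count_vector(doc, vocabulary):
    """Return the word counts for one document, ordered by vocabulary index."""
    counts = Counter(doc.split())
    # Dicts keep insertion order, so this follows the vocabulary's index order.
    return [counts[word] for word in vocabulary]


vocabulary = build_vocabulary(documents)
vectors = [to_count_vector(doc, vocabulary) for doc in documents]

print(list(vocabulary))
for vector in vectors:
    print(vector)
```

Each row has one column per vocabulary word, so every document becomes a fixed-length numeric vector regardless of its length.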

Modeling (Domain 3)

BlazingText Algorithm

BlazingText is the name AWS gives its SageMaker built-in algorithm that can identify relationships between words in text documents. These relationships, which are also called embeddings, are expressed as vectors. The semantic relationship between words is preserved by the vectors, which cluster words with similar semantics together. This conversion of words to meaningful numeric vectors…
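
To make the idea of vectors that "cluster words with similar semantics together" concrete, here is a minimal sketch that compares word vectors using cosine similarity. The three-dimensional vectors are invented for this example and are not BlazingText output; learned embeddings have many more dimensions. The helper names `cosine_similarity` and `most_similar` are also hypothetical.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real embeddings are learned from a corpus.
embeddings = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, table):
    """Return the other word whose vector points in the closest direction."""
    candidates = (other for other in table if other != word)
    return max(candidates, key=lambda other: cosine_similarity(table[word], table[other]))


print(most_similar("king", embeddings))
```

With these toy values, "king" sits much closer to "queen" than to "apple", which is the kind of semantic grouping that embeddings are meant to capture.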

AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam

Contains affiliate links. If you go to Amazon’s website and make a purchase I may receive a small payment. The purchase price to you will be unchanged. Thank you for your support.

This study guide provides the domain-by-domain specific knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. The online resources that accompany this Study Guide include practice exams and assessments, electronic flashcards, and supplementary online resources.
