K-Means Algorithm

The K-Means Algorithm is an Unsupervised Learning algorithm used to find clusters. Clusters are formed by grouping data points that are as similar as possible to each other and as different as possible from the points in other clusters. The distances between data points are calculated and averaged to form the groups. K-Means is used for market segmentation, computer vision,…
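As a rough illustration of the grouping idea, here is a minimal sketch of the classic K-Means loop (Lloyd's algorithm) in plain Python. It is not the SageMaker implementation. The function names `assign` and `kmeans`, the parameters `iterations` and `seed`, and the toy points are all assumptions made for this example.

```python
import math
import random


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative K-Means: alternate assignment and centroid averaging."""
    rng = random.Random(seed)
    # Assumption: start from k randomly chosen data points.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = assign(points, centroids)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                # New centroid is the per-coordinate mean of the members.
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                # Keep the old centroid if a cluster ends up empty.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Hypothetical toy data: two well-separated groups of 2-D points.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_centroids, toy_clusters = kmeans(toy_points, k=2)
print(toy_centroids)
print(toy_clusters)
```

The number of clusters `k` has to be chosen up front. The result can depend on the starting centroids, which is why the sketch fixes a seed.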

Sequence-to-Sequence Algorithm

The SageMaker Sequence-to-Sequence algorithm is used for machine translation of languages. The algorithm takes an input sequence of tokens, for example French words, and outputs the translation as a sequence of English words. As well as translation, Sequence-to-Sequence can be used to summarize a document and convert speech to text. Sequence-to-Sequence is a Supervised Learning algorithm…

Neural Topic Model Algorithm

The Neural Topic Model Algorithm (NTM) is used to identify topics in a corpus of documents. NTM uses statistics to group words. The groups are termed Latent Representations because they are identified via word distributions in the documents. The Latent Representations reveal the semantics of the documents and so outperform analysis using the word form…

Latent Dirichlet Allocation Algorithm

The SageMaker Latent Dirichlet Allocation algorithm (LDA) is an Unsupervised Learning algorithm that groups words in a document into topics. The topics are inferred from the probability distribution of the words in a document. LDA can be used to discover topics shared by documents within a text corpus. The number of topics is specified by…

Image Classification Algorithm

The SageMaker Image Classification algorithm can apply multiple labels to an image depending on what objects are identified. Each object is either identified or not; there are no probability scores.

Attributes

Problem attribute | Description
Data types and format | Image
Learning paradigm or domain | Image Processing, Supervised
Problem type | Image and multi-label classification
Use case examples | Label/tag…

BlazingText Algorithm

BlazingText is the name AWS gives to its SageMaker built-in algorithm that can identify relationships between words in text documents. These relationships, which are also called embeddings, are expressed as vectors. The vectors preserve the semantic relationships between words by clustering words with similar semantics together. This conversion of words to meaningful numeric vectors…
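To make the idea of "similar words sit close together" concrete, here is a small sketch that compares word vectors with cosine similarity. The three-dimensional vectors are invented by hand for illustration. They are not BlazingText output, and `cosine_similarity` is a helper defined here, not a SageMaker API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: near 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; real ones typically have many more dimensions.
toy_vectors = {
    "king": (0.9, 0.8, 0.1),
    "queen": (0.85, 0.9, 0.15),
    "banana": (0.1, 0.05, 0.95),
}

related = cosine_similarity(toy_vectors["king"], toy_vectors["queen"])
unrelated = cosine_similarity(toy_vectors["king"], toy_vectors["banana"])
print(related > unrelated)
```

With vectors like these, semantically related words score higher than unrelated ones. That property is what lets downstream tasks work with numeric vectors instead of raw text.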