The SageMaker Sequence-to-Sequence algorithm is a supervised learning algorithm used for machine translation. It takes an input sequence of tokens, for example French words, and outputs the translation as a sequence of English words. Beyond translation, Sequence-to-Sequence can also be used for text summarization and speech-to-text conversion.
- AWS docs: https://docs.aws.amazon.com/sagemaker/latest/dg/seq-2-seq.html
- AWS blog: https://aws.amazon.com/blogs/machine-learning/create-a-word-pronunciation-sequence-to-sequence-model-using-amazon-sagemaker/
| Attribute | Value |
| --- | --- |
| Data types and format | Text |
| Learning paradigm or domain | Textual analysis, Supervised Learning |
| Problem type | Machine translation |
| Use case examples | Convert audio files to text, summarize a long text corpus, convert text from one language to another |
Sequence-to-Sequence requires the input records to be in recordIO-protobuf format with integer values (token IDs) only. Training input files (per the AWS docs and example notebook):
- train.rec — tokenized source and target sentences for training
- val.rec — tokenized sentences for validation
- vocab.src.json — source-language vocabulary mapping tokens to integer IDs
- vocab.trg.json — target-language vocabulary mapping tokens to integer IDs
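As a sketch of what "integer values only" means in practice, the snippet below builds a toy vocabulary and encodes sentences as integer token IDs. This is illustrative only: the function names and the whitespace tokenization are assumptions, and the actual recordIO-protobuf packing is handled by the AWS example notebook's helper code.

```python
# Minimal sketch (assumptions: toy vocabulary, whitespace tokenization).
# SageMaker Seq2Seq trains on sequences of integer token IDs; the vocab
# files map each token to its ID, with a few reserved special symbols.

def build_vocab(sentences, reserved=("<pad>", "<unk>", "<s>", "</s>")):
    """Assign an integer ID to every token, reserving special symbols first."""
    vocab = {tok: i for i, tok in enumerate(reserved)}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Convert a sentence into the integer-ID sequence the algorithm consumes."""
    unk = vocab["<unk>"]
    return [vocab.get(token, unk) for token in sentence.lower().split()]

corpus = ["le chat est noir", "le chien est blanc"]
vocab = build_vocab(corpus)
print(encode("le chat est blanc", vocab))
```

Unknown tokens fall back to the `<unk>` ID, which is why both source and target vocabularies reserve special symbols up front.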
Model artifacts and inference
| Attribute | Value |
| --- | --- |
| Learning paradigm | Supervised learning |
| Request format | JSON, recordIO-protobuf |
| Result | Same format as the request format used |
| Batch request format | JSON Lines |
| Batch result | JSON Lines |
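For real-time inference, the AWS docs show a JSON request body of the shape `{"instances": [{"data": ...}]}`. The helper below is a small sketch of building such a payload; the function name and the sample sentence are illustrative assumptions.

```python
import json

# Sketch of a real-time inference request body for a Seq2Seq endpoint
# (the {"instances": [{"data": ...}]} shape follows the AWS docs' JSON example).
def make_request(sentences):
    """Wrap source sentences in the JSON structure the endpoint expects."""
    return json.dumps({"instances": [{"data": s} for s in sentences]})

payload = make_request(["le chat est noir"])
print(payload)
```

The same payload would be sent as the body of an `invoke_endpoint` call with content type `application/json`.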
Sequence-to-Sequence supports training on only a single GPU instance; however, that single instance may contain multiple GPUs.
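The single-instance constraint shows up in the training job configuration. The sketch below assumes the SageMaker Python SDK v2; the role ARN and S3 path are placeholders, not real resources, and this is a config fragment rather than a runnable job.

```python
# Sketch: single-instance Seq2Seq training job (SageMaker Python SDK v2 assumed;
# the role ARN and S3 output path below are placeholders).
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()
region = session.boto_region_name

# Seq2Seq trains on exactly one instance; choosing a multi-GPU type
# (e.g. ml.p3.8xlarge with 4 GPUs) still parallelizes within that instance.
estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("seq2seq", region),
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    instance_count=1,              # must be 1 for Seq2Seq
    instance_type="ml.p3.8xlarge",
    output_path="s3://my-bucket/seq2seq/output",          # placeholder
)
```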
Video: Amazon SageMaker’s Built-in Algorithm Webinar Series: Sequence2Sequence
This is a one-hour video from AWS.