A photograph of hands on a table to symbolize the SageMaker K-Nearest Neighbor algorithm

K-Nearest Neighbors Algorithm

The K-Nearest Neighbors algorithm places data into categories, for example in recommendation applications that suggest products on Amazon, articles on Medium, movies on Netflix, or videos on YouTube. It returns results based on the training data points nearest to the sample data point, which are called its nearest neighbors.

The K-Nearest Neighbors algorithm is used for classification and regression problems. For classification problems, the most frequent label among the nearest neighbors is returned. For regression problems, the average of the nearest neighbors' values is returned. The K-Nearest Neighbors algorithm was developed in the early 1950s.
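The two prediction rules above can be sketched in a few lines of plain Python. This is a minimal illustration, not the SageMaker implementation: the function names, the use of Euclidean distance, and the sample data are all assumptions made for the example.

```python
import math
from collections import Counter


def k_nearest(train_points, query, k):
    """Return the indices of the k training points closest to the query.

    Euclidean distance is an assumption for this sketch.
    """
    distances = [
        (math.dist(point, query), index)
        for index, point in enumerate(train_points)
    ]
    distances.sort()
    return [index for _, index in distances[:k]]


def knn_classify(train_points, train_labels, query, k=3):
    """Classification: return the most frequent label among the neighbors."""
    neighbors = k_nearest(train_points, query, k)
    votes = Counter(train_labels[i] for i in neighbors)
    return votes.most_common(1)[0][0]


def knn_regress(train_points, train_targets, query, k=3):
    """Regression: return the average target value of the neighbors."""
    neighbors = k_nearest(train_points, query, k)
    return sum(train_targets[i] for i in neighbors) / len(neighbors)


# Hypothetical training data: two well-separated clusters.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (8.0, 8.0), (8.5, 9.0), (9.0, 8.5)]
labels = ["a", "a", "a", "b", "b", "b"]
targets = [10.0, 12.0, 14.0, 100.0, 110.0, 120.0]

predicted_label = knn_classify(points, labels, (2.0, 2.0))
predicted_value = knn_regress(points, targets, (2.0, 2.0))
```

With the query point placed inside the first cluster, all three neighbors come from that cluster, so the classifier returns its label and the regressor returns the mean of its three target values.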

Attributes

Problem attribute             Description
Data types and format         Tabular
Learning paradigm or domain   Supervised Learning
Problem type                  Binary/multi-class classification, Regression
Use case examples             Predict a numeric/continuous value; Predict if an item belongs to a category

Training

Training data formats are:

  • CSV
  • x-recordio-protobuf
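As a rough sketch of preparing CSV training data, the snippet below writes one row per example. It assumes the usual SageMaker built-in algorithm convention of no header row and the label in the first column. The helper name `to_training_csv` and the sample values are hypothetical.

```python
import csv
import io


def to_training_csv(features, labels):
    """Serialize rows as CSV text with the label first and no header row.

    The label-first, header-free layout is an assumption about what the
    training job expects; confirm it against the algorithm documentation.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for label, row in zip(labels, features):
        writer.writerow([label, *row])
    return buffer.getvalue()


# Hypothetical data: two examples with two features each.
csv_text = to_training_csv([[1.0, 2.5], [3.0, 4.5]], [0, 1])
```

The resulting text can be saved to a file and uploaded to the storage location that the training job reads from.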

Model artifacts and inference

Description         Artifacts
Learning paradigm   Supervised Learning
Request format      CSV, JSON, JSON Lines, x-recordio-protobuf
Response format     CSV, JSON, JSON Lines, x-recordio-protobuf
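To make the text-based request formats more concrete, here is a hedged sketch that builds the same two feature vectors as CSV, JSON, and JSON Lines payloads. The `instances` and `features` field names are assumptions about the request schema and should be checked against the SageMaker documentation. The helper names are hypothetical.

```python
import json


def csv_payload(rows):
    """One comma-separated line per feature vector, with no label column."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def json_payload(rows):
    """A single JSON document; the field names are assumed, not confirmed."""
    return json.dumps({"instances": [{"features": list(row)} for row in rows]})


def jsonlines_payload(rows):
    """One JSON object per line; the field name is assumed, not confirmed."""
    return "\n".join(json.dumps({"features": list(row)}) for row in rows)


# Hypothetical feature vectors to send for inference.
rows = [[1.5, 2.0], [3.0, 4.0]]

as_csv = csv_payload(rows)
as_json = json_payload(rows)
as_jsonlines = jsonlines_payload(rows)
```

Whichever format is chosen, the content type sent with the request has to match the payload so that the endpoint parses it correctly.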

Processing environment

Training: CPU or GPU

Inference: CPU, or GPU for larger batches

Credits

Hands photo by Clay Banks on Unsplash
