
Kinesis KPL vs API

The Kinesis Producer Library (KPL) and the Kinesis API can both be used to send data to Kinesis Data Streams. The advantage of the KPL is that it provides many added features built in, such as handling of failed transmissions. If you use the Kinesis API, you have to code these features yourself. The advantage of the Kinesis API is that records can be sent without delay, whereas the KPL has buffering built in. Also, the Kinesis API is accessed via the AWS SDKs, which are available in many programming languages including the Data Scientist’s favorite, Python. The KPL is available as a Java library only.

Last updated: 5 June 2021

Features | Kinesis Producer Library (KPL) | Kinesis API
Delay | Possible, due to the buffer feature | Instant, no delay
Languages | Java only | Many, including Python
Failed transmission handling | Built in | Must be coded
Available features | Restricted to features relevant to Producers | Complete access to all Kinesis features
A table comparing the Kinesis KPL vs API

Video: Amazon Kinesis Consumers Explained

8.08 minutes video by Stephane Maarek

Kinesis Producer Library (Kinesis KPL)

The Kinesis Producer Library is used to transmit records to Kinesis Data Streams. It acts as an intermediary between the producer application code and Kinesis Data Stream API actions. The KPL sits on the Kinesis API and provides a subset of functions specific to Producers.

Video: How can I put records into an Amazon Kinesis data stream using the KPL?

9.44 minutes video from AWS

KPL pros

  1. Performance benefits
  2. Consumer-side ease of use via the Kinesis Client Library
  3. Producer monitoring via CloudWatch
  4. Asynchronous architecture: the KPL has a buffer to store records whilst they are processed

KPL cons

  1. Buffering can delay feeding data to Kinesis
  2. KPL is available only as a Java library
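
The trade-off between buffering and delay can be illustrated with a small sketch. This is not the KPL or its API (the KPL is a Java library); BufferedProducer, send_batch, max_records and max_delay_s are hypothetical names chosen for illustration. The sketch holds records until a size or age limit is reached and then hands them over as one batch, which is roughly why a buffered producer can add latency.

```python
import time
from typing import Callable, List


class BufferedProducer:
    """Toy buffer that collects records and flushes them in one batch.

    Hypothetical class for illustration only; this is not the KPL API.
    """

    def __init__(self, send_batch: Callable[[List[bytes]], None],
                 max_records: int = 500, max_delay_s: float = 0.1,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send_batch = send_batch      # called with each flushed batch
        self._max_records = max_records    # flush when this many are buffered
        self._max_delay_s = max_delay_s    # or when the oldest record is this old
        self._clock = clock
        self._buffer: List[bytes] = []
        self._oldest = 0.0

    def add(self, record: bytes) -> None:
        """Buffer a record; flush if a size or age limit has been reached."""
        if not self._buffer:
            self._oldest = self._clock()
        self._buffer.append(record)
        too_many = len(self._buffer) >= self._max_records
        too_old = self._clock() - self._oldest >= self._max_delay_s
        if too_many or too_old:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered as one batch."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send_batch(batch)
```

In this simplified version the age limit is only checked when a new record arrives, so a record can wait longer than max_delay_s if nothing else is added. A production-grade buffer would normally flush from a background timer as well.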

Kinesis API

The Kinesis API is used by the AWS Software Development Kit (SDK) to add records to Kinesis Data Streams. There are SDKs for many languages including Python. The API exposes the full range of capabilities of Kinesis Data Streams, not just those concerned with data producers.

Data is transmitted using two API operations:

  1. PutRecord – this sends records one at a time
  2. PutRecords – this sends batches of up to 500 records per request, for higher throughput

With either operation, the programmer has to handle records that fail to be sent.
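
As a rough illustration of what handling failed records yourself can look like, here is a minimal Python sketch. It assumes a boto3 Kinesis client is passed in (for example, one created with boto3.client("kinesis")); the function name put_with_retry, the retry count and the backoff values are assumptions made for this example, not recommendations from AWS. It relies on the PutRecords response containing a FailedRecordCount and a Records list in request order, where failed entries carry an ErrorCode.

```python
import time
from typing import Any, Dict, List

MAX_BATCH = 500  # PutRecords accepts at most 500 records per request


def put_with_retry(client: Any, stream_name: str,
                   records: List[Dict[str, Any]],
                   max_attempts: int = 3,
                   backoff_s: float = 0.1) -> List[Dict[str, Any]]:
    """Send records in batches of up to 500, resending the ones that fail.

    Each record is a dict with "Data" (bytes) and "PartitionKey" (str).
    Returns the records that still failed after max_attempts.
    """
    unsent: List[Dict[str, Any]] = []
    for start in range(0, len(records), MAX_BATCH):
        batch = records[start:start + MAX_BATCH]
        for attempt in range(max_attempts):
            response = client.put_records(StreamName=stream_name, Records=batch)
            if response["FailedRecordCount"] == 0:
                batch = []
                break
            # Results come back in request order; failed entries carry an ErrorCode.
            batch = [rec for rec, result in zip(batch, response["Records"])
                     if "ErrorCode" in result]
            if attempt < max_attempts - 1:
                time.sleep(backoff_s * (2 ** attempt))  # simple exponential backoff
        unsent.extend(batch)  # whatever is left could not be delivered
    return unsent
```

A caller might pass records such as {"Data": b"reading-1", "PartitionKey": "sensor-1"} and then log or store whatever the function returns. This retry logic is the kind of feature the KPL provides built in.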

Kinesis Client Library (KCL)

The Kinesis Client Library (KCL) sits on the Kinesis API and calls the Kinesis Producer Library to help extract user records from Kinesis Data Stream records. The KCL is a Java library, although it may be used by other languages via the MultiLangDaemon.

KCL uses checkpointing so that no records are missed if a consumer fails and has to resume. To do this, it stores checkpointing data in DynamoDB. Note that if the DynamoDB table is under-provisioned, throttling may be experienced, so the throughput of the DynamoDB table must be in balance with the provisioned throughput of the Kinesis data stream. Connectors to Amazon DynamoDB, Amazon Redshift, Amazon S3, and Elasticsearch are provided by the Kinesis Connector Library, which is built on the KCL.

The Kinesis Connector library is a legacy, deprecated library.
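
The idea behind checkpointing can be sketched in a few lines of Python. This is a conceptual illustration only and does not use the KCL: process_shard and its parameters are hypothetical, a plain dict stands in for the DynamoDB table, and sequence numbers are simplified to integers.

```python
from typing import Callable, Dict, List, Tuple

Record = Tuple[int, bytes]  # (sequence number, payload); simplified for the sketch


def process_shard(shard_id: str, records: List[Record],
                  checkpoints: Dict[str, int],
                  handle: Callable[[bytes], None],
                  checkpoint_every: int = 2) -> None:
    """Process records after the last checkpoint, saving progress as we go."""
    last_done = checkpoints.get(shard_id, -1)
    since_checkpoint = 0
    for seq, payload in records:
        if seq <= last_done:
            continue  # already processed before the restart
        handle(payload)
        since_checkpoint += 1
        if since_checkpoint >= checkpoint_every:
            checkpoints[shard_id] = seq  # one write to the checkpoint store
            since_checkpoint = 0
```

Each checkpoint is a write to the store. Checkpointing more often shortens the replay after a failure but costs more writes, which is one way an under-provisioned checkpoint table can become a bottleneck.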


Summary

Kinesis Data Streams is served by two methods to ingest data and one to extract it. The Kinesis Producer Library and Kinesis API are used to feed data in and the Kinesis Client Library is used to extract it.

This study guide is part of subdomain 1.2, Identify and implement a data-ingestion solution. This is part of the Data Engineering domain.

Credits

AWS Certified Machine Learning Study Guide: Specialty (MLS-C01) Exam

Contains affiliate links. If you go to Amazon’s website and make a purchase I may receive a small payment. The purchase price to you will be unchanged. Thank you for your support.

This study guide provides the domain-by-domain specific knowledge you need to build, train, tune, and deploy machine learning models with the AWS Cloud. The online resources that accompany this Study Guide include practice exams and assessments, electronic flashcards, and supplementary online resources.
