I am currently writing a simple version of the Kinesis Client Library (KCL) in Go. One of the features I want is load balancing of shards across multiple record processors and EC2 instances. For example, say I have two record processors (each running on a separate EC2 instance) and four Kinesis shards. With load balancing, each record processor would process two Kinesis shards.
I read that the Java KCL implements this, but I can't find the implementation in the library. How would I go about implementing this feature in Go? Thank you.
KCL already does load balancing for you.
Here's a basic description of how it works today (keep in mind this is just the basics and is subject to change as Amazon improves the logic):
- When a worker (which may process multiple shards) starts up, it checks a central DynamoDB database for which shards are owned by workers (creating that database if necessary). This is the "leasing" table.
- A "lease" is a relationship between a worker and a shard.
- A worker processes records only for shards on which it holds an unexpired lease.
- A lease expires if its worker hasn't emitted a "heartbeat" for it before the expiry time (heartbeats are typically sent every few seconds) - the heartbeat essentially updates the DDB record.
- The worker checks the Kinesis stream for which shards are available, and updates the table if needed.
- If any leases are expired, the worker will try to take ownership of them - at the database level, the record is keyed by shardId and the worker writes its workerId there.
- If a worker starts and all shards are already taken, it checks to see what the "balance" is - if it detects an imbalance (e.g. "I own 0 shards and some other worker owns 10 shards"), it initiates a "steal shard" protocol - the old worker stops processing for that shard, and the new worker starts.
You're of course free to examine the source code for KCL on GitHub: https://github.com/awslabs/amazon-kinesis-client - hopefully this explanation gives you enough context to understand KCL and adapt it to your needs.
Before you start writing your own client, note that some people appear to have done this already:
- https://github.com/nieksand/gokinesis
- https://github.com/runtakun/kinesis-connector-go
Another option you have is the KCL MultiLangDaemon. You can install a small runner program that does all the balancing for you, and then you just listen to the messages the daemon sends you and commit them back.
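As a rough idea of what the Go side of that could look like, here is a sketch of a record processor that reads line-delimited JSON from the daemon and writes a status acknowledgement back for each message. The message shapes used here (`action`, `records`, base64 `data`, and a `status` reply with `responseFor`) are my understanding of the MultiLangDaemon protocol and should be checked against the version you run. The names `Message`, `Record`, `HandleLine` and `Run` are made up for this example, and checkpointing is left out entirely.

```go
package main

import (
	"bufio"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
)

// Record holds the per-record fields this sketch assumes the daemon sends.
type Record struct {
	Data           string `json:"data"` // base64-encoded payload
	PartitionKey   string `json:"partitionKey"`
	SequenceNumber string `json:"sequenceNumber"`
}

// Message is a partial view of a daemon message, not a complete schema.
type Message struct {
	Action  string   `json:"action"`
	ShardID string   `json:"shardId,omitempty"`
	Records []Record `json:"records,omitempty"`
	Reason  string   `json:"reason,omitempty"`
}

// HandleLine decodes one JSON line, hands each record payload to process,
// and returns the acknowledgement line to send back to the daemon.
func HandleLine(line []byte, process func(partitionKey string, data []byte)) (string, error) {
	var msg Message
	if err := json.Unmarshal(line, &msg); err != nil {
		return "", fmt.Errorf("decode message: %w", err)
	}
	if msg.Action == "processRecords" {
		for _, r := range msg.Records {
			data, err := base64.StdEncoding.DecodeString(r.Data)
			if err != nil {
				return "", fmt.Errorf("decode record %s: %w", r.SequenceNumber, err)
			}
			process(r.PartitionKey, data)
		}
	}
	// Every action (initialize, processRecords, shutdown, ...) is acknowledged.
	ack, err := json.Marshal(struct {
		Action      string `json:"action"`
		ResponseFor string `json:"responseFor"`
	}{"status", msg.Action})
	if err != nil {
		return "", err
	}
	return string(ack), nil
}

// Run reads messages from in until EOF and writes one acknowledgement per line to out.
func Run(in io.Reader, out io.Writer, process func(partitionKey string, data []byte)) error {
	sc := bufio.NewScanner(in)
	sc.Buffer(make([]byte, 0, 1<<20), 16<<20) // record batches can exceed the default line limit
	for sc.Scan() {
		if len(sc.Bytes()) == 0 {
			continue
		}
		ack, err := HandleLine(sc.Bytes(), process)
		if err != nil {
			return err
		}
		if _, err := fmt.Fprintln(out, ack); err != nil {
			return err
		}
	}
	return sc.Err()
}
```

A `main` function would then call something like `Run(os.Stdin, os.Stdout, handler)`. Since stdout carries the protocol in this setup, any logging should go to stderr or a file.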
https://github.com/awslabs/amazon-kinesis-client#amazon-kcl-support-for-other-languages