Kinesis partition key falls always in the same sha

2019-05-04 07:05发布

问题:

I have a kinesis stream with 2 shards that looks like this:

{
    "StreamDescription": {
        "StreamStatus": "ACTIVE",
        "StreamName": "my-stream",
        "Shards": [
            {
                "ShardId": "shardId-000000000001",
                "HashKeyRange": {
                    "EndingHashKey": "17014118346046923173168730371587",
                    "StartingHashKey": "0"
                },
            {
                "ShardId": "shardId-000000000002",
                "HashKeyRange": {
                    "EndingHashKey": "340282366920938463463374607431768211455",
                    "StartingHashKey": "17014118346046923173168730371588"
                },
        ]
    }
}

The sender side sets a partition that is usually a UUID. It always falls in shard-002 above which makes the system not load balanced and therefore not scalable.

As side note, kinesis uses md5sum to assign a record and then send it to shard that contains the resulted hash in its range. In fact when i tested it on the UUId i used, they do fall always in the same shard.

echo -n 80f6302fca1e48e590b09af84f3150d3 | md5sum
4527063413b015ade5c01d88595eec11  

17014118346046923173168730371588 < 4527063413b015ade5c01d88595eec11 < 340282366920938463463374607431768211455

Any idea on how to solve this?

回答1:

First of all, see this Q&A: How to decide total number of partition keys in AWS kinesis stream?

About your situation; you have 2 shards, but their hash key range are not equal.

Number of partition keys shard 1 contains:

17014118346046923173168730371587 - 0 = 17014118346046923173168730371587

Number of partition keys shard 2 contains:

340282366920938463463374607431768211455 - 17014118346046923173168730371587 = 340282349906820117416451434263037839868

There is a big difference between those two;

17014118346046923173168730371587 : 17 x 10^30

340282349906820117416451434263037839868 : 34 x 10^37

It would be awesome if shard 1 was between "0 - 170141183460469231731687303715884105727" and shard 2 was between "170141183460469231731687303715884105728 - 340282366920938463463374607431768211455".

You've probably used a desktop or another lower precision calculator. Try a better calculator. See the example below;

package com.cagricelebi.kinesis.core.utils;

import java.math.BigInteger;

public class MyCalc {

    public static void main(String[] args) {
        try {

            String num1 = "340282366920938463463374607431768211455";
            String num2 = "-17014118346046923173168730371587";

            String diff = bigCalc(num1, num2, "1", "1");
            System.out.println("result1 : " + diff); // 340282349906820117416451434263037839868

            String optimumHalf = bigCalc(num1, "0", "1", "2");
            System.out.println("result2 : " + optimumHalf); // 170141183460469231731687303715884105727

        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /**
     * Basic calculator.
     * First adds up first two elements, than multiplies the summation.
     * The result is the division of the multilication to divisor.
     *
     * @param bigInt A
     * @param bigInt2 B
     * @param multiplicator C
     * @param divisor D
     * @return ((A+B)*C)/D
     */
    private static String bigCalc(String bigInt, String bigInt2, String multiplicator, String divisor) {
        BigInteger summation = new BigInteger(bigInt).add(new BigInteger(bigInt2));
        BigInteger multiplication = summation.multiply(new BigInteger(multiplicator));
        BigInteger division = multiplication.divide(new BigInteger(divisor));
        return division.toString();
    }

}


回答2:

After a few hours of investigation, I found the root cause, again human errors. Sharing the solution here even if it's a simple to save the time someone else could spend on it.

The problem arose due to the way the original stream was split. When you split a stream with one shard, you have to calculate the starting hash key of the new child shard. This new hash key is usually in the middle of the parent shard hash key range.

A newly created shard(the parent) will have the following range:

0 - 340282366920938463463374607431768211455

So naively you go to your Windows calculator and copy paste this "340282366920938463463374607431768211455" and then divide it by 2.

The issue I missed and can easily be missed is the fact that the Windows calculator actually truncates number without letting you know. The above number pasted in the calculator will now be "34028236692093846346337460743176" . Once you divide it by 2 you will actually get a number that is very small compare to range of the parent shard, and then your records will not be distributed, they will go to the shard that got the bigger portion of the range.

Once you take the number above to calculator adapted for big numbers you will get right the middle of the range. I used this to calculate the range : https://defuse.ca/big-number-calculator.htm .

After this change, the records are perfectly distributed and the system scales nicely.