Should the Event Hub have the same number of partitions as throughput units?

Posted 2020-06-03 04:25

Question:

For Azure Event Hubs, 1 throughput unit equals 1 MB/sec of ingress, so it can take 1000 messages of 1 KB each per second. If I select 5 or more throughput units, would I be able to ingest 5000 messages/second of 1 KB size with 4 partitions? What would the egress be in that case? I am not sure about the limit on an Event Hub partition; I read that it is also 1 MB/sec. Does that mean that, to use Event Hubs effectively, I need the same number of partitions as throughput units?

Answer 1:

Great question!

A few basics

1 throughput unit (TU from here on) means 1 MB/sec or 1000 msgs/sec of ingress, whichever is reached first. You pay for TUs, and you can change the number of TUs to match your load requirements. This is your knob on the bill. TUs are set on a given Event Hubs namespace!

So, when you buy 1 TU for an Event Hubs namespace with 10 event hubs in it, the cumulative usage of all 10 event hubs can be at most 1 TU.
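To make the "whichever is reached first" rule concrete, here is a minimal sketch of how one might estimate the number of TUs a namespace needs for a given load. The function name `required_tus` is hypothetical, and the sketch assumes 1 MB = 1,000,000 bytes so that 1000 messages of 1000 bytes fill exactly 1 TU, as in the question.

```python
import math

# Per-TU ingress limits as stated in the answer above.
TU_INGRESS_BYTES_PER_SEC = 1_000_000  # 1 MB/sec (assumption: 1 MB = 1,000,000 bytes)
TU_INGRESS_MSGS_PER_SEC = 1_000       # 1000 msgs/sec


def required_tus(msgs_per_sec: float, avg_msg_bytes: float) -> int:
    """Hypothetical helper: smallest TU count that satisfies both ingress limits."""
    by_bytes = math.ceil(msgs_per_sec * avg_msg_bytes / TU_INGRESS_BYTES_PER_SEC)
    by_count = math.ceil(msgs_per_sec / TU_INGRESS_MSGS_PER_SEC)
    # Whichever limit is hit first decides the TU count; never go below 1 TU.
    return max(1, by_bytes, by_count)


print(required_tus(5000, 1000))    # the load from the question
print(required_tus(5000, 100))     # small messages: the message count dominates
print(required_tus(200, 50_000))   # large messages: the byte rate dominates
```

The result applies to the namespace as a whole, so the load passed in should be the combined load of all event hubs in it.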

Individually, each partition is capped at 1 MB/sec or 1000 msgs/sec of ingress, whichever is reached first. You might sometimes get lucky and see more in regions where the load is low, but this cap is the only guarantee the Azure Event Hubs service offers. Consider these principles when deciding on the number of partitions in an event hub for your service:

  1. One of our intents in providing partitions is to offer high availability. If you are sending to Event Hubs and want the sends to succeed NO MATTER WHAT HAPPENS on the service, create multiple partitions and send using EventHubClient.Send (which doesn't pin the send to a particular partition).
  2. The other major factor: the number of partitions determines how fat the event pipe is and how fast and how parallel you can receive and process events. If you have 10 partitions on your event hub, its capacity is capped at 10 TUs, and you can create 10 epoch receivers in parallel to consume and process events. If you envision that the event hub you are creating now could quickly grow 10-fold, create enough partitions for that growth up front and keep the TUs matched to the current load. The analogy here is having multiple lanes on a freeway!

One more point, and the most important one to remember: TUs are configured at the namespace level. ONE Event Hubs namespace can have multiple event hubs, and each event hub can have a different number of partitions.

Answers:

If you select 5 or more TUs on the namespace and have only 1 event hub with 4 partitions, you will get a maximum of 4 MB/sec or 4K msgs/sec of ingress.

The egress maximum will be 2X the ingress (8 MB/sec or 8K msgs/sec). In other words, you could create 2 patterns of receives (slow and fast, etc.) by creating consumer groups that reflect your receive patterns. If you need more than 2X parallel receives, you will need to buy more TUs.
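The arithmetic behind these two answers can be sketched as follows. This is only an illustration of the rules stated in this answer (1 MB/sec of ingress per TU and per partition, egress at twice the ingress); `hub_limits_mb_per_sec` is a hypothetical name, not part of any Azure SDK.

```python
def hub_limits_mb_per_sec(tus: int, partitions: int) -> tuple:
    """Hypothetical helper: (max ingress, max egress) in MB/sec for a single event hub.

    Assumes the rules stated in the answer: each TU and each partition
    allows 1 MB/sec of ingress, and egress is capped at 2X the ingress.
    """
    ingress = min(tus, partitions)  # the smaller of the two caps wins
    egress = 2 * ingress
    return ingress, egress


# The scenario from the question: 5 TUs on the namespace, 1 event hub, 4 partitions.
print(hub_limits_mb_per_sec(5, 4))  # (4, 8)
```

With 4 partitions, buying a fifth TU does not raise the limit for this event hub; only adding partitions (or a second event hub) would make use of it.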

Yes, the limit for each Event Hubs partition is 1 TU (1 MB/sec or 1K msgs/sec).

Yes. Ideally, you will need more partitions than TUs. First, model your partition count as described above. Start with 1 TU while you are developing your solution. Later, when you are load testing or going live, increase the TUs to match your load. Remember, you can have multiple event hubs in a namespace. So, having 20 TUs at the namespace level and 10 event hubs with 4 partitions each can deliver 20 MB/sec across the namespace.
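As a rough sketch of the namespace-level picture, assuming the same 1 MB/sec per TU and per partition, the combined ingress is bounded by both the TUs bought and the total partition count. The function name below is hypothetical, and the sketch ignores how traffic is actually spread over the event hubs.

```python
def namespace_ingress_mb_per_sec(tus, partitions_per_hub):
    """Hypothetical helper: upper bound on combined ingress (MB/sec) for a namespace.

    tus: throughput units bought for the namespace.
    partitions_per_hub: one partition count per event hub in the namespace.
    """
    total_partitions = sum(partitions_per_hub)
    # TUs are shared by all event hubs; partitions cap what the hubs can absorb.
    return min(tus, total_partitions)


hubs = [4] * 10  # 10 event hubs with 4 partitions each, as in the example above

print(namespace_ingress_mb_per_sec(1, hubs))   # 1: development with a single TU
print(namespace_ingress_mb_per_sec(20, hubs))  # 20: the TUs are the limit, not the 40 partitions
```

Having 40 partitions behind 20 TUs is the headroom the answer recommends: the bill follows the TUs, while the partitions leave room to scale up later.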




Answer 2:

One partition maps to at most one throughput unit (TPU). Think of TPUs as processing engines. You can't take advantage of more TPUs than you have partitions: if you have 4 partitions, you can't use more than 4 TPUs.

It's typical to have more partitions than TPUs, for the following reasons:

  • You can scale the number of TPUs up if you have a lot of traffic, but you can't change the number of partitions
  • You can't have more concurrent readers than you have partitions. If you want to have 5 concurrent readers, you need 5 partitions.
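The reader limit in the second bullet can be illustrated with a toy model in which each partition is owned by exactly one reader. The round-robin assignment and the name `assign_partitions` are assumptions made for illustration only; this is not how the Event Hubs client libraries balance partitions.

```python
def assign_partitions(partition_ids, readers):
    """Hypothetical round-robin: every partition gets exactly one reader."""
    owned = {reader: [] for reader in readers}
    for index, partition_id in enumerate(partition_ids):
        owned[readers[index % len(readers)]].append(partition_id)
    return owned


# 4 partitions but 5 readers: one reader ends up with nothing to read.
owned = assign_partitions(["0", "1", "2", "3"], ["r1", "r2", "r3", "r4", "r5"])
idle = [reader for reader, parts in owned.items() if not parts]
print(idle)  # ['r5']
```

Going the other way works fine: with fewer readers than partitions, each reader simply owns several partitions.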

As for throughput, the limits are 1 MB/sec of ingress and 2 MB/sec of egress per TPU. This covers the typical scenario where each event is sent both to cold storage (e.g. a database) and to Stream Analytics or an event processor for analysis, monitoring, etc.