We are about to implement the Read portion of our CQRS system in-house, with the goal of vastly improving our read performance. Currently our reads are conducted through a web service which runs a LINQ to SQL query against normalized data in a SQL Azure database, involving some degree of deserialization.
The simplified structure of our data is:
- User
- Conversation (Grouping of Messages to the same recipients)
- Message
- Recipients (Set of Users)
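For reference, that structure might be sketched as plain data classes (names are hypothetical and Python is used only for illustration):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Message:
    message_id: str
    sender_id: str
    body: str

@dataclass
class Conversation:
    # A conversation groups messages sent to the same set of recipients
    conversation_id: str
    recipient_ids: List[str] = field(default_factory=list)  # the Users in the conversation
    messages: List[Message] = field(default_factory=list)
```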
I want to move this into a denormalized state, so that when a user requests a feed of messages, it is read from EITHER:
A denormalized representation held in Azure Table Storage
- UserID as the PartitionKey
- ConversationID as the RowKey
- Any volatile data prone to change stored as separate properties
- The messages serialized as JSON in a single property
- The recipients of those messages serialized as JSON in another property
- The main problem with this is the limited size of an entity in Table Storage (1 MB, with each string property capped at 64 KB)
- Also, any queries on the "volatile data" properties will be slow, as they aren't part of the key
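A minimal sketch of what one such entity might look like, using a plain Python dict rather than a real Table Storage client (the property names and the size guard are illustrative assumptions):

```python
import json

# Table Storage caps each string property at 64 KB; strings are stored
# as UTF-16, so that is roughly 32K characters of JSON per property.
MAX_PROPERTY_BYTES = 64 * 1024

def build_feed_entity(user_id, conversation_id, messages, recipients, last_updated):
    """Build one hypothetical denormalized feed entity for a user/conversation pair."""
    entity = {
        "PartitionKey": user_id,        # all of a user's feed lives in one partition
        "RowKey": conversation_id,      # one entity per conversation
        "LastUpdated": last_updated,    # volatile data kept as its own property
        "MessagesJson": json.dumps(messages),
        "RecipientsJson": json.dumps(recipients),
    }
    # Guard against the per-property size cap before attempting a write.
    for key in ("MessagesJson", "RecipientsJson"):
        if len(entity[key].encode("utf-16-le")) > MAX_PROPERTY_BYTES:
            raise ValueError(f"{key} exceeds the 64 KB property limit")
    return entity
```

The whole feed for a user is then a single partition scan, and one conversation is a single point query on PartitionKey + RowKey.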
A normalized representation held in Azure Table Storage
- Separate tables for Conversation details, Messages, and Recipients
- Partition keys for the Messages and Recipients tables stored on the Conversation table
- Bar that, this follows the same structure as above
- Gets around the maximum entity size issue
- But will the normalized state reduce the performance gains of a denormalized table?
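To get a feel for that read path, here is a sketch of the three-lookup flow, with in-memory dicts standing in for the three tables (all names hypothetical). Since the Conversation row carries the partition keys for the other two tables, those two lookups can at least run in parallel:

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-ins for the three tables; real code would issue
# point queries (PartitionKey + RowKey) against Table Storage.
conversations = {("user1", "conv1"): {"MessagesPK": "conv1-msgs", "RecipientsPK": "conv1-rcpt"}}
messages = {"conv1-msgs": [{"body": "hello"}]}
recipients = {"conv1-rcpt": ["user2", "user3"]}

def read_feed_entry(user_id, conversation_id):
    # First round trip: fetch the conversation row, which holds the
    # partition keys for the messages and recipients lookups.
    conv = conversations[(user_id, conversation_id)]
    # Second round trip: the remaining two lookups are independent,
    # so issue them concurrently instead of sequentially.
    with ThreadPoolExecutor(max_workers=2) as pool:
        msg_future = pool.submit(messages.__getitem__, conv["MessagesPK"])
        rcpt_future = pool.submit(recipients.__getitem__, conv["RecipientsPK"])
        return {"messages": msg_future.result(), "recipients": rcpt_future.result()}
```

So the cost over the fully denormalized design is one extra round trip per conversation, not two, provided the second and third queries are parallelized.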
OR
A denormalized representation held in SQL Azure
- UserID & ConversationID held as a composite primary key
- Any volatile data prone to change stored in separate columns
- The messages serialized as JSON in a column
- The recipients of said messages serialized as JSON in a column
- Greatest flexibility for indexing and the structure of the denormalized data
- Much slower performance than Table Storage queries
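A rough sketch of that table, using SQLite locally as a stand-in for SQL Azure (the schema and column names are assumptions, not our actual design):

```python
import json
import sqlite3

# SQLite used purely to illustrate the shape of the SQL Azure table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE UserFeed (
        UserID         TEXT NOT NULL,
        ConversationID TEXT NOT NULL,
        LastUpdated    TEXT,            -- volatile data in its own column
        MessagesJson   TEXT,            -- messages serialized as JSON
        RecipientsJson TEXT,            -- recipients serialized as JSON
        PRIMARY KEY (UserID, ConversationID)
    )
""")
conn.execute(
    "INSERT INTO UserFeed VALUES (?, ?, ?, ?, ?)",
    ("user1", "conv1", "2012-01-01T00:00:00Z",
     json.dumps([{"body": "hello"}]), json.dumps(["user2"])),
)
# A secondary index on the volatile column is possible here,
# unlike the non-key properties in Table Storage.
conn.execute("CREATE INDEX IX_UserFeed_LastUpdated ON UserFeed (UserID, LastUpdated)")
row = conn.execute(
    "SELECT MessagesJson FROM UserFeed WHERE UserID = ? AND ConversationID = ?",
    ("user1", "conv1"),
).fetchone()
```

The secondary index is the main thing this option buys over Table Storage: queries like "conversations for this user updated since X" stay cheap.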
What I'm asking is: has anyone any experience implementing a denormalized structure in Table Storage or SQL Azure? Which would you choose, or is there a better approach I've missed?
My gut says the (at least partially) normalized data in Table Storage would be the way to go; however, I am worried that conducting three queries to grab all the data for a user will reduce the performance gains.