DocumentDB PartitionKey and performance
I have a scenario where I store a large amount of third-party data for ad-hoc analysis by business users. Most queries will be complex, using multiple self-joins, projections, and ranges.
When it comes to choosing a PartitionKey for DocumentDB in Azure, I see people advise using a logical separator such as TenantId, DeviceId, etc. Others advise basing the PartitionKey on some kind of GUID or large integer, so that large reads are heavily parallelized.
With this in mind, I developed a test with two collections:
- test-col-1: PartitionKey is TenantId, with about 100 possible values
- test-col-2: PartitionKey is a unique value assigned by a third party, following the pattern "AB1234568". Guaranteed to be globally unique by the third party.
Both collections are provisioned at 100,000 RU.
In my experiment, I loaded both collections with approximately 2,000 documents. Each document is approximately 20 KB in size and heavily denormalized. Each document is an order that contains several jobs, each of which contains users, prices, etc.
Example query:
SELECT
orders.Attributes.OrderNumber,
orders.Attributes.OpenedStamp,
jobs.SubOrderNumber,
jobs.LaborTotal.Amount As LaborTotal,
jobs.LaborActualHours As LaborHours,
jobs.PartsTotal.Amount As PartsTotal,
jobs.JobNumber,
jobs.Tech.Number As TechNumber,
orders.Attributes.OrderPerson.Number As OrderPersonNumber,
jobs.Status
FROM orders
JOIN jobs IN orders.Attributes.Jobs
JOIN tech IN jobs.Techs
WHERE orders.TenantId = @TenantId
AND orders.Attributes.Type = 1
AND orders.Attributes.Status IN (4, 5)
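For context, a query like the one above can be executed with the v1.x .NET SDK roughly as follows. This is a minimal sketch, not my exact test harness; the endpoint, key, database name, and tenant value are placeholders, and the query text is abbreviated:

```csharp
using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

// Placeholder endpoint/key/database -- substitute your own.
var client = new DocumentClient(
    new Uri("https://myaccount.documents.azure.com:443/"), "<auth-key>");

// Parameterized query; abbreviated version of the query shown above.
var querySpec = new SqlQuerySpec(
    "SELECT orders.Attributes.OrderNumber, jobs.JobNumber, jobs.Status " +
    "FROM orders JOIN jobs IN orders.Attributes.Jobs " +
    "WHERE orders.TenantId = @TenantId AND orders.Attributes.Type = 1",
    new SqlParameterCollection { new SqlParameter("@TenantId", "tenant-42") });

var query = client.CreateDocumentQuery<dynamic>(
        UriFactory.CreateDocumentCollectionUri("mydb", "test-col-1"),
        querySpec,
        // Single-partition query: pin it to one partition key value.
        new FeedOptions { PartitionKey = new PartitionKey("tenant-42") })
    .AsDocumentQuery();

// Drain all result pages (run inside an async method).
while (query.HasMoreResults)
{
    foreach (var row in await query.ExecuteNextAsync<dynamic>())
        Console.WriteLine(row);
}
```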
In my testing, I adjusted the following settings:
- The default ConnectionPolicy
- The recommended ConnectionPolicy (ConnectionMode.Direct, Protocol.Tcp)
- Different values of MaxDegreeOfParallelism
- Various values of MaxBufferedItemCount

The collection with the GUID PartitionKey was queried with EnableCrossPartitionQuery = true. I am using C# and .NET SDK v1.14.0.
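The settings above map onto the v1.x SDK like this. A minimal sketch, assuming placeholder `endpoint`/`authKey` values; the numeric values are the ones from my tests, not recommendations:

```csharp
using System;
using Microsoft.Azure.Documents.Client;

// Recommended connection policy: Direct connection mode over TCP.
var connectionPolicy = new ConnectionPolicy
{
    ConnectionMode = ConnectionMode.Direct,
    ConnectionProtocol = Protocol.Tcp
};
var client = new DocumentClient(new Uri(endpoint), authKey, connectionPolicy);

// Feed options for querying the GUID-keyed collection across partitions.
var feedOptions = new FeedOptions
{
    EnableCrossPartitionQuery = true, // required when no single partition key is given
    MaxDegreeOfParallelism = 2000,    // client-side concurrent partition requests
    MaxBufferedItemCount = 2000       // items pre-fetched ahead of consumption
};
```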
In my initial tests with the default settings, I found that querying the collection with the TenantId PartitionKey was faster, taking an average of 3,765 ms, compared to 4,680 ms for the GUID collection.
When I set the ConnectionPolicy to ConnectionMode.Direct with Protocol.Tcp, the TenantId collection's query time decreased by almost 1,000 ms to an average of 2,865 ms, while the GUID collection's increased by about 800 ms to an average of 5,492 ms.
Things got interesting when I started playing with MaxDegreeOfParallelism and MaxBufferedItemCount. Query times for the TenantId collection were mostly unaffected because the query was not cross-partition; however, the GUID collection sped up significantly, reaching times as low as 450 ms (MaxDegreeOfParallelism = 2000, MaxBufferedItemCount = 2000).
Given these observations, why would you not want to make the PartitionKey as granular as possible?
"Things got interesting when I started playing with MaxDegreeOfParallelism and MaxBufferedItemCount. Query times for the TenantId collection were mostly unaffected because the query was not cross-partition; however, the GUID collection sped up significantly, reaching times as low as 450 ms (MaxDegreeOfParallelism = 2000, MaxBufferedItemCount = 2000)."
MaxDegreeOfParallelism sets the maximum number of concurrent tasks, like the same-named property on a ParallelOptions instance. As far as I know, this is client-side parallelism, and it will cost you the CPU/memory resources you have on your side.
"Given these observations, why would you not want to make the PartitionKey as granular as possible?"
For write operations, we can scale out across all partition keys to use the throughput you have provisioned. For read operations, by contrast, we want to minimize cross-partition lookups for lower latency.
Also, as mentioned in this official doc:
Partition key selection is an important decision that you must make at design time. You must pick a property name that has a wide range of values and even access patterns.
It is a best practice to have a partition key with many distinct values (100s-1000s at a minimum).
To achieve the full throughput of the container, you must choose a partition key that distributes requests evenly across the distinct partition key values.
For more details, you can refer to How to partition and scale in Azure Cosmos DB and this Channel 9 tutorial: Azure DocumentDB Elastic Scale - Partitioning.
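Whichever property you choose, the partition key is fixed at collection creation time in the v1.x SDK. A minimal sketch, assuming a placeholder database name "mydb" and an existing `client`, showing how a collection like test-col-1 could be created with TenantId as the key and the 100,000 RU throughput from the experiment:

```csharp
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Define the collection with /TenantId as the partition key path.
var collection = new DocumentCollection { Id = "test-col-1" };
collection.PartitionKey.Paths.Add("/TenantId");

// Provision at 100,000 RU (run inside an async method).
await client.CreateDocumentCollectionAsync(
    UriFactory.CreateDatabaseUri("mydb"),
    collection,
    new RequestOptions { OfferThroughput = 100000 });
```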