Couchbase: controlling which node / bucket receives data

Question

My understanding of the Couchbase clustering approach is that it ensures every node in the cluster receives an equal share of the data. My question is: is there a way to change this and define a custom key to "sensibly" route a document to a specific bucket in the cluster?

In my scenario, all of my data relates to a specific object hierarchy (think client-project-task-item). I will have enough items to need horizontal scaling; however, each search will always be scoped to a given client-project, for which the dataset is only of moderate size.

I think the most efficient approach would be to split my data by client-project-task and pre-allocate, say, 1000 sections.

I realize this will limit my scalability at some point, but not having to hit every section for every search makes it a trade-off I'm willing to accept.
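To make the idea concrete, here is a minimal sketch of the kind of splitting logic I have in mind. It is plain Python and does not use any Couchbase API; the helper names `section_for` and `document_key`, the `::` separator, and the key layout are all hypothetical choices for illustration. The only point is that every item belonging to the same client-project lands in the same one of the 1000 pre-allocated sections.

```python
import zlib

NUM_SECTIONS = 1000  # pre-allocated section count from the question


def section_for(client_id: str, project_id: str,
                num_sections: int = NUM_SECTIONS) -> int:
    """Map a client-project pair to a fixed section (hypothetical helper)."""
    routing_key = f"{client_id}::{project_id}".encode("utf-8")
    # CRC32 is stable across processes, unlike Python's built-in hash().
    return zlib.crc32(routing_key) % num_sections


def document_key(client_id: str, project_id: str,
                 task_id: str, item_id: str) -> str:
    """Build a document key that carries its section as a prefix (assumed layout)."""
    section = section_for(client_id, project_id)
    return f"{section:04d}::{client_id}::{project_id}::{task_id}::{item_id}"
```

With something like this, a search for one client-project would only need to look at the single section returned by `section_for`, rather than at all 1000. Whether Couchbase lets me plug in routing of this kind is exactly what I am asking.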

So, is there a way to create this type of splitting logic in Couchbase?

Alternatively, if all my data is spread across all buckets and I define a view to query it, will each query hit all of my records to check whether each one matches?

For example, I could have a total of 400 million items, but only around 100k for a given client-project-task, so it seems like a good idea to search 100k items rather than 400 million.

Any thoughts, suggestions, or comments are welcome.

Thanks, Brent

routing partitioning sharding couchbase-lite

brent 13 Aug 14 at 12:20