DynamoDB database design (key-value store, NoSQL)
I am used to MySQL and am now trying to figure out how to use key-value stores. What I haven't seen is a good beginner-level example of database design and of how you insert and retrieve information.
Is this the correct idea for how you store data from MySQL in a key value store?
TYPE: MySQL
TABLE: users
COLUMNS: user_id (primary), username, location

TYPE: Key Value Store
TABLE: users
KEY: user_id
VALUES: username, location

If I am correct above, pulling out general information about a user is easy enough to understand. But how would I execute the following query on the key-value store?
SELECT username FROM users WHERE location = 'mexico'
The way I thought you could do it easily is to create another table (assuming more than 5000 users; I'm sure there are other ways to do this if you only have a couple hundred).
--Original Table--
TYPE: Key Value Store
TABLE: users
KEY: user_id
VALUES: username, location

--Additional "query" Table--
TYPE: Key Value Store
TABLE: user-location
KEY: location
VALUES: user_id
However, now we need to write to two tables whenever someone joins, updates their location, etc. This is not a huge deal; I suppose you just have to be very careful in your application code.
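To make the two-table idea concrete, here is a minimal sketch that uses plain Python dicts as stand-ins for the two key-value tables. Nothing here is a real DynamoDB call; the names put_user and usernames_in are made up for illustration. It only shows the bookkeeping the application would have to do on every write.

```python
# Plain dicts stand in for the two key-value tables (not real DynamoDB calls).
users = {}          # user_id -> {"username": ..., "location": ...}
user_location = {}  # location -> set of user_ids


def put_user(user_id, username, location):
    """Insert or update a user and keep the location table in step."""
    old = users.get(user_id)
    if old is not None and old["location"] != location:
        # Remove the stale entry from the old location first.
        user_location[old["location"]].discard(user_id)
    users[user_id] = {"username": username, "location": location}
    user_location.setdefault(location, set()).add(user_id)


def usernames_in(location):
    """Rough equivalent of: SELECT username FROM users WHERE location = ?"""
    ids = user_location.get(location, ())
    return sorted(users[uid]["username"] for uid in ids)


put_user(1, "ana", "mexico")
put_user(2, "bob", "canada")
put_user(2, "bob", "mexico")   # bob moves; both tables are updated
print(usernames_in("mexico"))  # ['ana', 'bob']
print(usernames_in("canada"))  # []
```

In a real store the two writes are separate requests, so the application also has to decide what happens if the second one fails.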
Is this the best way to solve these problems? Or am I missing something?
Updated answer (Jan-2014)
DynamoDB now supports Global Secondary Indexes, which means you can put an index on location and quickly get only the users who live in Mexico.
Please note that at the time of writing (this may change) you cannot add indexes to existing tables.
Original answer (Mar-2013)
Notes on NoSQL in general:
NoSQL DBMSs usually focus on scalability.
They also usually add application overhead in terms of more server side code.
You have to ask yourself, "How many times will I need to query users from Mexico?" The answer is likely to guide you to the right approach when modeling your database.
This is also the reason that there are no "perfect fits" and no real "noob patterns" (at least to my knowledge).
Now, looking at DynamoDB in particular, you don't have the luxury of secondary indexes (unlike other NoSQL solutions that do have them), so you need to create index tables yourself. In your model, you can create a table where the hash key is the location and the range key is the user ID. Then a Query API request on the hash key returns all the users in Mexico.
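As a rough illustration of the hash key plus range key layout, the sketch below models the index table in plain Python. It is not the DynamoDB API; put_index_item and query are hypothetical names, and the dict only mimics how items that share a hash key are grouped together and ordered by range key.

```python
from bisect import insort

# location (hash key) -> sorted list of user_ids (range keys).
# A plain dict models the index table; this is not the DynamoDB API.
index_table = {}


def put_index_item(location, user_id):
    """Store one (hash key, range key) pair; duplicates are ignored."""
    keys = index_table.setdefault(location, [])
    if user_id not in keys:
        insort(keys, user_id)  # keep range keys in sorted order


def query(location):
    """Model of a Query on the hash key: touches only one location's items."""
    return list(index_table.get(location, []))


for loc, uid in [("mexico", 42), ("canada", 7), ("mexico", 3), ("mexico", 42)]:
    put_index_item(loc, uid)

print(query("mexico"))  # [3, 42]
```

The point of the layout is that a lookup by location never has to read items stored under any other location.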
You can also think of other implementations, such as storing the IDs concatenated into a single object, but since DynamoDB only allows 64KB objects, you are likely to run into a scaling issue here.
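A quick back-of-the-envelope check shows how soon the concatenated-ID approach hits that ceiling. This sketch assumes fixed-length IDs made of single-byte characters and joined by a one-byte separator. The function name and the overhead_bytes parameter are illustrative, and the real item size also counts attribute names, so treat the result as an upper bound.

```python
ITEM_LIMIT_BYTES = 64 * 1024  # the 64KB limit quoted above


def max_ids_per_item(id_length, separator_length=1, overhead_bytes=0):
    """How many fixed-length IDs fit in one item when joined by a separator."""
    usable = ITEM_LIMIT_BYTES - overhead_bytes
    # n IDs need n * id_length + (n - 1) * separator_length bytes.
    return (usable + separator_length) // (id_length + separator_length)


print(max_ids_per_item(10))  # 5957 ten-character IDs
print(max_ids_per_item(36))  # 1771 UUID-style IDs
```

So a single popular location with a few thousand users would already overflow one object.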
If your design is such that you end up doing a lot of location-based searches, then you should redesign the user table with location as the hash key and userId as the range key. But this design removes the ability to query users by their name or by user ID alone, and when a new user is inserted, the uniqueness of the userId cannot be checked (unlike with a primary key in MySQL).
Now, if you don't search by location very often, then performing a Scan operation may be the best solution.
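For comparison, a scan can be pictured as the loop below, again with a plain dict standing in for the users table (scan_by_location is a hypothetical helper, not a DynamoDB call). It needs no extra table, but it reads every item on every search.

```python
# Sample data; a plain dict stands in for the users table.
users = {
    1: {"username": "ana", "location": "mexico"},
    2: {"username": "bob", "location": "canada"},
    3: {"username": "eve", "location": "mexico"},
}


def scan_by_location(table, location):
    """Read every item and keep the matches; cost grows with table size."""
    items_read = 0
    matches = []
    for item in table.values():
        items_read += 1
        if item["location"] == location:
            matches.append(item["username"])
    return matches, items_read


print(scan_by_location(users, "mexico"))  # (['ana', 'eve'], 3)
```

Two matches cost three reads here; with a large table and rare searches that trade-off can still be acceptable.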
The best approach would be, as you mentioned, to handle all of this at the API (application) level, based on your needs.