Google Cloud Datastore: storing only unique entities

Question

I'm trying to learn NoSQL from Google Datastore, but I'm having a problem with uniqueness.

Consider an e-commerce store that has categories and products.

You don't want two products with the same SKU in the database.

So, I insert an object with JSON:

{"sku": 1234, "product_name": "Test product"}

And it appears with two fields. But then I can do it again and I have two or more of the same product.

How do you avoid this? Can you make the sku field unique?

Do I need to query before inserting?

The same thing happens with categories. Should I just use one entity for ALL my categories and account for it in my JSON?

What's good practice here?

database google-app-engine transactions google-cloud-datastore google-cloud-platform

John May 11 '17 at 20:33

3 answers

Dan McGrath · Answer 1 · 2017-05-11T21:05:43+0000

Create a new kind called "sku". When you create a new product, do a transactional insert of both the product entity and the sku entity.

For example, let's say you want to add a new product with the kind name product and ID abc:

"product/abc" = {"sku": 1234, "product_name": "Test product"}

To ensure that the "sku" property is unique, always insert an entity with the kind name sku and an ID equal to the property's value:

"sku/1234" = {"created": "2017-05-11"}

In the example above, the entity has a property for the creation date. It is optional; I only added it as part of the example.

Now, as long as you insert both of them as part of the same transaction, you ensure that the "sku" property has a unique value. This works because:

1. An insert fails if the sku entity for that number already exists.
2. The transaction makes the write of the product entity (with its sku value) and the write of the sku entity atomic. If the sku is not unique, the sku entity write fails, so the product entity is not written either.
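
The two points above can be sketched in plain Python. This is a toy in-memory stand-in, not the Datastore API: `ToyStore`, `insert_all`, `add_product`, and `AlreadyExists` are hypothetical names made up for this illustration, and a real implementation would use the transaction and insert operations of your Datastore client library.

```python
class AlreadyExists(Exception):
    """Raised when an insert targets a key that is already present."""


class ToyStore(object):
    """In-memory stand-in for a keyed store. NOT the Datastore API."""

    def __init__(self):
        self.entities = {}

    def insert_all(self, writes):
        """Apply all inserts atomically: every key must be new, or nothing is written."""
        for key in writes:
            if key in self.entities:
                raise AlreadyExists(key)
        self.entities.update(writes)


def add_product(store, product_id, sku, product_name):
    """Insert the product entity and its sku marker entity in one all-or-nothing step."""
    store.insert_all({
        ("product", product_id): {"sku": sku, "product_name": product_name},
        ("sku", sku): {"created": "2017-05-11"},
    })


store = ToyStore()
add_product(store, "abc", 1234, "Test product")

try:
    # Different product ID, same SKU: the sku insert collides.
    add_product(store, "xyz", 1234, "Duplicate SKU")
    duplicate_rejected = False
except AlreadyExists:
    duplicate_rejected = True
```

Because the collision on `("sku", 1234)` aborts the whole step, the second product is never stored, which is the behaviour the transaction is meant to give you.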

Andrei Volgin · Answer 2 · 2017-05-11T21:26:40+0000

You can use "sku" as the "id" (if it's a number) or the "name" (if it's a string) of your entity, instead of storing "sku" as a property. That guarantees uniqueness, because it becomes part of the entity's unique key.
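
A minimal sketch of the idea in plain Python, with a dictionary standing in for the datastore. The names `products`, `put`, and `insert_if_absent` are hypothetical and are not part of any Datastore client library. It also shows a caveat: when the SKU is the key, a plain put overwrites the existing entity rather than creating a duplicate, so you still need an insert or an existence check if a second write should be rejected.

```python
# Toy stand-in for the "product" kind: one slot per key, keyed by SKU.
products = {}


def put(sku, properties):
    """Upsert: writing to an existing key replaces the stored entity."""
    products[sku] = dict(properties)


def insert_if_absent(sku, properties):
    """Write only when the key is free; report whether the write happened."""
    if sku in products:
        return False
    products[sku] = dict(properties)
    return True


put(1234, {"product_name": "Test product"})
put(1234, {"product_name": "Test product"})  # same key, so still one entity
created = insert_if_absent(1234, {"product_name": "Another product"})
```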

Askar · Answer 3 · 2017-05-12T19:54:43+0000

Data modeling is a big topic, but IMO there are two approaches you can choose from. This answer is more about fundamentals than about the specifics of your question, but it should give you some ideas.

The first approach is to store the reference as a property

Think of it as a product that contains product variants.

This approach is similar to the RDBMS world. You create products separately, and each product variant holds a reference to its product, much like a foreign key in a relational database. In other words, the product variant entities get a new property that references the product they belong to, and that property contains the key of an entity of kind Product. If this sounds confusing, here is how it breaks down. I'll use Python as an example:

# Assumes the App Engine standard (Python 2) runtime, where ndb lives here.
from google.appengine.ext import ndb

# product model
class Product(ndb.Model):
    name = ndb.StringProperty()

# product variant model
class ProductVariant(ndb.Model):
    name = ndb.StringProperty()
    price = ndb.IntegerProperty()
    # key of the Product this variant belongs to
    product = ndb.KeyProperty(kind=Product)

hugoboss = Product(name="Hugo Boss", key=ndb.Key(Product, 'hugoboss'))
gap = Product(name="Gap", key=ndb.Key(Product, 'gap'))

# store the products themselves
hugoboss.put()
gap.put()

pants1 = ProductVariant(name="Black pants", price=300, product=hugoboss.key)
pants2 = ProductVariant(name="Grey pants", price=200, product=hugoboss.key)
tshirt = ProductVariant(name="White graphic tshirt", price=10, product=gap.key)

pants1.put()
pants2.put()
tshirt.put()

# give me all variants whose product is hugoboss
for pants in ProductVariant.query(ProductVariant.product == hugoboss.key).fetch(10):
    print(pants.name)

# You should get something like:
# Black pants
# Grey pants

The second approach is to put the product in the key

To take full advantage of this, you need to know that Datastore is built on top of Bigtable, and how data is laid out around keys there. If you want to dive deep, there is a great paper: Bigtable: A Distributed Storage System for Structured Data

# Assumes the App Engine standard (Python 2) runtime, where ndb lives here.
from google.appengine.ext import ndb

# product model
class Product(ndb.Model):
    name = ndb.StringProperty()

# product variant model
class ProductVariant(ndb.Model):
    name = ndb.StringProperty()
    price = ndb.IntegerProperty()

hugoboss = ndb.Key(Product, 'hugoboss')
gap = ndb.Key(Product, 'gap')

Product(name="Hugo Boss", key=hugoboss).put()
Product(name="Gap", key=gap).put()

# the product key becomes the parent (ancestor) of each variant's key
pants1 = ProductVariant(name="Black pants", price=300, parent=hugoboss)
pants2 = ProductVariant(name="Grey pants", price=200, parent=hugoboss)
tshirt = ProductVariant(name="White graphic tshirt", price=10, parent=gap)

pants1.put()
pants2.put()
tshirt.put()

# give me all variants whose ancestor is hugoboss
for pants in ProductVariant.query(ancestor=hugoboss).fetch(10):
    print(pants.name)

# You should get something like:
# Black pants
# Grey pants
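
One way to picture why putting the product in the key helps is a deliberately simplified model, not a description of how Datastore is actually implemented. Assume entities are stored sorted by their full key path, so all children of a parent sit next to each other and an ancestor lookup is a scan over one contiguous range. The names `rows` and `descendants` are hypothetical.

```python
from bisect import bisect_left

# Toy model: each key is a path of (kind, id) pairs, and rows are kept sorted.
rows = sorted([
    (("Product", "gap"),),
    (("Product", "gap"), ("ProductVariant", 3)),
    (("Product", "hugoboss"),),
    (("Product", "hugoboss"), ("ProductVariant", 1)),
    (("Product", "hugoboss"), ("ProductVariant", 2)),
])


def descendants(rows, ancestor):
    """Return the keys strictly below `ancestor` by scanning one contiguous range."""
    start = bisect_left(rows, ancestor)
    found = []
    for key in rows[start:]:
        if key[:len(ancestor)] != ancestor:
            break  # left the ancestor's range, so stop scanning
        if len(key) > len(ancestor):
            found.append(key)
    return found


hugoboss = (("Product", "hugoboss"),)
variants = descendants(rows, hugoboss)
```

In this model the lookup never touches the rows under `gap`, which is the intuition behind keeping related entities under a common parent key.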

The second approach is very powerful! Hope this helps.
