EAV Model for Stock System or Another Approach?

I am developing an inventory and warehouse management system using relational databases (MySQL) and PHP. Due to the fact that warehouse products will have several characteristics (width, height, weight, measures, colors, etc.), it becomes necessary to have an approach to the memory model for storing attributes and the ability to add / edit new attributes, change product types etc. So, in the current concept, I only see 3 viable models:

  • store all attributes in one table, as separated columns and based on product type (probably category) to serve them until the end of the user to populate
  • an EAV (Entity - Attribute - Value) model that would include something like this:
    • category table containing attribute classes
    • an attribute table class that will contain separate classes with several attributes (this way we ensure that we can add an attribute class to a category without having to manually add attributes to similar categories one by one)
    • attribute table responsible for the attribute itself
    • the attribute value table in which we store the values
  • Store all common attributes in one table and create multiple tables for all category types: this model will need to change the database every time we encounter a new category type

The second model is inspired here .

After reading a lot about the EAV model, I am now doubting this model and I am a little concerned about how I would have to plug in different product attributes in orders / invoices, etc. Even form validation seems like it would be a real pain of using the EAV model, but still .. I wouldn't want to have a separate table with 100+ columns and then be prepared to add new columns whenever a new attribute is added ..

So the question will be: is there a cheaper solution? Or can the EAV model be improved?

I know this is a long and old discussion, but everyone just points to NoSQL and I only rely on RDBMS.

EDIT:

The downside to these approaches (or most of the approaches found) is that:

  • a unit of measure should probably exist for the specified attribute (the specific gravity of the attribute should have a drop with units).
  • the specified attribute must be required or not contain
  • all attributes must be validated on the submit form

So far, the only possible solution would be to create a new table for each new category and wrap all custom attributes and rules in this table. But again, this will lead to real pain when a new category is created.

EDIT 2:

The ability to use a Json column in MySQL does not solve any of the disadvantages mentioned above from my point of view. OR maybe I am wrong and I do not see a clear picture.

+3


source to share


4 answers


I understand that these are your basic requirements:

  • Flexible attributes
    • Your specific need here is unclear: it looks like you are either expecting the attributes to change, or at least expecting all attributes to not always apply to all products (i.e. sparse matrix)
  • Products are also classified, and the category will (at least in part) determine which attributes apply to the product.
  • Attributes themselves can have additional properties besides their value, which must be provided by the user (i.e. a unit that comes with a weight)
  • Input validation is required and checks things like:
    • All required attributes are present
    • Attributes that are not applicable are missing
    • Attributes have valid values
    • Custom attribute properties have valid values
  • You might also want to make sure you can search / filter attributes efficiently

These different requirements lead to different technical needs and different technical solutions. Some of them are database-specific and some of them must be addressed in code regardless of the database choice. Obviously, you are aware of some of these problems, but I think it's worth really breaking it down:

Flexible attributes

Having a list of flexible attributes (as you know) does not work well with RDBMS systems where a predefined table schema must be defined. This includes almost all SQL queries and of course MySQL. The problem is that changing the schema of a table is expensive and for large tables it can take minutes or hours, making it impossible to add attributes if you need to add a column to the table to do so.

Even if your attribute list rarely changes, a large attribute table is very inefficient if most products don't matter for most attributes (i.e. sparse matrix).

Ultimately you just won't get anywhere if your attributes are going to be stored as a column in tables. Even if you split it into one category, you will still have large empty tables that you cannot dynamically add columns to.

If you are using an RDBMS, the only option is the EAV system. After reviewing, researching, and implementing EAV systems, I wouldn't bother too much about everything you've heard about them on the Internet. I know there are many articles out there talking about the "anti-pattern" of EAV, and I am the kind of person who takes software design patterns seriously, but EAV has exactly the right time and place, and this is it. Ultimately you cannot do this on an RDBMS without EAV. You can of course have a look at a NoSQL system designed for this particular problem, but when the rest of your database is in a standard DBMS, installing or migrating to a NoSQL system just to store your attribute values ​​will almost certainly overdo it. You certainly don't want to lose alignment with the ACID that RDMBS ships with.and most NoSQL systems do not guarantee ACID compliance. There is a wave of NewSQL systems out there that are designed to get the best of both worlds, but if this is just one piece of a larger application (which I'm sure it is), it probably isn't worth exploring new technologies entirely just to this feature appeared. You might also consider using something like JSON storage in MySQL to store your attribute values. This is a viable option now that MySQL has better JSON support, but it only makes a small change for a large image: you will still need all of your other EAV tables to keep track of allowed attributes, categories, etc. This is just an attribute that you could put inside the JSON data,so the potential benefits of JSON storage are relatively small (and I have other issues that I will discuss in the future).

Thus, I would say that while the rest of your application is running on an RDBMS, it makes perfect sense to use EAV to manage flexible attributes. If you have tried to create an entire system in EAV inside an RDBMS, then you will definitely be wasting your time and I would tell you to find a good NoSQL database that matches the problem you are trying to solve. The disadvantages of EAV still apply though: you cannot easily perform consistency checks on your RDBMS system and have to do it yourself in code.

Categorized products with category attributes



You pretty much got it here. This is relatively straight forward in the EAV system. You will have an attribute table, you will have a category table, and then you will need a standard one-to-many or many-to-many relationship between the attribute and category tables, which will determine which attributes are available for which category. You obviously also have a relationship between products and categories, so you know which products are therefore needed, which attributes.

Your option # 3 is designed to fulfill this requirement, but having a table with each attribute as a column will scale very little as your system grows and will certainly break if you ever need to add attributes dynamically. You don't want to run ALTER TABLE statements on the fly, especially if you have more than a few thousand records.

Managing Attribute Properties

Store dynamic attributes and values ​​uniquely. Another issue is how to store dynamic attributes, values ​​and associated metadata (i.e., store the weight as well as the unit weight). However, this is no longer a database problem, but a code problem. As far as actually storing the information, the best option is probably to store your metadata inside your attribute value table and use some code abstraction to handle input validation as well as form creation. It can get pretty complicated pretty quickly, especially if done wrong, and speaking through such a system will take another post. However, I think you are on the right track: for a fancier attribute that requires both a value and metadata, you need to assign a class somehow,which is responsible for handling the input and validating the form. For example, for a simple text field, you have a "text" class that reads the user's value from a form and stores it in the appropriate "attribute_values" table without metadata. Then for your "weight" attribute, you will have a "weight" attribute that stores the user-specified number (ie 0.5), but then also stores the user-specified device with that number (eg "lbs"). and saved as table "attribute_values" (in pseudo-SQL):you will have a "weight" attribute that stores the user-specified number (ie 0.5), but then also stores the user-specified device with that number (eg "lbs") and is saved as table "attribute_values "(in pseudo-SQL):you will have an attribute "weight" that stores the user-specified number (ie 0.5), but then also stores the user-specified device with that number (for example, "lbs") and is saved as table "attribute_values "(in pseudo-SQL):INSERT INTO attribute_values value='0.5', meta_data='{"unit":"lbs"}', product_id=X, attribute_id=X

... Ironically, JSON would probably be a good way to store this metadata, since the metadata stored will also differ in attribute type, and I doubt you would use another layer of tables to handle this difference in your EAV tables.

Again, this is more of a code issue than a storage issue. If you decided to make JSON tables, the big picture to match this requirement hasn't changed: your "attribute type classes" would simply store metadata differently. It probably looks something like this:UPDATE products SET attributes='{"weight":0.5,"unit":"lbs"}' WHERE id=X

Input validation

This will need to be handled solely by code no matter how you store your data, so this requirement doesn't really matter in terms of defining your database structure. A class-based system as described above will also be able to handle input validation if done properly.

Sort / Search / Filter

It doesn't matter if you exclusively use your attributes for storing / retrieving data, but will you be looking for attributes at all? With a proper EAV system and good indexes, you can efficiently perform searches / sorts on the RDBMS system (although it can start to get painful if you search more than a few indexes at a time). I haven't looked in detail, but I'm pretty sure using JSON for storage won't scale well when it comes to searching. While MySQL canworking with JSON now and looking for columns directly, I seriously doubt that such a search / sort uses MySQL indexes, which means it won't work with large databases. I could be wrong about this. Before you start setting up your MySQL / JSON storage, you should get some work done if you are going to do something like this.

Depending on your needs, this is also a good place to complement your RDBMS system with a NoSQL system. After running a high volume e-commerce system (~ 1.5M Products), I found that MySQL tends to drop in the search / sort category, especiallyif you are doing any text search. In an e-commerce system, a query such as: "Show me the results that best match the term" blue truck "and have the attribute" For ages 3-5 "are common, but doing something like this in MySQL is almost impossible, primarily from -for the need to sort and score based on relevance. We solved this problem using Apache Solr (Elastic is a similar solution) and it worked very well for search / sort / search. In this case it was two database solutions. MySQL saved all actual data and stored attributes in EAV tables, and anytime something was updated, we added a record of everything in Apache Solr for additional storage.When the request came from the user, we requested Apache Solr,who was an expert in text search and also handled filtering attributes without issue, and then we would pull the complete product record from our MySQL database. The system worked great. We had 1.5 million products, thousands of custom attributes, and managed everything with one virtual server without issue. Obviously there is a lot of code going on behind the scenes, but the point is that it definitely worked and was not difficult to maintain. Never had any performance issues from MySQL or Solr.that there is a lot of code going on behind the scenes, but the point is, it definitely worked and was not difficult to maintain. Never had any performance issues from MySQL or Solr.that there is a lot of code going on behind the scenes, but the point is, it definitely worked and was not difficult to maintain. Never had any performance issues from MySQL or Solr.

+5


source


Well, this is just one approach. You could simplify this if you don't need or want all of this.



You can, for example, use a Json column in Mysql to store all the extra attributes. Another idea in the product type is to add a json column to store custom attributes and types and use that to draw the form on the screen.

+2


source


I would recommend that you start the EAV database first to understand the creation of the database and its implications.

You can follow the magenta DB structure that uses the EAV model.

EAV stands for Entity Attribute and Value Model. Let's take a close look at all the parts.

Entity: Items are presented as an entity, it can be a product or a customer or a category. Each object has a record in the database.

Attribute: They belong to another entity, for example, a Customer object has attributes like name, age, address, etc. In Magento database, all attributes are listed in one table.

Value: just attribute values, eg for the Name attribute, the value will be "Rajat".

EAV is used when you have many attributes for an object and that attribute is dynamic (add / remove). There is also a high probability that many of these attributes will be empty or null most of the time. In such a situation, the EAV framework has many advantages, mainly with optimized mysql storage

For your case. Category can also have attributes, products can also have attributes so on with customers, etc.

Let's take an example of categories. Below are the tables provided by magento:

1. catalog_category_entity
2. catalog_category_entity_datetime
3. catalog_category_entity_decimal
4. catalog_category_entity_int
5. catalog_category_entity_text
6. catalog_category_entity_varchar
7. catalog_category_flat

      

Follow this link to learn more about the table

Magento Category Tables

For attributes that are a select box. You can put the dropdown values ​​below the option values.

Follow the link to understand the magento eav framework, which will give you a clear understanding of how the EAV model works and how you can best use it.

purple tree texture

+2


source


There are three approaches if you want to stick with a relational database.

The first is best if you know in advance the attributes for all products. You have chosen one of three ways to store polymorphic data in a relational model.

It's relationally "clean" —you just use rows and columns, but each of the three options has advantages and disadvantages.

If you don't know your attributes during development, I would recommend against these solutions - they will require significant additional tools.

The next option is EAV. The advantages and disadvantages are well documented, but your focus on "validating input forms" is only one use case for data, and I think you could easily find your data to be "write only". For example, sorting / filtering becomes very difficult ("finding all products with a height of at least 12 and sorting by material_type" is almost impossible using the EAV model).

The option I prefer is a combination of relational data for core, invariant data and document-oriented (JSON / XML) for option data. MySQL can query JSON natively - so you can sort / filter by variant attributes. You should create your own validation logic, although perhaps integrating JSON Schema in data entry applications.

Using JSON Schema, you can enter concepts that "belong together" and provide search values. For example, if you have the weight of a product, your schema might say that weight should always be in a unit of measure, with kilogram, milligram, ounce, pound, and so on.

If you have a foreign key relationship in the variant data, you have a problem - for example, "manufacturer" might refer to a manufacturer table. You can either model this as an explicit column or in JSON and not use the built-in SQL foreign key tools like joins.

0


source







All Articles