Modeling Yes / No attributes in the database

I have a fairly common scenario where a user can choose from a set of attributes. The attributes in the user interface are represented by check boxes.

For example:

Components: Harddrive (y / n), CPU (y / n), monitor (y / n), keyboard (y / n), etc.

In the past, I have usually modeled this scenario like this:

"PCs" 1:M "PC Components" M:1 "Components"

      

Another option is to make the "attributes" as y / n fields in the "PC" table.

eg.

PCs (table)
-----------
PCId(PK)
Harddrive(y/n)
CPU(y/n)
etc...

      

In the past, my rationale for going with one and the other depends on whether the user can enter new attributes. If the answer is yes, then I go with the first option, if the answer is no, then I usually go with y / n attributes.

However, I now have a scenario that has about 20 attributes, divided into several categories. After creating the ERD, it just looks "wrong" and the table has an absurd number of columns.

My question is, is there a standard / correct way to model this? If so, does it have a name?

+2


source to share


2 answers


It is tempting to design a more "compact" data model (something different from a column for each attribute) because the attributes have compatible data types (they are all y / n).

If each attribute had different data types, for example if they were limited to a set of values ​​using different lookup tables, it wouldn't matter that you have to use a column for each attribute.



See Normal Form Domain Key . Modeling your y / n attributes as strings means that there is no way to represent required attributes (for which you must have either a Y value or an N value). So you will have some kind of restriction that there must be N rows for N attributes. The minimum row limit is neither a domain constraint nor a key constraint, which is why it fails on the DKNF test.

It is not necessary for every table to conform to DKNF, but if you are asking which term describes the column-per-attribute construction, I suggest that Domain / Key Normal Form should match.

+2


source


I follow this simple rule: one piece of data per column. Let the database optimize the storage structure.



In the SQL Server world so I use the BIT datatype for this and yes, separate columns. Other databases have a corresponding type, I'm sure.

+1


source







All Articles