A join between two many-to-many tables

Question

A join between two many-to-many tables

I have some troubles in determining the appropriate database design for my Java based web application. I have 3 tables (tag, DT and property) and my scenario looks like this. Each tag can contain many Dts, and each DT can be assigned to different Tags, each DT has many properties, and each property can be used in many DTs. When a DT is assigned to a tag, the user can set values for all of their properties.

My question is how can I define a relationship so that each tag can have its own values on the DT that is assigned to it. what are the missing tables and relationships in my design. Then I have to create relative entities based on this db.

Tag1==> DT1 ==> Initial values1, property values set 1 ; 
Tag2==> DT1 ==> Initial values1, property values set 2 ;

Here is the db diagram. And thanks for your help in advance.

enter image description here

Here are some sample data

enter image description here

+3

java jpa database-design surrogate-key

sina 09 oct. 14 at 11:49

source to share

2 answers

The DB design is totally fine.

Just use the Tag_DT table and DT_Property.

Below is an example request

select a.id_tag,b.id_Property
from 
tag_dt a,dt_property b
where a.id_dt = b.id_dt;

+1

Nishant modi 09 oct. 14 at 12:42

source to share

vyegorov · Accepted Answer · 2014-10-10T21:32:24+0000

The central problem with your design, as I see it, is the use of surrogate keys. It is not necessary to always create numeric single column keys for tables. And even if you do this, it does not guarantee that you will not have duplicates. This essentially forces the system to keep more indexes on your tables, which is additional work to be done.

Some concepts:
- I use a single value for the table names because each tuple in the table represents one relation object;
- I use lowercase letters for table and column names (identifiers) and uppercase letters for keywords. I don't like CamelCase in databases;
- I am using <table_name>_id
  
  PK for columns. This is also possible because of the singularity of the table names;
- I am using prefix + table name + details for all constraints and indexes I create.
You don't have to follow them, but it would be nice to stick to consistent naming patterns for your designs.
I would start with a dictionary property_type

:
```
CREATE TABLE property_type (
    property_type   varchar(20) NOT NULL,
    CONSTRAINT p_property_type PRIMARY KEY (property_type)
);

      

        
        
        
      

    
```
It is a single column table that only exists to provide the domain of a possible value for property types. I used varchar(20)

, text column is pretty good. The advantage of this is that you don't need to join this table with numeric keys to get the value property_type_id=123

.

Properties:

CREATE TABLE property (
    property_id     integer     NOT NULL,
    property_name   varchar(50) NOT NULL,
    property_type   varchar(20) NOT NULL,
    CONSTRAINT p_property PRIMARY KEY (property_id),
    CONSTRAINT u_property_name UNIQUE (property_name),
    CONSTRAINT f_property_type FOREIGN KEY (property_type)
      REFERENCES property_type ON UPDATE CASCADE
);

I decided to go with a numeric PK here, as I am assuming that you will want to rename properties at some point. And if you change property_type

, the updates will be cascaded.

Despite the fact that a PK already exists here, the restriction UNIQUE

on names is mandatory here, otherwise you risk being in a situation where you have the same named properties with different identifiers.

DT table:

CREATE TABLE dt (
    dt_id       integer     NOT NULL,
    dt_name     varchar(50) NOT NULL,
    CONSTRAINT p_dt PRIMARY KEY (dt_id),
    CONSTRAINT u_dt_name UNIQUE (dt_name)
);

Again, a simple PK isn't enough by creating a constraint as well UNIQUE

. Although I would very completely get rid of dt_id

and leave only dt_name

and do it PK.

Properties for DT:

CREATE TABLE dt_property (
    dt_id           integer     NOT NULL,
    property_id     integer     NOT NULL,
    initial_value   varchar(50) NOT NULL,
    CONSTRAINT p_dt_property PRIMARY KEY (dt_id, property_id),
    CONSTRAINT f_dt_id FOREIGN KEY (dt_id) REFERENCES dt,
    CONSTRAINT f_property_id FOREIGN KEY (property_id) REFERENCES property
);

This is the first big difference from your design - a composite key is used.

Yes, this means that you will need to execute 2 columns whenever you want to refer to the entry in this table. But it doesn't matter, in fact - you design table once, you also write your queries once, but your software can be used for quite a long time if executed correctly and easily maintained. Better to spend a little time entering queries and maintain systems easily in the long run.

Tags:

CREATE TABLE tag (
    tag_id      integer     NOT NULL,
    tag_name    varchar(50) NOT NULL,
    CONSTRAINT p_tag PRIMARY KEY (tag_id),
    CONSTRAINT u_tag_name UNIQUE (tag_name)
);

It's just another dictionary. Again, just like a table dt

, I would really like to avoid using a column tag_id

and just tag_name

make it PK as well.

The new table tag_dt

presents:

CREATE TABLE tag_dt (
    tag_id      integer     NOT NULL,
    dt_id       integer     NOT NULL,
    CONSTRAINT p_tag_dt PRIMARY KEY (tag_id, dt_id),
    CONSTRAINT f_tag_id FOREIGN KEY (tag_id) REFERENCES tag,
    CONSTRAINT f_dt_id FOREIGN KEY (dt_id) REFERENCES dt
);

This table is required to create dt

+ relationships tag

. Without this, you have data duplication - you can see this in your diagram, you have 2 lines with Tag_name='Tag1'

.

Finally, the properties of the tag:

CREATE TABLE tag_property (
    tag_id      integer     NOT NULL,
    dt_id       integer     NOT NULL,
    property_id integer     NOT NULL,
    a_value     varchar(50) NOT NULL,
    CONSTRAINT p_tag_property PRIMARY KEY (tag_id, dt_id),
    CONSTRAINT u_tag_property UNIQUE (tag_id, property_id),
    CONSTRAINT f_tag_property_tag FOREIGN KEY (tag_id, dt_id) REFERENCES tag_dt,
    CONSTRAINT f_tag_property_property FOREIGN KEY (dt_id, property_id)
      REFERENCES dt_property
);

This table contains complete composite keys and meets all your requirements. Primary key tag_id, dt_id

as well as foreign key to the table tag_dt

so you want to represent something that hasn't been previously defined. It tag_id, property_id

is then unique, so the properties of the tag cannot be duplicated. Finally, it dt_id, property_id

refers to a table dt_property

, which means that only properties that are allowed to do so will be registered dt

.

Final Notes

All primary and unique keys are implemented through indexes in most DBMSs. In addition, some DBMSs may use composite (multi-column) indexes even if the first key column is omitted. At least PostgreSQL can do this, which is what I use the most.

Please take a look at the tables dt

and tag

I highly recommend getting rid of the key surrogate in them like property_type

.

I have not created any additional indexes, generally I do this exercise after the Data Model is implemented and some real queries are made against it.

Also, don't use value

or name

for column names. These are reserved words, and you may have unexpected consequences in future incarnations of your DBMS of choice.

A join between two many-to-many tables

Final Notes

More articles: