How to reference groups of records in relational databases

Let's assume we have the following table structures:

People

| HumanID | FirstName | LastName | Gender |
| --------- + ----------- + ---------- + -------- |
| 1 | Issac | Newton | M |
| 2 | Marie | Curie | F |
| 3 | Tim | Duncan | M |

Animals

| AmimalID | Species | NickName |
| ---------- + --------- + ---------- |
| 4 | Tiger | Ronnie |
| 5 | Dog | Snoopy |
| 6 | Dog | Bear |
| 7 | Cat | Sleepy |

I wonder how to refer to a group of records in other tables.

For example, I have a Foods table and an EatenBy column.

Products

| FoodID | FoodName | EatenBy |
| -------- + ---------- + --------- |
| 8 | Rice | ??? |

What I want to store in EatenBy maybe

  • single entry in the tables "People", "Animals" (for example, Tim Ducan).
  • a group of entries in a table (for example, all dogs, all males, all females).
  • whole table (all people, for example)

A simple solution is to use a concatenated string that includes primary keys from different tables and special strings like People, M. The application can parse the concatenated string and act accordingly.

Products

| FoodID | FoodName | EatenBy |
| -------- + ---------- + -------------- |
| 8 | Rice | Humans, 6, 7 |

I know that using a concatenated string is a bad idea from a relational database point of view.

Another option is to add another table and use a foreign key .

Products

| FoodID | FoodName |
| -------- + ---------- |
| 8 | Rice |

EatenBy

| FoodID | EatenBy |
| -------- + --------- |
| 8 | Humans |
| 8 | 6 |
| 8 | 7 |

I think this is better than the first solution. The problem is that the EatenBy field stores different values. This is problem?

What is the best way to model this requirement? How to achieve 3NF in this case?

The table examples here are a little contrived, but I run into situations like this at work.

I've seen quite a few tables just using a concatenated string. I think this is bad, but can't think of a more relational way to deal with it.


Edit:

I have made several updates to the question and the question is getting bloated and unreadable.

So I have moved the updates to the individual answers below.


Quick response

For those who want to quickly find out the answer to this problem. Here are the main ideas:

  • Concatenated strings, CSVs, repeating groups should not be used in the relational model. Therefore, we need to normalize the table, excluding duplicate groups.
  • In this task, the concatenated string contains values โ€‹โ€‹of different values.
    Relational databases store facts about the real world. Each table should store facts about ONE subject and each column should store ONE fact .
    Thus, we need one associative table for each fact to fix the "EatenBy".
    e.g.
    Food_Human (FoodID, HumanID)
    Food_Animal (FoodID, AnimalID)
    Food_Species (FoodID, SpeciesID)

Caveat

Ideas for solving the problem are good. However, the data model here is terrible.

Obvious problems:

  • People are animals.
  • Using surrogates (identifiers) during data modeling is a bad practice.

For more information, you can read the answers in detail.

+3


source to share


2 answers


  • This answer is presented in chronological order. The issue has progressed in detail, marked as Updates . There are a number of overlapping answers .

  • Progressing from the original question to the final answer is a learning experience, especially for OO / ORM types. Major heading signs Answers, minor headings mark topics.

  • The answer exceeds the maximum length. I provide them with links to overcome this.

Answer to the initial question

As it stands, your question cannot be answered. So ... this is not an answer, these are directions so you can add relevant information to your question so we can provide an answer. Please do not vote for this.

You may have seen something like this at work, but that doesn't mean it was correct or acceptable. CSV violate 2NF. You cannot easily search this field. You cannot easily update this field. You have to manage content (e.g. avoid duplication, arrange) manually using code. You don't have a database or anything like it, you have a grandiose record-keeping system where you have to write mountains of code to "process". Just like the bad old days of ISAM data processing in 1970.

  • The problem is that you seem to need a relational database. You may have heard about data integrity, relational power (the power of the connection for you at this stage), and speed. The recording system has nothing like that.

    If you need a relational database, you have to:

    • Consider data relationally and use relational database techniques such as data modeling, data modeling, and nothing but data (not data).

    • Then classify the data (without any connection to the OO class concept or classifier).

    • Then the matching of the secret data.

  • The second problem is that this is typical of OO types, they concentrate, enforce data values, not data value; how it is classified; how it relates to other data; etc.

    No doubt, you didn't think that the concept of yourself, your "teachers" fed it, I see it all the time. And they love the record-keeping systems. Note that instead of specifying tables, you are specifying that you are specifying a "structure", but you are specifying data values โ€‹โ€‹instead.

    • If you don't appreciate what I am saying, let me assure you that this is a classic problem in the OO world and the solution is easy if you apply the principles. Otherwise, it's endless OO stack mess. I recently completely ruled out an OO + solution proposed by a very famous mathematician that supports monolithic OO. This is a famous article.

    • I relationalized the data (i.e. just put the data in a Relational context: simulated and normalized it, which took a total of ten minutes) and the problem went away, no suggestion + solution was required, Read the answer from Hidders . Note that I was not trying to destroy the paper, I was trying to understand the data that was presented in schizophrenic form, and the easiest way to do this is to create a relational data model. This simple act destroyed the paper.

    • Please note that the link is an excerpt from an official report on a paid task for a client, a large Australian bank, which has kindly granted me permission to publish the extract in order to educate the public about the dangers of ignoring relational database principles, especially by OO advocates.

    • The same thing happened with the second, more famous Kohler Response paper . This answer is much less formal, the client was not paid for the work. This idiot is talking about another abnormal "normal form".

Therefore, I ask you:

  • forget about "table structures" or definitions

  • forget what you want

  • forget about implementation options

  • forget the ID

    columns, completely and completely

  • forget EatenBy

  • think of what you have in terms of data, data value, not data values โ€‹โ€‹or example data, not what you want to do with it

  • think about how this data is classified and how it can be classified.

  • how the data relates to other data. (You might think yours EatenBy

    is, but it doesn't, because the data doesn't have an organization yet to create relationships.)

If I look at my crystal ball, most of it is dark, but from the little spots of light I can see it looks like you want:

  • Things

  • Groups of things

  • Relationships Between Things and ThingGroups

Things are nouns, objects. Eventually we will be doing something in between these items, these will be verbs or statements of action. This will form the Predicates (First Order Logic). But not now, as long as we want only Things.

Now if you can change your question and tell me more about your things and what they mean, I can give you a complete data model.


Hierarchy update 1 response

Record IDs are physical, non-relational

If you need a relational database, you need relational keys, not record IDs. Additionally, running a Data Modeling exercise with an ID marked on each file will cripple the exercise.

Read this answer .

Hierarchies exist in data



If you need full discourse, ask a new question. Here's a quick summary.

Hierarchies are found naturally in the world; they are everywhere. This leads to the fact that hierarchies are implemented in many databases. The relational model was founded and is a progression of the Hierarchical model. He maintains hierarchy brilliantly. Unfortunately, famous writers do not understand RM, they only teach in pre-1970s record storage systems labeled "relational". Likewise, they don't understand hierarchy, let alone the hierarchies maintained in RM, so they suppress it.

The upshot of this is that hierarchies that have to be implemented all over the place are not recognized as such and, therefore, are implemented extremely incorrectly and massively ineffectively.

Conversely, if the hierarchy that occurs in the data that is modeled is correctly modeled and implemented using real relational constructs (relational keys, normalization, etc.), the result is an easy to use and easy to use, database code. and also free from duplication of data (in any form) and very fast. It's literally Relational at its best.

There are three types of Hierarchy in the data.

  • Hierarchy formed in sequence of tables

    This requirement, the need for Relational Keys , is found in every database , and conversely, the absence of this distorts the declaration of the database, creates a system of writing records, while there is no consistency, power or speed of the relational database.

    The hierarchy is clearly visible in the form of a relational key that progresses in the process of compilation, in any sequence of tables: father, son, grandson, etc. This is important for normal relational data integrity, a type where Hidders and 95% of the database implementations do not.

    Hidders Response has a great Hierarchy example:

    and. which exist naturally in the data

    b. that OO types are blind to [since apparently hidders]

    from. they implement RFS without consistency and then try to "fix" the problem in the feature layers, adding even more complexity.

    While I implemented the hierarchy in classic relational form and the problem disappeared completely, eliminating the proposed "solution" document. Relational theory excludes theory.

    Two hierarchies in these four tables:

        Domain::Animal::Harvest
    
        Domain::Activity::Harvest
    
          

    Note that Hidders does not know that data is a hierarchy; that its RFS lacks integrity precisely because it is not Relational; that placing data in a Relational context provides the very integrity it seeks outside of it; that the relational model removes all such "problems" and makes all such "solutions" ridiculous.

    Here's another example , although the modeling is not yet complete. Be sure to study predicates and page 2 for actual keys. Hierarchies:

        Subject::CategorySubject::ExaminationResult
    
        Category::CategorySubject::ExaminationResult
    
        Person::Registrant::Candidate::ExaminationResult
    
          

    Note that the latter is a progression of the state of the business tool, so the key is not merged.

  • Hierarchy of rows within one table

    Typically a tree structure in some way, there are literally millions of them. For any given Node, this supports one ancestor or parent and unrestricted children. Done correctly, there is no limit on the number of levels or the height of the tree (i.e. Unlimited generations of ancestors and descendants).

    • The terms used by ancestor and descendant are simple technical terms, they have no OO connotations and limitations.

    You need server recursion to traverse the tree structure so that you can write simple proc and functions that are recursive.

    Here is one of the Messages . Please read both the question and the answer and visit the related Post Data Model . Note that the searcher did not mention hierarchy or tree because knowledge of hierarchy in relational databases is suppressed, but (from the comments), as soon as he saw the answer and the data model, he recognized it for the hierarchy that is, and that he quite fits. Hierarchy:

        Message::Message[Message]::Message[::Message[Message]] ...
    
          

  • Hierarchy of rows in one table, via an associative table

    This hierarchy provides an ancestor / descendant structure for multiple ancestors or parents. This requires two relationships, so an additional associative table is required. This is commonly referred to as the Bill of Materials structure. Unlimited height, recursively traversed.

    The problem with building materials was a limitation of a hierarchical DBMS, which we partially overcame in a network DBMS. It was a burning problem at the time, and one of the IBM problems that Dr EF Codd was clearly mandated to overcome. Of course, he met these goals and exceeded them significantly.

    Here is the Bill of Materials hierarchy , modeled and implemented correctly .

    • Sorry preamble, this is from the article, skip the top two lines, look at the bottom line.

    • Person :: Messenger is also provided.

    • Hierarchies:

      Part[Assembly]::Part[Component] ...
      
      Part[Component]::Part[Assembly] ...
      
      Person[Parent]::Person[Child] ...
      
      Person[Child]::Person[Parent] ...
      
            

Ignorance of hierarchy

Apart from the fact that hierarchies usually exist in data, that they are not recognized as such due to suppression, and therefore they are not implemented as hierarchies, when they are recognized, they are implemented in the funniest, ham-fisted ways.

  • Contact list

    Suppressors boldly assert that "The Relational Model does not support hierarchy", denying that it is based on the Hierarchical Model (each of which provides clear evidence that they do not know the basic concepts in RM that they claim to be a postulation). Therefore they cannot use the name. This is a stupid name they use.

    Typically, an implementation will recognize that there is a hierarchy in the data, but the implementation will be very poor, limited to physical record IDs, etc., no relational integrity, etc.

    And they don't know how to traverse the tree that needs recursion.

  • Nested sets

    Abortion, straight from hell. Recording system of records in the recording system. Not only does this generate a ton of duplication and violate normalization rules, this one captures records in the concrete feed system.

    Moving one node requires rewriting the entire affected part of the tree. Beloved Dates, Darwen and Celko.

    MS HIERARCHYID

    Datatype does the same. Gives you a mass of concrete that needs to be hammered in and poured again every time the node changes.

Okay, that wasn't that short.

Reply to update 2

Reply to update 3

Reply to update 4

+4


source


For each category that eats food, you must add one table. for example, if one food could be eaten by a particular gender, you would:

  Food_Gender(FoodID,GenderID)

      

for people you would have:

 Food_Human(FoodID,HumanID)

      

for animal species:



 Food_AnimalSpc(FoodID,Species)

      

for the whole table:

Food_Table(FoodID,TableID)

      

etc. for other categories

0


source







All Articles