What is relational integrity

In How to set up primary keys in a relationship , PerformanceDBA talked about Relational Integrity and pointed out that it is different from RI .

I have heard about Referential Integrity related to Foreign Keys .
But Relational Integrity seems strange to me.

In this question, Are relational integrity and referential integrity the same? , By Chris said that the two - one and the same.

The terms and definitions in the database world really confuse me. Can anyone explain Relational Integrity ?


Relational database theory has been established for decades since Edgar F. Codd proposed the Relational Model in his 1970 paper .
However, there is no agreed upon definition for normal forms, integrity, etc., in books or on the internet.
As a student, I am confused.

+3


source to share


3 answers


Confusion

I will answer your questions in a sequence that makes sense and reduces the length.

Relational database theory has been established for decades since Edgar F. Codd proposed the relational model in his 1970 paper. However, there are no agreed definitions for normal forms, integrity, etc. in ordinary books or on the Internet.

As a student, I am confused.

The terms and definitions in the database world really confuse me.

I would say this is the biggest problem in the database industry. There are two camps.

Codd Campaign

  • We adhere to the definitions of Dr EF Codd.
  • These definitions are not confused
  • These definitions are fully integrated with each other.
  • You don't need to be a theorist or "theoretician" to understand them. They are written in simple technical English and are supported by mathematical definitions.
  • Codd created the first relational algebra.

In other words, any qualified person can understand and use them (no need to read and understand mathematical definitions.)

High-performance RDBMS vendors are right in this camp. Along with their clients, they have chaired the SQL committee for implementing the features and capabilities of the relational model and SQL, the data sublanguage for decades (note that SQL is not a language, but a sublanguage for database access).

One result is the natural progression of the relational model. The features suggested by the opponents are "incomplete" and have been "completed" and implemented. Anyone who faithfully implements the Relational Model will see this as natural progressions, not as "extensions."

All small but important advances and progressions have been implemented by real theorists and scientists who work for high-end suppliers. (That is, neither myself, nor customers, nor vendors, all of whom have proposed such advances that have been implemented by vendors, indicate that the relational model was “incomplete” or that we “completed it.” We happily accept that such not "outside" the relational model.)

Non-SQL and pretend-SQL don't figure in this category.

"Theorist"

The days of true theorists and scientists are long gone. We now have people who exist in complete isolation from the industry they claim to serve; in the denial of other science (for example, the laws of physics, logic and common sense); and who write scientific articles that they cite and therefore elevate. Scientific articles and mathematical proofs are based only on the fact that in such denial and isolation, they prove what they are proposing to prove. The fact that a proof is pure rubbish in the real world, that it can only be "true" if the corresponding science is denied does not apply to them.

This denial of reality is schizophrenic. Unfortunately, these days they teach it at the university level.

So you see the papers raised solely because they have a lot of quotes (from other schizophrenics) and not on whether it is science or not.

For a person who was not "educated" to be schizophrenic, it is easy to destroy such documents. In another answer I gave you proof:

  • my Answer to Hidders (in which he suggests a "problem" in a "relational database" "what he suggests to" solve "and which" problem " disappears in the simple act of putting data in a relational context.

  • my Answer to Köhler (in which he proposes a "problem" in a "relational database" "he proposes to" solve "by creating yet another novel" normal form "and which" problem " disappears in the simple act of placing data in a Relational context.

The point separating the proof that the papers created by the "theorists" are based on the Straw Man method and therefore false is proof that they are unaware of the relational model they are talking about that needs to be written about.

This is a massive fraud because it is installed in the new education system.

Post-Codd-Gulag

This camp originally consisted of "theorists". In the forty-five years since the publication of the relational model, they have not produced a single thing that has moved forward or has moved forward.

It is now made up of all their followers and supporters. And because they wrote books that are now used as textbooks in university courses, all the "professors" are involved who teach this rubbish like parrots without testing the theory for themselves.

They have, however, produced a swamp of fragments that they claim are "relational". They break Codd definitions into tiny chunks and process each one separately from each other, either to attack Codd definitions or to maintain their own chunks.

  • For example. where Codd has clear and straightforward definitions for 1NF, 2NF, 3NF (the latter includes the definition of functional dependence), and any competent person can apply them, these creatures have:

    • six NF fragments ("1NF", "2NF", "3NF", "BCNF" (aka "3.5NF"), "4NF", "5NF")

    • which, taken together, do not even cover a fraction of what Codd's three NF codes cover.

    • so I said in other answers that if Codd's three NFs are understood and applied, in spirit and word, it covers the aforementioned six NF passages as well as any NF passages that were not written (by "theorists").

  • DKNF is the ultimate NF and is obviously the purpose of the relational model (to anyone really trying to understand it) but not defined as such. This is one of the natural progressions (above) and is obvious to the faithful performer. While I wouldn't say it is easy, it can be done: all my databases since 2007 are DKNF.

    • But the "theorists" have identified a tiny fraction of the DKNF and published it as "DKNF," and they adamantly state that this is not possible.

Of course, their fragments are incomplete, they forever find "holes" in their definitions and "fill" them. There is no end to "holes" and no end to "fillers". Codd definitions have no holes and do not need fillers.

And, of course, there is no integration of their many fragments. This is the very definition of de-integration.

They created a series of "relational algebras" (I can't keep counting) in competition with the Codd Relational Algebra. Of course, none of these come close, and Codd is strongly established as relational algebra .

They argue that the relational model is somehow "incomplete" and that their inventions "complete" it. They don't. Their inventions indicate that they do not understand the relational model they are proposing to "finish." It is part of a scam to raise their otherwise incredible inventions.

By doing all of this, of course, they introduced confusion. This is their purpose as detractors of the relational model. Their 42 or so "relational models" and "relational algebras" can only exist in a state of confusion where practitioners and seekers are confused about what a relational model is.

This answer is long, just because in order to answer your question, I must first clear up the confusion that the relational model is not , and secondly, confirm what the relational model is. If these creatures did not present all this fraud, it is confusion, if the Relational Model was clear and the definitions were not sabotaged, the answer to the question is simple and straightforward:

  • Relational integrity is a form of data integrity that is provided by the relational model that is not provided by record representation systems.

But for confusion, this is not enough, I must provide more details and evidence to eliminate the confusion.

And notice, they found it all a matter of opinion; argument; it is subjective. Of course, truth is objective, not open to opinion; discussion; or an argument. This is easy to prove: just read the relational model and see for yourself if something is defined and what that definition is.

Difference

The main difference between Codd Camp and the Gulag "theorists" is this:

  • The Post-Codd authors don't understand the relational model. Over the course of four decades, regardless of the fact that they did not add a single iota to the relational model, they created a private "relational model" (actually several), with their private definitions and private terms. The evidence for this is fourfold:

    • First, they only understand a small part of the relational model. They don't know (for example) Relational keys; Relational integrity.

    • Second, they propagate various inventions that are specifically prohibited in the relational model: (for example) surrogates; Access path dependency; circular links. Again, this is only possible if they don't know the relational model

    • Third, they have private definitions of terms that do not match the definitions in the relational model. This in itself ensures that they are disconnected from the industry and do not serve the industry that they claim. Moreover, they cannot communicate with any practitioner of the relational model.

    • Fourth, they have terms (and definitions) for falsifications that are not in the relational model. But they fraudulently declare such inventions to be part of the "relational model."

Overall, this means that what they practice and preach as a "relational" or "relational model" is far from relational, and in fact, by evidence, is Anti-relational .

  1. Since they do not understand the Relational Model, what do they understand that they are propagating?

    Well, from a mountain of evidence (i.e. their own books: books, textbooks, scientific articles, etc.) they understand and distribute only the technologies that we had before the relational model and RDBMS appeared: before 1970. ...

    • Who doesn't have Integrity (we'll go into that in detail) or the power or speed of relational databases (that is, that fit the relational model).

    • If someone were to ask for a description of the difference between a pre-relational DBMS and a relational DBMS, one sentence would be that in relational systems linked to records by record ID , without managing the contents of the records, whereas a relational system is linked to strings using a relational key with full control (integrity) of the content of the line.

The visible difference

The distinction indicated above is theoretical, understandable to people who have a theoretical basis (and are denied by "theorists"). However, the difference is obvious even to beginners, in two elements (there are many, I expose two really obvious ones), and here I can provide concrete evidence:

  • The relational model requires a Primary Key , which is then used as a foreign key to establish relationships. Detractors use the Record ID as a "primary key", which:

    • could not determine the definition of Key in the relational model. (The definition of the Relational Model keyword must be composed of data . Record IDs, GUIDs, etc. Produced, they are not data.)

    • implements an access path dependency that is specifically prohibited in the relational model. Path dependency has been a characteristic limitation of record representation systems.

    • (This leads to the need to navigate through each file between towing distal files, whereas in a relational model, two distal tables can be directly connected.)

    • (This, by the way, is proof that record systems actually require more, not less, JOINs, contrary to mythology.)

  • Thus, they will raise surrogates, record IDs, to the "key" status.

    • But this is pure fraud. In the relational world, there is no such thing as a "surrogate key" because the two words contradict each other, either it is a surrogate (non-key) or it is a key (not a surrogate).

    • In addition, using the term "surrogate key", it is natural to expect at least some, if not all, of the qualities of the Key, and the surrogate does not have any of the qualities of the Key.

    • This is "normal" in the world of "theorists", which is separate from the relational world because they don't have keys, they have surrogates as "keys"

  • They have their own invention together with their particular definition of "candidate key".

    • This invention hides the fact that they are not using the Primary Key as defined in the Relational Model, which itself is a violation of the relational model.

    • Again, this is pure scam, because there is no such thing in the relational model. So, whatever it is, it is outside the relational model

    • What is it? This is a key fragment. only the surrogate does not enforce row uniqueness , which is required in the relational model. And they need string uniqueness to establish a little bit of integrity in their record-keeping system.

    • Now, note that at this point to be complete, so the surrogate is always an optional column, i.e. in addition to data columns. It is never either, as many newbies (such as Fowler and Ambler, such as those who offer this "opinion" without reaching "consensus") suggest it should be.

    • This is a fragment of a relational key, because the candidate key is not implemented as a key in the record system, via files. It is only implemented in one file in which it is defined, and therefore is not used relationally . Whereas, in the relational model, such a key will be the primary key, and it will be the foreign key in every child of this table, i.e. it will be used relationally, through tables.

    • This leaves a fragment of the relational key that only exists in the file in which it is located.

  • Of course, candidate keys don't work. So they came up with another invention to make it work. "Superkey". It also doesn't work, and it requires a lot of duplication. It does not provide any of the qualities of a relational key. It only provides a snippet, a step above the failed candidate key, and thus the whole thing remains a failure.

  • Take functional dependency. Because of the confusion and sabotage introduced, we have to call it full functional dependency. Codd '3NF Definition also defines functional dependency. This is simple technical English and is easy to understand and implement.

    The most important thing to understand, since we are talking about a relational model, when Codd uses the term Key, it means a relational key. therefore the key must be defined and available first, before the Functional Dependency of Attributes on Keys is checked (this is a test), the second.

    • But saboteurs, demolitionists, invented fragments of the definition of 3NF. However, I won't go into details (unless I ask), however, if you study them, you will see that they serve one purpose: to raise your non-relational candidate key fraudulently to a key state.

    • Also, their entire set of snippet definitions (seven or so) that pertain to Codd's 3NF definition is complex, ordinary implementers cannot understand it, let alone implement it.

Summary. If anyone uses the terms {candidate key, superkey, partial / transient dependency [rather than full functional dependency]}, they identify themselves categorically as (a) not knowledgeable about the relational model, and (b) Anti-relational.

Result

The result, and this is a real achievement for "theorists", is that 95% of performers have implemented record-writing systems that lack the integrity, power, or speed of relational databases, but they think their RFS is "relational." It is a great pity that the "theoreticians" cannot recognize and enjoy their only achievement.

Theorists' announcements

This is what we must understand in order to penetrate the confusion and expose the falsities as such. If you realize that the "theorists" are heavily invested in their 42 or so "relational models" and "relational algebras" that they are very different from the relational model, you will realize that they are in fact completely ignorant of the relational model.

But that doesn't stop them from making statements about the relational model, what it does, what it can't do, etc. So don't believe any of the claims they make, it's like a pygmy that talks about airplane flying (see Gods must be crazy). In the meantime, there is no need to worry about it.

Question

Relational database theory has been established for decades

True. But only for the high level of the industry. Guys like me. The guys who have a solid foundation in theory and science, and who reject the non-science of the post-Codd era. The majority, 95%, were trained in the antirelation system.

In the question about how to set up primary keys in a relationship, PerformanceDBA talked about relational integrity and pointed out that it is different from referential integrity.

Yes.

Can anyone explain relational integrity?



Yes. But this can only be done by a true Relational Practitioner.

Be warned due to the state of the relational database industry, confusion introduced, massive fraud committed as described above, saboteurs and demolitionists will say: either there is no such thing, or there is, but it cannot be implemented, or that it is the same as referential integrity.

  • This will once again prove two separate things (a) that they do not know the relational model and (b) that they are anti-relational.

I've heard about referential integrity, which is related to foreign keys. But relational integrity seems strange to me.

In this question, Are relational integrity and referential integrity the same? "Chris said the two are one and the same.

No, it is not.

As stated above, this statement proves that he is unaware of the relational model and that he is anti-relational. Thus, he cannot know what the relational model is, what the relational integrity is. They don't know what they are missing, so they cannot describe or define it.

I can.

Let's start with referential integrity so we know (a) what it is and (b) how it differs from relational integrity.

We need a decent example to work with. Note that scammers and thieves use simple examples because anything can be proven (or disproved) with simple, trivial examples. A deep understanding requires complete examples that are "complex" enough to demonstrate the problem.

Let me use the example I gave the theorists at comp.databases.theory. They couldn't work it out. I have given pointers and hints. They still couldn't solve it.

  • This in itself is proof that "theorists" cannot normalize anything . Despite the fact that they have 17 mathematical definitions for their abnormal, fragmented "normal forms".

  • We really have to shout that from the rooftops. They cannot tell the practitioners anything.

Here's a typical implementation by a developer who has read and followed detractors' books. As usual, he thinks it is "relational". But it is not relational at all, it is a record-writing system that lacks the integrity, power, or speed of a relational database.

The content should be familiar to everyone, so I'll skip the description. Note that there are ISO and ANSI / FIPS standard codes for the first three levels, for example. ISO-3166-1, 3166-2 and FIPS.

  • Typical Record Registration System Implementation declared as "relational"

    • The developer learned about the uniqueness of strings and implemented alternate keys to ensure this. They're more advanced than they usually do, but hey I'm not trying to argue the straw man, these are the best candidate keys anyone has come up with. For example. in the state file, it correctly states that the Name and StateCode must be unique within the country.
  • But its "primary keys" are not primary keys, they are record identifiers.

  • The developer declared this set of files as "5NF compliant". The "theorists" conveyed this as such.

As for Codd and I, (a) it fails 3NF and (b) it fails Relational. But we won't deal with that here, we just need a good example to use for our purpose, relational integrity.

Take a look at the DDL for the Country file.

    CREATE TABLE Country (
        CountryId     INT       NOT NULL  IDENTITY PRIMARY KEY,
        CountryCode   CHAR(2)   NOT NULL,
        Name          CHAR(30)  NOT NULL,
        CONSTRAINT AK1 UNIQUE ( CountryCode ),
        CONSTRAINT AK2 UNIQUE ( Name )
        )
    INSERT Country VALUES
        ( 'US', 'United States of America'),
        ( 'CA', 'Canada' ),
        ( 'AU', 'Australia' )

      

So far so good. Take a look at the state file.

    CREATE TABLE State (
        StateId    INT       NOT NULL  IDENTITY PRIMARY KEY,
        CountryId  INT       NOT NULL,
        StateCode  CHAR(2)   NOT NULL,
        Name       CHAR(30)  NOT NULL
        CONSTRAINT AK1 UNIQUE ( CountryId, StateCode ),
        CONSTRAINT AK2 UNIQUE ( CountryId, Name ),
        CONSTRAINT Country_ConsistsOf_State_fk
            FOREIGN KEY        ( CountryId ) 
            REFERENCES Country ( CountryId )
        )
    INSERT State VALUES
        ( 1, 'AL',  'Alabama' ),  
        ( 1, 'GA',  'Georgia'),
        ( 1, 'NY',  'New York'),
        ( 2, 'NT',  'Northwest Territories'),
        ( 3, 'NT',  'Northern Territory')

      

Note that (for example) both Canada and Australia have StateCode "NT" and alternate keys allow this. But also note that when inserting states, we are forced to use the record ID instead of the data to identify the parent country of the inserted state. This should trigger the alarm bells.

Still so common. Let's take a look at the parish file.

    CREATE TABLE County (
        CountyId    INT       NOT NULL  IDENTITY PRIMARY KEY,
        StateId     INT       NOT NULL,
        CountyCode  CHAR(2)   NOT NULL,
        Name        CHAR(30)  NOT NULL
        CONSTRAINT County_UK1 ( StateId, CountyCode ),
        CONSTRAINT County_UK2 ( StateId, Name ),
        CONSTRAINT State_IsMadeUpOf_County_fk
            FOREIGN KEY              ( StateId ) 
            REFERENCES State ( StateId )
        )
    INSERT County VALUES
        ( 1, 'LE', 'Lee' ),  
        ( 2, 'LE', 'Lee'),
        ( 3, 'LE', 'Lee')

      

When inserting counties, we are forced to use the record ID instead of the data to identify the parent state of the county that is being inserted. This should trigger more wake-up calls. Note that (for example) America has County Lee in 12 states, but not New York, and Canada does not.

Oops, we just brought Lee County to New York State. This opens up two important issues:

  • Both from the point of view of the user and with the application that the user uses to store data in the filing system, there is very little data because there is very little data Identification . (Record IDs are not data, they are not even visible to the user).

    So even if the app provided the record id for the state before it determined that it should have asked the user "which state" was the county populated and provide a drop-dow that (hopefully) contains StateCodes (minimum) or state names (better) alphabetically.

    And then retrieve the record ID for the selected state, leaving the StateCode behind.

  • From a purely data perspective, it is quite possible to define a county and identify it uniquely (CountryCode-or-Name, StateCode-or-Name, CountyCode-or-Name).

    But in the system "Theorists" we are not allowed.

So what's the problem?

The problem is that "theorists" do not understand relational keys or identifiers (IDEF1X term). Because they don't understand this, they don't know what they are missing.

The solution is a relational model with relational keys.

Referential integrity

First you need to understand something. What is a Primary Key? In the context of SQL, i.e. implementation, using the keyword PRIMARY KEY

does not magically convert the column named as such to the primary key (according to the relational model). It just enforces uniqueness on the named column.

What is referential integrity?

It's just the use of keywords FOREIGN KEY ... REFERENCES

in SQL. This does not magically enforce relational integrity. It only enforces on the server (or not server for NON-sqls) which FK identified in the referenced file is the PK in the referenced file.

Stop here for a moment and take stock. Theorists and post-codd writers only know RFS and very little SQL. They only know referential integrity (and basically they don't even implement that, but don't get distracted). Thus, they are unable to make statements, one way or another, as to what they do not know.

Relational integrity is much more than referential integrity.

Relational Sets

Dr. EF Codd asks us to think of data in terms of sets. The "theoreticians" do not understand this. They only know two sets: a set containing the entire file; and an empty set.

There are many more possibilities in the relational world than this.

  • At the first level, a country, of course, there is only one set, Countries.

  • At the second level, the state, a simple set - these are all states of all countries. But there is a more relational set we can work with:

    • each set of states belonging to one country, state-countries.
  • At the third district level, a simple enrollment is all the districts of all countries. But there are more Relational sets we can work with:

    • each set of counties belonging to the same state, in one country, country-state-counties.

    • and of course every set of counties belonging to a Country, Countries.

So, while the relational model provides what is not implemented in an RFS, and the integrity that is possible in such an RFS is limited to referencing one file at a time, rather than binding to datasets.

Relational integrity

Now let's look at the integrity that the relational model provides.

To provide a comparison, we need this example converted to a relational database. Look only at the first three tables on page 2.

Look at the same inserts:

    INSERT Country VALUES
        ( 'US', 'United States of America'),
        ( 'CA', 'Canada' ),
        ( 'AU', 'Australia' )

    INSERT State VALUES
        ( 'US', 'AL',  'Alabama' ),  
        ( 'US', 'GA',  'Georgia'),
        ( 'US', 'NY',  'New York'),
        ( 'CA', 'NT',  'Northwest Territories'),
        ( 'AU', 'NT',  'Northern Territory')

      

Here we say: identifier or relational key for the state (CountryCode, StateCode). There are no alarms for StateCode "NT" because CountryCode makes it unique.

    INSERT County VALUES
        ( 'US', 'AL', 'LE', 'Lee' ),  
        ( 'US', 'GA', 'LE', 'Lee' )

      

Here we say that the identifier or relational key for the county (CountryCode, StateCode, CountyCode). We are unlikely to make a mistake in locating Lee County in New York State, because when we enter it, we know it is wrong, we must actively enter the wrong thing "NY". Whereas when it was a number, an ID field, we had no idea:

    INSERT County VALUES
        ( 'US', 'NY', 'LE', 'Lee' )

      

To sum this up in relational terms:

  • In record registration systems sold and distributed by post-Codd authors and "theorists" who have no relational keys; no Sets, they only have the referential integrity provided by SQL.

    • The county is limited to any entry that exists in the state file.
  • In a relational database that has relational keys and sets, in addition to the referential integrity provided by SQL, we have relational integrity.

    • The county is limited to each specific country state combination that exists in the state table.

Answer Exceeds SO limit

I have a second example and additional explanation to complete this. But the answer is too long. I have to convert it to a WP file, put it in a link, etc. Tomorrow.

+5


source


The state of the database represents the situation in the world. Due to the fact that we have to put in specific base tables of the database relative to the world, and since only some world situations can occur, only some database states can arise. It would be a mistake if there was another state in the database.

Database integrity is about making sure that the database never contains invalid state. (For example, you'll see a tutorial section titled "Integrity" or "Database Integrity" addressing this question.)

"Relational integrity" is not a general term. I expect it to reference the database integrity of relational databases. Constraints are a collection of descriptions of the actual states of relational databases. A relational DBMS allows you to declare and use them.



It can also be wisely used to indicate adherence to the principles of the relational model.

Referential integrity, a subtopic of relational database integrity, is to make sure that the database never contains invalid type state, described (depending on the context) by either a foreign key constraint or SQL FOREIGN KEY constraint that we can call an external supercluster ... (SQL FOREIGN KEY definitions can include references to superclasses, not just candidate keys as a foreign key constraint.)

What really matters is the underlying concepts of the relational model. General and special conditions just let us refer to them.

+2


source


No doubt he assumed that the term ("relational integrity") meant "100% fidelity to the original model developed by Codd." This is a rare property. Don't look for it in SQL systems. Just remember that this is not an art term, let alone a consistent one.

0


source







All Articles