Performance comparison of surrogate and composite keys

If a table has attributes A1, A2, A3 ... An, and A1, A2 and A3 together can form a composite key, is it better to use a surrogate key instead of the composite key?

Using a surrogate key will speed up record inserts (this favors a surrogate key over a composite key). But SELECT, UPDATE and DELETE queries based on the A1, A2 and A3 attributes will be significantly slowed down if we use a surrogate key (this favors a composite key over a surrogate key).

Which is better in terms of performance under these conditions: a surrogate key or a composite key?

+3




2 answers


In almost all tests, the performance advantage of surrogate keys over natural keys was negligible. Natural keys also have the advantage of being much easier to work with. The best write-up is available here.
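One thing to keep in mind when comparing: choosing a surrogate primary key does not prevent declaring a UNIQUE constraint on (A1, A2, A3), so lookups on those attributes can still use an index. A minimal sketch using Python's sqlite3 module; the table names and the extra column a4 are hypothetical, and SQLite is used only for illustration (other engines may plan queries differently):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Composite (natural) primary key on the three attributes.
    CREATE TABLE t_composite (
        a1 INTEGER NOT NULL,
        a2 INTEGER NOT NULL,
        a3 INTEGER NOT NULL,
        a4 TEXT,
        PRIMARY KEY (a1, a2, a3)
    );
    -- Surrogate primary key; the natural key stays unique and indexed.
    CREATE TABLE t_surrogate (
        id INTEGER PRIMARY KEY,
        a1 INTEGER NOT NULL,
        a2 INTEGER NOT NULL,
        a3 INTEGER NOT NULL,
        a4 TEXT,
        UNIQUE (a1, a2, a3)
    );
""")


def lookup_plan(table):
    """Return SQLite's query-plan text for a lookup on (a1, a2, a3)."""
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT a4 FROM {table} "
        "WHERE a1 = ? AND a2 = ? AND a3 = ?",
        (1, 2, 3),
    ).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " ".join(row[-1] for row in rows)


plans = {t: lookup_plan(t) for t in ("t_composite", "t_surrogate")}
for table, plan in plans.items():
    print(table, "->", plan)
```

In this sketch both plans report an index search on the three columns. Whether the extra index on the surrogate table costs noticeably more on insert is something to measure on the actual engine and data.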



+3




Performance is not the primary concern when deciding whether to implement a surrogate primary key.

We find that the ideal primary key has several desirable attributes:

  • simple (single column, native datatype)
  • unique (positively non-duplicate values)
  • not null (every row will have a value)
  • immutable (once assigned, it never changes)
  • anonymous (carries no information)

There is no "rule" that a candidate key chosen as the primary key must have all of these properties, but these are the properties that are desirable for various reasons.

There is not even a "rule" that says that all tables must have a primary key. But we consider it desirable.

Successful software systems have been created using surrogate keys as well as natural keys.




From a performance standpoint, there really aren't many differences that can be demonstrated. But consider this: if an entity table has a primary key that is a composite key made up of several "large" columns, those same large columns must be repeated in every table that has a foreign key reference to that entity table, and in some storage engines (InnoDB) they are also repeated in every secondary index.
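The repetition of key columns in referencing tables can be sketched as follows. This is only an illustration using Python's sqlite3 module; the table and column names (entity_nat, child_nat, entity_sur, child_sur, note) are hypothetical, and SQLite does not reproduce InnoDB's index layout:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entity keyed by a composite natural key.
    CREATE TABLE entity_nat (
        a1 TEXT NOT NULL, a2 TEXT NOT NULL, a3 TEXT NOT NULL,
        PRIMARY KEY (a1, a2, a3)
    );
    -- The child must carry all three key columns to reference it.
    CREATE TABLE child_nat (
        child_id INTEGER PRIMARY KEY,
        a1 TEXT NOT NULL, a2 TEXT NOT NULL, a3 TEXT NOT NULL,
        note TEXT,
        FOREIGN KEY (a1, a2, a3) REFERENCES entity_nat (a1, a2, a3)
    );
    -- Entity keyed by a surrogate; the natural key is kept unique.
    CREATE TABLE entity_sur (
        id INTEGER PRIMARY KEY,
        a1 TEXT NOT NULL, a2 TEXT NOT NULL, a3 TEXT NOT NULL,
        UNIQUE (a1, a2, a3)
    );
    -- The child carries a single integer column.
    CREATE TABLE child_sur (
        child_id INTEGER PRIMARY KEY,
        entity_id INTEGER NOT NULL REFERENCES entity_sur (id),
        note TEXT
    );
""")


def fk_columns(table):
    """Names of the columns in `table` that take part in a foreign key."""
    # PRAGMA foreign_key_list rows are (id, seq, table, from, to, ...).
    return [row[3] for row in conn.execute(f"PRAGMA foreign_key_list({table})")]


print(sorted(fk_columns("child_nat")))
print(fk_columns("child_sur"))
```

With the composite key, every referencing table stores copies of a1, a2 and a3; with the surrogate, it stores one integer.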

But performance isn't the deciding factor. (Anyone suggesting that performance should be the deciding factor when choosing a candidate key as the primary key hasn't thought about it enough.)




As for being "easier to work with": many developers find a single column easier to use as the primary key than a composite key of two, three, or more columns.

Some developers who chose a natural key as the primary key were later burned by their choice of candidate key. Not because it was a natural key, but because, as development continued, "new" requirements were "discovered" and it turned out that the candidate key they had chosen as the primary key was not actually always unique, or was not exempt from change, or was not truly anonymous.
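To illustrate the "not exempt from change" case, here is a minimal sketch using Python's sqlite3 module. The customer/purchase tables and the use of an email address as the natural key are hypothetical examples, not taken from any particular system:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        email TEXT PRIMARY KEY      -- natural key, assumed never to change
    );
    CREATE TABLE purchase (
        purchase_id INTEGER PRIMARY KEY,
        customer_email TEXT NOT NULL
            REFERENCES customer (email) ON UPDATE CASCADE
    );
    INSERT INTO customer VALUES ('old@example.com');
    INSERT INTO purchase (customer_email) VALUES
        ('old@example.com'), ('old@example.com');
""")

# A "new" requirement arrives: customers may change their email address.
conn.execute(
    "UPDATE customer SET email = ? WHERE email = ?",
    ("new@example.com", "old@example.com"),
)
conn.commit()

# Count the child rows that had to be rewritten to follow the key change.
rewritten = conn.execute(
    "SELECT COUNT(*) FROM purchase WHERE customer_email = ?",
    ("new@example.com",),
).fetchone()[0]
print(rewritten)
```

Changing the key value touches every referencing row (here via ON UPDATE CASCADE). With a surrogate key, the email would be an ordinary column and the same change would update a single row in customer.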

Many software projects have successfully used natural keys and composite keys as the PRIMARY KEY, just as many have succeeded using a surrogate key as the PRIMARY KEY.

+1








