Single lookup table of large SQL server

Question

Single lookup table of large SQL server

I have a SQL Server 2008 database with a snowflake style schema so there are many different lookup tables like Language, Countries, States, Status, etc. These lookup tables all have almost identical structures: two columns, code and decode.My project manager would like all these different tables to be one BIG table, so I would need another column like CodeCategory and my primary columns for this large table would be CodeCategory and Code. The problem is, for any of the tables that have actual code (like language code), I cannot establish a foreign key relationship in that big decoding table, since the CodeCategory will not be in the fact table, but just in the code. And the codes by themselves will not be unique (they will be in the CodeCategory),so I can't do FK only from the code field of the fact table to the code field of the Big lookup table.

So am I missing something, or is it not possible to do and still be able to do FK on linked tables? I wish I could do this: I have an FK where one of the columns I matched in the lookup table would match a string constant. Like this (I know this is not possible, but it gives you an idea of what I want to do):

ALTER TABLE [dbo].[Users]  WITH CHECK ADD CONSTRAINT [FK_User_AppCodes] 
FOREIGN KEY('Language', [LanguageCode])
REFERENCES [dbo].[AppCodes] ([AppCodeCategory], [AppCode])

The above doesn't work, but if it did, I would need FK. Where I have the "Language" string, is there a way in T-SQL instead of substituting the table name from the code?

I absolutely need FKs, so if nothing like this is possible, then I'll have to stick with my small lookup tables. any help would be appreciated.

Brian

+3

database sql-server sql-server-2008

user3772397 06 jan. 15 at 14:59

source to share

2 answers

Solomon rutzky · Answer 1 · 2015-01-06T16:48:06+0000

It is impossible to do this, but it is impossible to do it, and not damage the system at several levels.

While a single lookup table (as already pointed out) is a really terrible idea, I will say that this template does not require a separate PK field or autogenerate it. This requires a composite PK consisting of ([AppCodeCategory], [AppCode])

, and then the fact table must have BOTH fields that will have a composite FK of both fields back to PK. Again, this is not an endorsement of this particular end goal, just a technical note that it is possible to have composite PCs and FKs in other more appropriate scenarios.

The main problem with this type of constant approach is that each constant is truly its own thing: languages, countries, states, Statii, etc. are all completely separate entities. Although their structure in the database is the same (as of today), the data in this structure does not represent the same things. You would be tied to a model that either prohibits adding additional search fields later (for example, ISO codes for language and country but not others, or something that pertains to states that do not apply to others), or would require adding fields NULLable with no way of knowing which category they apply to (enjoy the debugging questions associated with this and / or explain to a new person - who has been there for 2 days and is tasked with writing a new report),that the 3-digit ISO country code does not apply to Deleted status).

This approach also requires you to store a custom Category field in all related tables. And this is for searching. Therefore, if the fact table has CountryCode

, LanguageCode

and StateCode

, each of these FKs gets a corresponding CategoryID field, so now there are 6 fields instead of 3. Even if you could use TINYINT

for CategoryID, if your fact table has even 200 million rows, then these three additional 1 byte fields are now 600MB, which has a negative impact on performance. And don't forget that backups will take longer and take up more space, but the disk is cheap, isn't it? Oh, and if the backup takes longer, then the restore will take longer too, won't it? Oh, but does the table have closer to 1 billion rows? Better: -).

While this approach looks perhaps "cleaner" or "simpler" now, it is ultimately more costly, especially in terms of wasted developer time, as you (and / or others) will try to work around the issues in the future. poor design choices.

Has anyone even asked your project manager what its purpose is? This is a reasonable question if you are going to spend a number of hours making changes to the system so that the claimed benefit is wasted in that time. This of course does not make interacting with the data any easier, and it will actually make things more difficult, especially if you choose a row for "Category" instead of TINYINT

or perhaps SMALLINT

.

If your PM is still pushing this change, then as part of this project, you also need to change any enum

in the application code to match what is in the database. Since the database has its values grouped together, you can accomplish this in C # (if your application code is in C #, if not, translate to whatever is appropriate) by specifying the values enum

explicitly using the first X digits pattern are " categories "and the remaining Y-digits are" value ". For example:

Assuming the category "Country" == 1 and the category "Language" == 2, you can do:

enum AppCodes
{
  // Countries
  United States  = 1000001,
  Canada         = 1000002,
  Somewhere Else = 1000003,

  // Languages
  EnglishUS = 2000001,
  EnglishUK = 2000002,
  French    = 2000003
};

Absurd? Completely. But it is also similar to query to combine all lookup tables into one table. What's good for a goose is good for a gander, right?

Rikalous · Answer 2 · 2015-01-06T17:03:48+0000

Is this suggested so that you can minimize the number of admin screens that are required for CRUD operations on your persistent data? I've been here before and decided it was better / safer / easier to create a generic screen that uses metadata to decide which table to pull from / write to. It was a little more work to build, but kept the database schema "correct".

All persistent data tables had the same basic structure, they were mainly for the dropout population with random additional fields for business rule purposes.

Single lookup table of large SQL server

More articles: