Database design problem
I am having a problem creating a database schema for the following scenario:
(I am not building a dating site, but just using this as an example)
A user registers on a dating site and can pick multiple values for a given attribute of their ideal date, for example hair colour.
This is simple enough to model with the three tables below:
Tables:
User {Key}
HairColour {Key}
UserHairColour {UserKey} {HairColourKey}
However, the user also has the option to select "any", which means they don't care about hair colour and every hair colour must be included in the selection.
How do I model the "any" option for a user?
I could select all hair colours and insert them into UserHairColour, but what if I need to add a new hair colour in the future?
The absence of any entries for that particular user in the UserHairColour table indicates that they don't care about hair color.
No selection means they have no preference. Obviously, that doesn't mean they want their date to have no hair colour.
I see no need for a separate value or additional table design here. What you have allows you to achieve your goal in a simple way.
EDIT: In response to the proposed solution of adding an extra "ANY" value.
An "ANY" value would conceptually conflict with the other choices. We are presenting the user with a set of options, "ANY" being one of them, and letting them pick several. The user could therefore technically select "ANY" together with specific colours, leaving it unclear which takes precedence: "ANY" or the specific options. I believe treating the absence of entries as the "ANY" indicator is clearer, because it can only be read one way: no entries, no preferred values. It cannot reasonably be read as "the user wants no hair colour at all", since a transparent hair colour makes no sense. You could argue that it might mean no hair at all, but I would suggest a separate option or a separate question for that.
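The "no rows means no preference" rule described above can be sketched as follows. This is only an illustration using SQLite through Python's sqlite3 module; the column names, the sample users, and the accepted_colours helper are assumptions made for the example, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (UserKey INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE HairColour (HairColourKey INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE UserHairColour (
        UserKey INTEGER REFERENCES User(UserKey),
        HairColourKey INTEGER REFERENCES HairColour(HairColourKey),
        PRIMARY KEY (UserKey, HairColourKey)
    );
    INSERT INTO User VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO HairColour VALUES (1, 'black'), (2, 'blonde'), (3, 'red');
    -- alice accepts black or red; bob has no rows, i.e. no preference
    INSERT INTO UserHairColour VALUES (1, 1), (1, 3);
""")

# A colour is accepted if the user has no rows at all,
# or if the colour is one of the user's selected rows.
ACCEPTED = """
    SELECT hc.Name
    FROM HairColour AS hc
    WHERE NOT EXISTS (SELECT 1 FROM UserHairColour WHERE UserKey = :u)
       OR hc.HairColourKey IN
          (SELECT HairColourKey FROM UserHairColour WHERE UserKey = :u)
    ORDER BY hc.Name
"""

def accepted_colours(user_key):
    """Hypothetical helper: names of the colours this user would accept."""
    rows = conn.execute(ACCEPTED, {"u": user_key}).fetchall()
    return [name for (name,) in rows]

print(accepted_colours(1))  # ['black', 'red']
print(accepted_colours(2))  # ['black', 'blonde', 'red']
```

Under this reading, a colour added to HairColour later is picked up automatically for users without a preference, with no extra rows to maintain.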
Given the above example, I would just add "Any" or "No Preference" as a choice and treat it like a specific hair color. This keeps working if you later add more specific hair colors. When I create new relational models, I usually add a record with key -1 as the first row and store the default values there, since that is the row I fall back to. In my opinion, this is better than a dummy in a temp table or special handling in a query.
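A minimal sketch of the sentinel-row idea, again using SQLite through Python. The key -1 follows the convention mentioned above; the table layout, the sample data, and the accepts helper are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE HairColour (HairColourKey INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE UserHairColour (UserKey INTEGER, HairColourKey INTEGER);
    -- key -1 is the sentinel "Any" row (an assumption of this sketch)
    INSERT INTO HairColour VALUES (-1, 'Any'), (1, 'black'), (2, 'blonde');
    -- user 1 wants blonde only; user 2 picked "Any"
    INSERT INTO UserHairColour VALUES (1, 2), (2, -1);
""")

ANY_KEY = -1

def accepts(user_key, colour_key):
    """Hypothetical helper: True if the user picked this colour or 'Any'."""
    row = conn.execute(
        """SELECT COUNT(*) FROM UserHairColour
           WHERE UserKey = ? AND HairColourKey IN (?, ?)""",
        (user_key, colour_key, ANY_KEY),
    ).fetchone()
    return row[0] > 0

print(accepts(1, 2))  # True
print(accepts(1, 1))  # False
print(accepts(2, 1))  # True
```

Every query that matches colours has to remember the sentinel, and nothing in this layout stops a user from storing "Any" alongside specific colours, so that rule would need enforcing elsewhere.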
It should be simple. If the user chooses "Any", you simply handle it in the query:
select
    *
from
    User
left join
    UserHairColour on UserHairColour.UserId = User.UserId
where
    (@hairpreference = 'Any' OR UserHairColour.HairColourId = @hairpreference)
If you can pass @hairpreference as NULL instead of 'Any', it becomes simpler:
where
    (UserHairColour.HairColourId = COALESCE(@hairpreference, UserHairColour.HairColourId))
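One difference between the two WHERE clauses is worth checking before swapping one for the other. With the LEFT JOIN, a user with no UserHairColour rows has a NULL HairColourId, and NULL = NULL does not evaluate to true, so the COALESCE form drops that user while the OR form keeps them. The sketch below shows this in SQLite through Python; the sample data and the names helper are assumptions, and None plays the role of the NULL parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (UserId INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE UserHairColour (UserId INTEGER, HairColourId INTEGER);
    INSERT INTO User VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    -- carol has no UserHairColour rows at all
    INSERT INTO UserHairColour VALUES (1, 1), (2, 2);
""")

OR_FILTER = "(:pref IS NULL OR uhc.HairColourId = :pref)"
COALESCE_FILTER = "uhc.HairColourId = COALESCE(:pref, uhc.HairColourId)"

def names(where_clause, pref):
    """Hypothetical helper: user names returned under the given filter."""
    sql = f"""
        SELECT DISTINCT u.Name
        FROM User AS u
        LEFT JOIN UserHairColour AS uhc ON uhc.UserId = u.UserId
        WHERE {where_clause}
        ORDER BY u.Name
    """
    return [n for (n,) in conn.execute(sql, {"pref": pref})]

print(names(OR_FILTER, None))        # ['alice', 'bob', 'carol']
print(names(COALESCE_FILTER, None))  # ['alice', 'bob']
print(names(OR_FILTER, 1))           # ['alice']
print(names(COALESCE_FILTER, 1))     # ['alice']
```

Whether dropping users without rows is acceptable depends on what an empty UserHairColour set is supposed to mean in your model.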
Declare a temporary table, fill it with color values, and query it like this:
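-- Users matching any of the requested colours
SELECT *
FROM UserHairColor
JOIN [User]  -- USER is a reserved word in SQL Server, hence the brackets
    ON [User].id = UserHairColor.UserID
WHERE HairColorKey IN
    (
        SELECT ColorKey
        FROM @mytable
    )
UNION ALL
-- Every user row, but only when no colours were requested
SELECT *
FROM UserHairColor
JOIN [User]
    ON [User].id = UserHairColor.UserID
    AND NOT EXISTS
    (
        SELECT NULL
        FROM @mytable
    )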
This selects all users with the requested hair colors, or all users if the table is empty.
Place (PersonID, HairColorPreference) in its own table. If someone has no preference, just don't write a row to this table.
Use a view to join people who have preferences to those preferences, and people who have no preferences to all hair colors.
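The view described above might look like the following sketch, run in SQLite through Python. The table and view names (Person, HairColor, PersonHairColorPreference, AcceptableHairColor) and the acceptable helper are assumptions chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE HairColor (Name TEXT PRIMARY KEY);
    CREATE TABLE PersonHairColorPreference (
        PersonID INTEGER REFERENCES Person(PersonID),
        HairColorPreference TEXT REFERENCES HairColor(Name)
    );

    CREATE VIEW AcceptableHairColor AS
        -- people with explicit preferences get exactly those colors
        SELECT PersonID, HairColorPreference AS HairColor
        FROM PersonHairColorPreference
        UNION ALL
        -- people with no preference rows get every color
        SELECT p.PersonID, hc.Name
        FROM Person AS p CROSS JOIN HairColor AS hc
        WHERE NOT EXISTS (
            SELECT 1 FROM PersonHairColorPreference AS pref
            WHERE pref.PersonID = p.PersonID
        );

    INSERT INTO Person VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO HairColor VALUES ('black'), ('blonde'), ('red');
    INSERT INTO PersonHairColorPreference VALUES (1, 'red');
""")

def acceptable(person_id):
    """Hypothetical helper: colors the view reports for one person."""
    rows = conn.execute(
        "SELECT HairColor FROM AcceptableHairColor "
        "WHERE PersonID = ? ORDER BY HairColor",
        (person_id,),
    )
    return [color for (color,) in rows]

print(acceptable(1))  # ['red']
print(acceptable(2))  # ['black', 'blonde', 'red']
```

Queries that match people against candidates can then join to the view and never need to know about the "no rows" convention.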
By the way, what are you going to do with people whose preference is "anything but purple"?
Since you are clearly not building a dating site, only you can say whether the other answers here suit your needs. My suggestion is to create another table that records whether the user selected specific hair colors, all hair colors, or all hair colors except some (this sounds silly in your example, but might make sense in another situation). With the following tables in your database, you can accomplish this.
- Users
- Haircolor
- TypeOfColorSelection (1: Selected, 2: All, 3: Exclude, ...)
- UserColorSelectionProfile (UserID, TypeOfColorSelection)
- UserPreferredColor (UserID, HairColor)
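The tables listed above could be queried along the following lines. This is a sketch in SQLite through Python, assuming the type codes 1 = Selected, 2 = All, 3 = Exclude from the list; the column names, the sample data, and the accepted helper are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserID INTEGER PRIMARY KEY);
    CREATE TABLE HairColor (Name TEXT PRIMARY KEY);
    CREATE TABLE TypeOfColorSelection (TypeID INTEGER PRIMARY KEY, Label TEXT);
    CREATE TABLE UserColorSelectionProfile (
        UserID INTEGER PRIMARY KEY, TypeOfColorSelection INTEGER);
    CREATE TABLE UserPreferredColor (UserID INTEGER, HairColor TEXT);

    INSERT INTO Users VALUES (1), (2), (3);
    INSERT INTO HairColor VALUES ('black'), ('purple'), ('red');
    INSERT INTO TypeOfColorSelection VALUES
        (1, 'Selected'), (2, 'All'), (3, 'Exclude');
    -- user 1: only red; user 2: any color; user 3: anything but purple
    INSERT INTO UserColorSelectionProfile VALUES (1, 1), (2, 2), (3, 3);
    INSERT INTO UserPreferredColor VALUES (1, 'red'), (3, 'purple');
""")

ACCEPTED = """
    SELECT hc.Name
    FROM HairColor AS hc
    JOIN UserColorSelectionProfile AS prof ON prof.UserID = :u
    WHERE prof.TypeOfColorSelection = 2
       OR (prof.TypeOfColorSelection = 1 AND EXISTS (
               SELECT 1 FROM UserPreferredColor AS upc
               WHERE upc.UserID = prof.UserID AND upc.HairColor = hc.Name))
       OR (prof.TypeOfColorSelection = 3 AND NOT EXISTS (
               SELECT 1 FROM UserPreferredColor AS upc
               WHERE upc.UserID = prof.UserID AND upc.HairColor = hc.Name))
    ORDER BY hc.Name
"""

def accepted(user_id):
    """Hypothetical helper: colors accepted under the user's selection type."""
    return [n for (n,) in conn.execute(ACCEPTED, {"u": user_id})]

print(accepted(1))  # ['red']
print(accepted(2))  # ['black', 'purple', 'red']
print(accepted(3))  # ['black', 'red']
```

The same UserPreferredColor rows mean "include" or "exclude" depending on the profile type, so the two tables always have to be read together.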
It reminds me of a classic British TV commercial for Whiskas cat food. The original strapline was
Eight out of ten owners say their cat prefers it
It was later changed to
Eight out of ten owners who expressed a preference said their cat prefers it
[Italics mine.]
Obviously, the results are skewed when a survey fails to distinguish between an implicit and an explicit lack of preference. Otherwise, why swap a good strapline for one that doesn't scan at all? QED ;)
My preference would be to use separate tables to model those who expressed a preference (along with the color(s) they chose), those who stated that they had no preference, and those who expressed nothing at all.
For a worked example, see "How to Handle Missing Information Without Using NULLs" by Hugh Darwen.
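A rough sketch of the three-way split, in SQLite through Python. It is not taken from Darwen's material; the table names PreferredHairColor and NoHairColorPreference and the preference_status helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (PersonID INTEGER PRIMARY KEY);
    -- people who expressed a preference, with the color(s) they chose
    CREATE TABLE PreferredHairColor (
        PersonID INTEGER REFERENCES Person(PersonID),
        HairColor TEXT,
        PRIMARY KEY (PersonID, HairColor)
    );
    -- people who explicitly said they have no preference
    CREATE TABLE NoHairColorPreference (
        PersonID INTEGER PRIMARY KEY REFERENCES Person(PersonID)
    );
    INSERT INTO Person VALUES (1), (2), (3);
    INSERT INTO PreferredHairColor VALUES (1, 'red'), (1, 'black');
    INSERT INTO NoHairColorPreference VALUES (2);
    -- person 3 appears in neither table: they never answered
""")

def preference_status(person_id):
    """Hypothetical helper: which of the three groups a person is in."""
    if conn.execute(
        "SELECT 1 FROM PreferredHairColor WHERE PersonID = ?", (person_id,)
    ).fetchone():
        return "expressed preference"
    if conn.execute(
        "SELECT 1 FROM NoHairColorPreference WHERE PersonID = ?", (person_id,)
    ).fetchone():
        return "expressed no preference"
    return "did not answer"

print(preference_status(1))  # expressed preference
print(preference_status(2))  # expressed no preference
print(preference_status(3))  # did not answer
```

No column is nullable here, and the explicit and implicit lack of preference stay distinguishable. A real design would also need a constraint preventing the same person from appearing in both tables.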