How to model multilingual objects in relational databases

If we are going to develop a multilingual application, should we store the translations in resource files or in a database?

Let's say we decide to store them in the database. Is there a standard way to model multilingual objects in the relational model?

1. One large translation table

We can keep all translations in one table and use language-neutral keys as attribute values.

Person (SSN, FirstName, LastName, Birthday)

Translation (key, langid, translation)
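The single-table approach could be sketched as follows. This is only an illustration, assuming SQLite through Python's sqlite3 module; the entity table is named Person, and the column types, the key strings such as "p1.first", and the sample rows are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Person holds language-neutral keys instead of display text.
    CREATE TABLE Person (
        SSN       TEXT PRIMARY KEY,
        FirstName TEXT NOT NULL,   -- key into Translation
        LastName  TEXT NOT NULL,   -- key into Translation
        Birthday  TEXT
    );
    -- One table holds every translation of every attribute.
    CREATE TABLE Translation (
        "key"       TEXT NOT NULL,
        langid      TEXT NOT NULL,
        translation TEXT NOT NULL,
        PRIMARY KEY ("key", langid)
    );
""")

# Hypothetical sample data.
conn.execute(
    "INSERT INTO Person VALUES (?, ?, ?, ?)",
    ("123-45-6789", "p1.first", "p1.last", "1980-01-01"),
)
conn.executemany(
    "INSERT INTO Translation VALUES (?, ?, ?)",
    [
        ("p1.first", "en", "John"),
        ("p1.first", "de", "Johann"),
        ("p1.last", "en", "Smith"),
        ("p1.last", "de", "Schmidt"),
    ],
)


def person_name(conn, ssn, lang):
    """Return (first, last) in the given language, or None if untranslated."""
    return conn.execute(
        """
        SELECT f.translation, l.translation
        FROM Person p
        JOIN Translation f ON f."key" = p.FirstName AND f.langid = ?
        JOIN Translation l ON l."key" = p.LastName  AND l.langid = ?
        WHERE p.SSN = ?
        """,
        (lang, lang, ssn),
    ).fetchone()


print(person_name(conn, "123-45-6789", "de"))  # ('Johann', 'Schmidt')
```

Note that every localized attribute costs one extra join against the same Translation table.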

2. One translation table for each object

Person (SSN, Birthday)

PersonML ( SSN , LangId , FirstName, LastName)

I prefer this approach. It really is a 1:N relationship.
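A minimal sketch of the per-entity translation table, again assuming SQLite through Python's sqlite3 module. The entity table is named Person here; the column types, the foreign key, and the sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    -- Language-neutral attributes only.
    CREATE TABLE Person (
        SSN      TEXT PRIMARY KEY,
        Birthday TEXT
    );
    -- One row per person per language (the 1:N side).
    CREATE TABLE PersonML (
        SSN       TEXT NOT NULL REFERENCES Person(SSN),
        LangId    TEXT NOT NULL,
        FirstName TEXT NOT NULL,
        LastName  TEXT NOT NULL,
        PRIMARY KEY (SSN, LangId)
    );
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Person VALUES ('123-45-6789', '1980-01-01')")
conn.executemany(
    "INSERT INTO PersonML VALUES (?, ?, ?, ?)",
    [
        ("123-45-6789", "en", "John", "Smith"),
        ("123-45-6789", "de", "Johann", "Schmidt"),
    ],
)


def localized_person(conn, ssn, lang):
    """Return (SSN, Birthday, FirstName, LastName) for one language, or None."""
    return conn.execute(
        """
        SELECT p.SSN, p.Birthday, ml.FirstName, ml.LastName
        FROM Person p
        JOIN PersonML ml ON ml.SSN = p.SSN
        WHERE p.SSN = ? AND ml.LangId = ?
        """,
        (ssn, lang),
    ).fetchone()


print(localized_person(conn, "123-45-6789", "de"))
```

Birthday is stored once per person, and the composite key (SSN, LangId) prevents a second translation for the same language.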

Problem

It seems multilingual columns cannot be used to form the primary key.
Assuming each person has a unique name, (FirstName, LastName) can be used as the primary key.

Person (FirstName, LastName, Birthday)

However, once the names are multilingual, (FirstName, LastName) can no longer identify a person.
Evidently we cannot simply add LangId to the primary key either.

Person (LangId, FirstName, LastName, Birthday)

In this case, one person will be stored in multiple rows and the non-key columns will be duplicated.

Should I use only language-neutral columns for primary keys?
When there are no such columns, should we use a surrogate key?
I was told that surrogates should not be used blindly and I strongly agree.


Update 1

For this example, I am assuming that FirstName and LastName are localized.

If there is always some attribute like SSN for every entity, the second approach makes sense.
However, some valid primary keys may become invalid if they contain columns to be localized.

Another example

Each company has a unique name, so CompanyName can be used as the primary key.

Company ( CompanyName , ...)

When it comes to localization, the company name cannot be used as a primary key. We have to come up with a code to represent the company.

Does this mean that localization is not appropriate in the relational model?


Update 2

3. 1:N relationship between the default language and other languages

Users can perceive the company table as:

Company ( CompanyNameEnglish , CompanyNameFrench, CompanyNameSpanish, ...)

Of course there are repeating groups, so it violates 1NF.

Improved:

Company ( CompanyNameEnglish , ...)

CompanyNameML ( CompanyNameEnglish , LangId , CompanyName)

The problem is that we have to provide a default (English) name, even if the user does not require one.
Some users may provide English names, others may only provide French names.
Is this requirement too contrived?
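The improved design could look like the sketch below, assuming SQLite through Python's sqlite3 module. Falling back to the English name when no translation exists is an assumption of this example, not something the schema itself requires; the sample company is also made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- The default-language (English) name doubles as the key.
    CREATE TABLE Company (
        CompanyNameEnglish TEXT PRIMARY KEY
    );
    -- Names in other languages hang off the English name.
    CREATE TABLE CompanyNameML (
        CompanyNameEnglish TEXT NOT NULL
            REFERENCES Company(CompanyNameEnglish),
        LangId             TEXT NOT NULL,
        CompanyName        TEXT NOT NULL,
        PRIMARY KEY (CompanyNameEnglish, LangId)
    );
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Company VALUES ('Red Cross')")
conn.execute("INSERT INTO CompanyNameML VALUES ('Red Cross', 'fr', 'Croix-Rouge')")


def company_name(conn, english_name, lang):
    """Return the name in `lang`, falling back to the English name."""
    row = conn.execute(
        """
        SELECT COALESCE(ml.CompanyName, c.CompanyNameEnglish)
        FROM Company c
        LEFT JOIN CompanyNameML ml
               ON ml.CompanyNameEnglish = c.CompanyNameEnglish
              AND ml.LangId = ?
        WHERE c.CompanyNameEnglish = ?
        """,
        (lang, english_name),
    ).fetchone()
    return row[0] if row else None


print(company_name(conn, "Red Cross", "fr"))  # Croix-Rouge
print(company_name(conn, "Red Cross", "es"))  # Red Cross (fallback)
```

The sketch makes the drawback visible: a company cannot be inserted at all until someone supplies an English name for it.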

4. DBMS support for localization

PerformanceDBA talked about this in his comment.
I will do more research on this.



2 answers


"I was told that surrogates should not be used blindly, and I strongly agree."
I also agree that using something blindly is never a smart choice.

However, using a surrogate key does not always mean using it blindly. Keep in mind that the primary key is not the only way to enforce uniqueness. Most, if not all, relational databases offer unique constraints and unique indexes, and these should be used wisely. In fact, when storing multilingual data in translation tables, a surrogate key may be a better choice than a natural one. Read this article for a good comparison between natural and surrogate key strategies.
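As a rough illustration of combining a surrogate key with a unique constraint, here is a sketch assuming SQLite through Python's sqlite3 module. The table names Company and CompanyML, the CompanyId column, the helper add_name, and the rule that a name must be unique within one language are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Surrogate key: carries no meaning and is language-neutral.
    CREATE TABLE Company (
        CompanyId INTEGER PRIMARY KEY
    );
    CREATE TABLE CompanyML (
        CompanyId   INTEGER NOT NULL REFERENCES Company(CompanyId),
        LangId      TEXT NOT NULL,
        CompanyName TEXT NOT NULL,
        PRIMARY KEY (CompanyId, LangId),
        -- Natural-key uniqueness is still enforced, per language.
        UNIQUE (LangId, CompanyName)
    );
""")

conn.executemany("INSERT INTO Company (CompanyId) VALUES (?)", [(1,), (2,)])


def add_name(conn, company_id, lang, name):
    """Insert a localized name; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO CompanyML VALUES (?, ?, ?)",
            (company_id, lang, name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(add_name(conn, 1, "en", "Acme"))  # True
print(add_name(conn, 2, "en", "Acme"))  # False: name already taken in 'en'
```

The surrogate key identifies the company, while the unique constraint keeps the business rule about unique names intact.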



To answer your question, I would go with one translation table per entity: store only the non-textual data in the main entity table (e.g. date of birth and gender in your person example) and store the textual data in the translation table, whose primary key consists of the language id and the primary key of the entity table.
Note that the primary key of the entity table must then be non-textual and language-independent.



Your question is moot: if you decide to store your translations in "resource files", then that set of resource files IS the database (or part of it).

More relevant questions to answer are, for example: who owns the translations (i.e. are they an integral part of the software package or not, and can users customize them)? The answers to such questions will determine whether your translations can be treated like resource files that ship with the binaries.



I am not giving a real answer, because the only one who can know the deciding factors is you. I am just pointing out the deeper questions that need to be answered first.
