Can an XML column be used to store additional data?

If everything is ok and it is recommended to use an xml column to store additional data that the UI can provide?

For example, suppose the Employee table

CREATE TABLE Employee
(
    EmployeeId int not null,
    Name nvarchar(300) not null,
    Phone varchar(30) null,
    Email varchar(320) null,
    Address nvarchar(max) null,
    Data xml null
)

      

Data

can contain many values ​​such as additional phone numbers, comments ...

We expect all our clients to request different fields in Employee

, and we don't want to mess with the database structure every time they think about adding a new field.

We expect the data stored in the xml column to rarely access the data. In addition to being viewed in the employee list, the data can be printed. Therefore, we need to preserve the datatype across the data (a bit like a dataset that serializes its data)

Is this a good approach for storing unknown additional data?

Edited Give a better example

Update I haven't picked an answer yet because I am discussing with my team a different approach that you guys suggested.

+2


source to share


9 replies


You can do it like this:

[EmployeeField]
-----------------------------------------------------
EmployeeID    EmployeeFieldName    EmployeeFieldValue

      



In fact, the same approach .NET Membership uses for user profiles with dynamic fields. Many line-of-business applications use the same approach to store dynamic client inputs.

Written in a comment by Spencer Ruport, this approach is known as the Attribute-Attribute Model (EAV) .

+5


source


You are cutting yourself down on your knees using the relational database model. Relational databases are based on the idea of ​​a static data structure and can be queried easily and quickly. Even accepting the EAV structure as "new in the city" suggests a contradiction, although it is arguably better than simply dumping XML data into a column.



If customer

it's weird and the rest of the data is fine, then it's probably ok, although I would definitely go with the EAV approach. If most of your tables look like this, then it's time to rethink your approach to data storage.

+2


source


An XML column is a very flexible format, but also very poorly searchable. The user "New in the city" indicates another standard solution, less flexible, but with the ability to choose and attach to additional attributes.

+1


source


I personally prefer to add additional, unknown data as a separate table. This allows you to allow an infinite number of options (just have IDs, name, type (? - optional) and data columns), but it has the added benefit of allowing selective updates / deletes to be chosen.

If you do this as a single data sheet, every time you change any one piece of aux data, you will need to replace the entire data field.

+1


source


Depends on the data. If it's not easy to model in a relational framework, then what you suggest (it's just Serialized LOB ) is fine.

0


source


If you want to use data in a request, I would not recommend putting it in an xml field. Yes, there are ways to query data from xml fields, but I haven't found them efficient.

0


source


Yes, it's okay. Whether or not this is recommended depends largely on your specific situation. We are doing something like this using Oracle and the flexibility is great, but our custom web framework (Java servlet that builds web pages / applications based on XML-based modules) uses XML heavily and we have specific systems for storing data in a single XML column, so my opinion and experience are based on scenarios where it is convenient.

For example (and I know it might sound awful to people), if you are going to search based on data inside your XML, it might be a problem in the first place, as running XPaths and extracting data from XML on every line every time you are looking for a massive performance hit ... We have a materialized view-like system that uses triggers and a stored query. Every time a row in the basetable (containing an XML column) changes, the triggers are fired, your query is fired and fetches the data from the XML and inserts it into a relational table, which in turn has no idea about it, it can't change the relational data (so as it doesn't reflect back to XML).

For the purpose of something like a large number of web forms performing CRUD operations where the schema and therefore the data model can change frequently, this is fantastic. This means you can pull the XML fragment and instantly get the base model for your page and save it as easy as inserting the XML fragment back. For fast read-only access, we have instant access to the XML relational view.

0


source


I am assuming that you will also have to query and look for Employee attributes in XML.

This might be OK if you are using XML schema and creating the required XML indexes. You can also use the EAV relational model and are probably more performant. You should read the white paper Best Practices for Semantic Data Modeling for Performance and Scalability if you are going to implement the EAV model, a document written by the SQL Client Advisory Group addresses common errors and problems with poorly designed EAV models.

As a side note, storing just the ID and NAME for an attribute-rich object as an employee is like actually throwing the ball out of bounds. You should at least add the generic attributes that you expect most of your clients to use, and rely only on extensible models (XML or EAV) for additional attributes that you cannot see.

Update

Since we're at it, here are the SQL CAT white papers on XML best practices:

0


source


Yes, as long as the additional data is not "tied" to any other table and you do not need to search it often. If you have more than one user editing different pieces of additional data for the same employee, then think carefully about your design.

So, if your additional data is just "note" fields, and generally for every predefined client header, that's okay.

It doesn't make sense to put this extra data in the [EmployeeField] table, as it just makes the data access code more complex, preventing you from taking advantage of the database.

Rather, storing the type with every piece of data, I think you should have a metadata table that stores the type, name, and "display name" of every bit of additional data a given client wants to have. You may also need to save a form that the customer wants to use to enter additional data.

0


source







All Articles