Modeling relational objects for auditing issue states

This is what made me doubt for a while, so I thought it would be a good idea to post this here to find some insight into the problem / doubt in relational database modeling.

I have the following problem:

I have "questions" that need to be in a specific "state" and all state changes need to be checked.

I found two solutions for this, but I can't see the difference between them, if any ... what do you think.

Here's an image with both diagrams.

EDIT:

Option A: the "questions" table should not contain state_id, and Question_State should not contain the "id" field. I'm sorry for the mistakes.

EDIT2 :

Thanks for all the real life examples and insights, but this was an academic problem, not related to the real world :).

Diagrams

0


source to share


6 answers


I think the gist of the question is whether the question state should be based on an intermediate table between Questions and State that has a time component (A), or if the table will be more static, but with a history log table on side (B).

(Note: If you want to do a clean version (A) then Boofus is right, you probably wouldn't put the state_id in the Questions table either, as that's redundant, but it would definitely be awkward because it would make simple queries to get questions in a specific state are much more complicated. So you have a hybrid version here.)



In general, if the requirement to store historical state information is indeed valid for auditing purposes, that is, unless it will be regularly requested by the application itself, you are probably better off going with option B, because it is a little easier (in fact, there is only one the "Questions" table, with a reference table for states and a "log" table for previous states). I think this shows that your intentions are slightly better.

However, if the semantics of the application is more complex (for example, if you have queries like "show all questions that were in state X in the last 24 hours ...") then an approach like (A) can do more meaning. This essentially makes the state of the matter into a time-dependent fact. If you do, just keep in mind that it complicates things - all your queries are more complex and you have to keep track of time, or you need to worry about keeping the state_id in questions in sync with the most recent state in the Questions table. If you go that route, maybe call it "current_state" or something on "Questions" so it will clear it out as derived information.

+5


source


You might want to scour the web about temporary databases. Basically, keeping the change history of any variable causes the same problems, regardless of whether the variable captures the state of the question or the user's weight or whatever.

Secondly, I think your question is related to database design, not conceptual data modeling. If I understand your drift correctly, you ask which table design is better.

Third, I like option B better, but it really depends on what you are going to do with the data.



The reason I asked a question about database design and conceptual modeling is because I have long adopted the practice of using "entities and relationships" to model conceptual data related to data analysis. I use the terms "tables, columns, and rows" when discussing logical database design. Of course, the analysis of large volumes of work and their analysis are very valuable in large projects. And this is not as easy to do as it seems.

You really need to add an arrow between the history table and the state table in the diagram for option B. The way the diagram is displayed is almost like the Historical Table table - it is a non-overlapping table. Not a problem in this simple example, but if you follow the same practice when you scale to databases with dozens of tables, you end up confusing everyone looking at the chart.

+1


source


Once you draw all the relationships, they are the same.

I don't understand why you have a state_id in the question table - since you have a history table, having state in the question table is redundant and might leave you with data out of sync.

It seems to me that if you want the current state of the question, you do

SELECT State_ID FROM Historical WHERE Question_id =? ORDER BY Date DESC LIMIT 1

(or whatever method your SQL flavor uses to limit to just one row)

0


source


Assuming you have good levels of abstraction between the database and your OO, you might consider taking the State table out of the database and making it an enumeration in the class. This is not necessarily something to keep.

Then enter a status column in the Questions table and an audit table.

0


source


You say verified, which implies that you just want to keep historical information for reporting purposes. In this case, I would suggest that diagram B is clearer, although you should probably note the connection between many questions between history, history and history.

For practical reasons, if the circumstances are as above, I have to encapsulate the Historical Insert functionality in an insert / update trigger on Questions, and if the volume of the Questions table and / or the number of state changes will be significant, I would considered the possibility of placing the Historical table in another database. This makes it easier to manage the database later. I'm usually wary of triggers because overzealousness can lead to hard-to-reach databases (since it's not immediately obvious what's going on), but this is a clear case when they are appropriate and the best choice for using application logic.

By the way, both of your two diagrams assume that the question can ever only be entered into each state (from your PC) - you have to think if this is correct, as in most real-world applications, mistakes will be made and the states will be canceled.

0


source


I didn't understand how #Boofus interested in having a state_id field in the Questions table.

I've worked a lot with these "government" concepts in our own application. In the most difficult situations, when we have to keep track of the complete history of the state and situations where an object can have several states, we use the following model:

alt text

For multi-state situations, the idea is to check if the end_date value is zero (another idea would be to have the boolean field isActiveState in the table). Don't underestimate the interest in this multi-state configuration. Example:

The question may be

  • closed and resolved

or

  • closed and not resolved.

This can correspond to two different states:

  • Closed and allowed state

or

  • Closed and not allowed state

But I think the best solution would be to have like

  • Open / closed state

and

  • Allowed / Not Allowed State

And so that the question has several states

0


source







All Articles