What is the storage strategy for the data model?

I am trying to find a good design for storing a data model. The language is python, but I think it's pretty agnostic.

I currently present three possible strategies:

Object database

A datamodel is a network of objects. When I create them, I define them as a descendant from the persistence object. Example:

class Tyres(PersistentObject):
    def __init__(self,brand):
        self._brand = brand

class Car(PersistentObject):
    def __init__(self,model):
        self._model = model
        self._tyres = None
    def addTyres(self,tyres):
        self._tyres = tyres
    def model(self):
        return model

      

The client code is unaware of persistence, it manipulates the object as it did in memory, and the persistence object takes care of everything without knowing the client code. The search can be done using a keyword search on the database object. This is the approach the Zope Object Database (among many others) uses. The advantages are lazy search, and changes only take effect on objects that have changed, without getting those that are intact.

Shelf objects

The above data model is represented in memory, but then the database is used to push the data as monolithic objects. eg:

 car = Car("BMW")
 tyres = Tyres("Bridgestone")
 car.setTyres(tyres)
 db.store(car)

      

This is what makes the brine based solution. This is, in a sense, similar to the previous solution, with the only difference that you store the object as a single package and return it again as a single package.

Facade

A single database class with convenience methods. Client code never processes objects, only identifiers. Example

class Database:
     def __init__(self):
         # setup connection

     def createCar(self, model):
         # creates the car, returns a numeric key car_id

     def createTyresForCar(self, car_id, brand):
         # creates the tyres for car_id, returns a numeric id tyres_id

     def getCarModel(self, car_id):
         # returns the car model from the car identifier
     def getTyresBrand(self, car_id, tyre_id):
         # returns the tyre brand for tyres_id in car_id.
         # if tyres_id is not in car_id, raises an error.
         # apparently redundant but it not guaranteed that
         # tyres_id identifies uniquely the tyres in the database.

      

This decision is rather controversial. A database class can have many responsibilities, but I have a feeling that this is the philosophy used in SOAP: you cannot manipulate an object directly, you are making requests for properties of an object to a remote server. In the absence of SQL is likely to be an interface to a relational database: db.createTable()

, db.insert()

, db.select()

. SQL simplifies this to get a very simple db interface, db.query(sql_string)

at the cost of parsing and executing a language (SQL). You can still work with subpartitions of the data model you are interested in without touching others.

I would like to ask your opinion on three projects, and in particular on the third. When is it a good design, if ever?

Reversed logic

This is what I saw in the MediaWiki code. Instead of something like

 db.store(obj)

      

they have

 obj.storeOn(db)

      

Edit . The datamodel example I am showing is a bit simple. My real goal is to create a graph based datamodel (in case anyone wants to contribute to a project I would do). I'm worried about the third solution, which strongly encapsulates the written database (as opposed to inline memory) and masks the backend, but it risks exploding as there is only one central class with all the methods exposed. I have to be honest, I don't like the third case, but I thought of this as a possible solution, so I wanted to put it on the platter of the question. It might be good there.

Edit 2 : Added entry with inverted logic

+2


source to share


2 answers


The first design is most compatible with the domain design . Having an implementation persistence entirely in person to an object means that you can use an object without considering its relational representation. It can only be useful for an object to expose methods that are related to its domain specific behavior, and not low-level CRUD operations. The high level methods are the only API contract that you want to offer to the consumers of this object (i.e. you don't want anyone to be able to delete the car). You can implement complex data relationships and only encode them in one place.

The second construct can be used with a visitor template . The vehicle object knows which parts of it need to be stored, but it does not have a database connection. This way you are passing a car object to a database connection instance. Internally, the db knows how to call the object it gave. Presumably the car implements some db compatible interface.



The third design is useful for implementing the adapter pattern . Each database API is different and each SQL flavor is slightly different. If you have a common API for simple database operations, you can encapsulate those differences and replace with another implementation that knows how to talk to the appropriate brand of database.

+1


source


It's hard to tell because your example is clearly contrived.

The decision should be made based on how often your data model changes. IMHO machines do not often assemble new parts; so I would go with a static model in the database of all the items you want to model and then a table linking all those items together, but that might not be right for what you are actually doing.



I would suggest that you talk to us about the actual data you need to model.

0


source







All Articles