Which approach is better from a database performance standpoint — one table group by operation or multiple table serializable transactions?

I am developing a redundancy system based on SQL database (no limitation to specific DB).

Scheme:
enter image description here

Workflow:
enter image description here

I now use a special view "actual_reservations" based on the cusom-group with filtering logic to identify the actually hosted reservations, such as:

CREATE VIEW `reserv-io`.`actual_reserved_resources` AS
  SELECT
    `r`.`id`,
    `a`.`resource_id`,
    `a`.`type_id`,
    `a`.`status_id`,
    `a`.`reservation_start`,
    `a`.`reservation_end`
  FROM `reserv-io`.`actions` `a`
    JOIN (SELECT
            `r`.`id`      `id`,
            MAX(`a`.`id`) `action_id`
          FROM `reserv-io`.`reservations` `r`
            JOIN `reserv-io`.`actions` `a` ON `r`.`id` = `a`.`reservation_id`
          WHERE (`r`.`id` NOT IN
                 (SELECT `r`.`id` `id`
                  FROM `reserv-io`.`reservations` `r`
                    JOIN `reserv-io`.`actions` `a` ON `r`.`id` = `a`.`reservation_id`
                  WHERE (`a`.`status_id` IN
                         (SELECT `id`
                          FROM `reserv-io`.`reservation_statuses`
                          WHERE `name` = 'Canceled')))
                 AND `a`.`status_id` IN
                     (SELECT `id`
                      FROM `reserv-io`.`reservation_statuses`
                      WHERE `name` = 'Approved' OR `name` = 'Accepted'))
          GROUP BY `r`.`id`) `b`
      ON `a`.`id` = `b`.`action_id`
    JOIN `reserv-io`.`reservations` `r` ON `r`.`id` = `a`.`reservation_id`;
      

Run codeHide result


And to determine if there is any overlap with other caveats with a procedure like this:

CREATE PROCEDURE HAS_OVERLAPPING_RESERVATION_WITH_TYPE_SELF_CHECK(
  IN  RESERVATION      BIGINT,
  IN  RESOURCE         INT,
  IN  RESERVATION_TYPE INT,
  IN  STARTS_AT        DATETIME,
  IN  ENDS_AT          DATETIME,
  OUT RESULT           BIT)
  BEGIN
    SELECT CASE WHEN EXISTS(
        SELECT *
        FROM actual_reserved_resources r
        WHERE r.resource_id = RESOURCE
              AND r.type_id = RESERVATION_TYPE
              AND r.reservation_start < ENDS_AT
              AND r.reservation_end > STARTS_AT
              AND r.id <> RESERVATION)
      THEN TRUE
           ELSE FALSE END
    INTO RESULT;
  END$$
      

Run codeHide result


I already have good indexes on the "actions" table, but I'm not sure if my single-table approach for all backup actions really scales. I've heard of an approach where I store all pending and actually approved reservations in separate tables and synchronize them manually in a transaction.

My question is which approach from your POV is better in terms of database performance, scalability, and good data design?

+3


source to share


2 answers


I would suggest that:



  • Don't take anyone answering here for granted; instead, comparing real data to measure actual performance.

  • Whatever the answer, it won't matter as much as you fear. This, of course, will not differentiate between scalable and non-scalable systems.

  • If it turns out that there actually is some merit in some sort of pending and approving record segregation, you can achieve this with line splitting (check it out) so you can save everything to one logical table and avoid messing up your schema in performance name.

+4


source


You do not specify the size of your dataset or the size of the system on which your queries run. Until you get large datasets, it probably doesn't matter much. My personal experience with large datasets is that joins of more than two tables can be very slow when the system starts swapping out to disk rather than keeping things in memory. On Oracle many years ago, I had a special query that had to touch six tables. One request took ~ six hours. Since the separate queries were against two tables at a time, this took 15 minutes. I had another query in MySQL that touched a number of tables. The breakdown into smaller requests reduced the total runtime from seven hours to seven minutes.



0


source







All Articles