Computing Cascading Pricing Rules in SQL
I am building a tool that lets people post items in categories. Before an item is posted, a price calculation must be done to determine the price charged to the user who submitted the item.
I have come up with the idea of using a table called price_rule to define the rules. It would look something like this:
+----------+
| Field    |
+----------+
| id       |
| user     |
| category |
| price    |
+----------+
Typically, this table will always have a fallback row. This row determines the price when no other rule applies in a given context. The row looks like this:
+----+--------+----------+-------+
| id | user   | category | price |
+----+--------+----------+-------+
| 1  | NULL   | NULL     | 10.00 |
+----+--------+----------+-------+
With this row in place and no other rows, a post will always cost $10.00.
Now suppose an extra row is added. The table now looks like this:
+----+--------+----------+-------+
| id | user   | category | price |
+----+--------+----------+-------+
| 1  | NULL   | NULL     | 10.00 |
| 2  | Bob    | NULL     | 8.00  |
+----+--------+----------+-------+
With this rule added, Bob pays $8.00 for each post in any category, and all other users pay $10.00 in any category.
Now suppose we add a few rows that name both a specific user and a specific category:
+----+--------+----------+-------+
| id | user   | category | price |
+----+--------+----------+-------+
| 1  | NULL   | NULL     | 10.00 |
| 2  | Bob    | NULL     | 8.00  |
| 3  | Bob    | Bicycles | 9.50  |
| 4  | Meghan | Bicycles | 5.00  |
+----+--------+----------+-------+
When Bob posts in the Bicycles category, his price is $9.50. In any other category, Bob pays $8.00.
Meghan now pays $5.00 for a post in the Bicycles category and $10.00 in every other category.
Every other user posting in any category (including Bicycles) pays the default price of $10.00.
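For anyone who wants to reproduce the example, the table and sample rows above could be created roughly like this. The column types are assumptions for illustration only (the real schema differs), and the double-quoted "user" is PostgreSQL style; MySQL would need backticks or ANSI_QUOTES mode.

```sql
-- Sketch of the example table; column types are assumed, not the real schema.
CREATE TABLE price_rule (
    id       INTEGER PRIMARY KEY,
    "user"   VARCHAR(50),             -- NULL means "any user"
    category VARCHAR(50),             -- NULL means "any category"
    price    DECIMAL(10, 2) NOT NULL
);

-- The four sample rules from the tables above.
INSERT INTO price_rule (id, "user", category, price) VALUES
    (1, NULL,     NULL,       10.00),  -- fallback rule
    (2, 'Bob',    NULL,        8.00),  -- Bob, any category
    (3, 'Bob',    'Bicycles',  9.50),  -- Bob, Bicycles only
    (4, 'Meghan', 'Bicycles',  5.00);  -- Meghan, Bicycles only
```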
In the real world, this table can have several hundred rows, which allows precise control over the posting price.
In case you're wondering, this concept is driven by business reasons: the cost of creating posts in this system is not always deterministic. Instead, it depends on the business relationship with the user creating the post.
The problem arises when trying to design a query that returns the single most relevant pricing rule that applies to a post. When I query this table, I will have the following information available: user and category. I have tried a number of queries and read about SQL concepts like IFNULL and COALESCE, but I was unable to nail down the correct query.
Another complication is that in our real application the price_rule table has an extra column that also factors into the price, but I left this detail out to simplify the example used in this question. I expect the same solution to apply whether 2 or 3 columns are used for the calculation.
Please note that there are restrictions in the application code that prohibit the addition of duplicate rules.
We also use Doctrine ORM and Doctrine DBAL, so if your query works out of the box with Query Builder or DQL, your answer will be more valuable. Plain SQL solutions are also acceptable, as long as they work in both PostgreSQL and MySQL.
While I would like to avoid this as much as possible, a valid solution could also involve fetching every row from the price_rule table and determining the applicable rule in application code. If your solution is based on this concept, please include the appropriate pseudocode.
create or replace function price_rule(
    _user varchar(50), _category varchar(50)
) returns setof price_rule as $$
    select id, "user", category, price
    from (
        -- most specific: rule for this user and this category
        select *, 0 as priority
        from price_rule
        where category = _category and "user" = _user
        union
        -- rule for this user in any category
        select *, 1 as priority
        from price_rule
        where "user" = _user and category is null
        union
        -- fallback: default rule for everyone
        select *, 2 as priority
        from price_rule
        where "user" is null and category is null
    ) s
    order by priority
    limit 1
    ;
$$ language sql;
I wrapped the query above in a function to make it easier to test, but you can inline it if you prefer.
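A quick way to try the function, assuming the four sample rows from the question are loaded into price_rule ('Books' is just a made-up category with no rule of its own):

```sql
-- Bob in Bicycles: the user + category rule (id 3) should win
select * from price_rule('Bob', 'Bicycles');   -- expected price: 9.50

-- Bob in a category without a specific rule: his user-level rule (id 2) applies
select * from price_rule('Bob', 'Books');      -- expected price: 8.00

-- Meghan outside Bicycles: only the default rule (id 1) applies
select * from price_rule('Meghan', 'Books');   -- expected price: 10.00
```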
For MySQL, I don't remember the function syntax, so here is the raw query:
select id, `user`, category, price
from (
    -- most specific: rule for this user and this category
    select *, 0 as priority
    from price_rule
    where category = 'Bicycles' and `user` = 'Bob'
    union
    -- rule for this user in any category
    select *, 1 as priority
    from price_rule
    where `user` = 'Bob' and category is null
    union
    -- fallback: default rule for everyone
    select *, 2 as priority
    from price_rule
    where `user` is null and category is null
) s
order by priority
limit 1;
A simple way would be to score each row by how well it fits and select the row with the highest score. For Bob / Bicycles, something like:
SELECT price,
       CASE WHEN "user" = 'Bob' THEN 2
            WHEN "user" IS NULL THEN 0
            ELSE -10 END +
       CASE WHEN "category" = 'Bicycles' THEN 1
            WHEN "category" IS NULL THEN 0
            ELSE -10 END AS score
FROM price_rule
ORDER BY score DESC LIMIT 1;
This gives 2 points for a user match and 1 point for a category match. Any mismatch gives -10, which lets the default row win if nothing else fits.
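To make the scoring concrete, here is a sketch that lists every row with its score for Bob / Bicycles. It assumes the four sample rows from the question and that the table is named price_rule. The identifier quoting is PostgreSQL style; MySQL would need backticks or ANSI_QUOTES mode.

```sql
-- Same scoring expression, but without LIMIT so every row's score is visible.
SELECT id, price,
       CASE WHEN "user" = 'Bob' THEN 2
            WHEN "user" IS NULL THEN 0
            ELSE -10 END
     + CASE WHEN category = 'Bicycles' THEN 1
            WHEN category IS NULL THEN 0
            ELSE -10 END AS score
FROM price_rule
ORDER BY score DESC;

-- With the question's sample rows this should give:
--   id 3 (Bob, Bicycles)     score  3   price  9.50
--   id 2 (Bob, NULL)         score  2   price  8.00
--   id 1 (NULL, NULL)        score  0   price 10.00
--   id 4 (Meghan, Bicycles)  score -9   price  5.00
```

Because the user match is worth more than the category match, a user-level rule outranks any rule that matches on category alone.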
If you have a large number of rows, add a WHERE clause that keeps only rows that can match (i.e. user and category each either match or are NULL), and then apply the ordering to the filtered rows only.
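A minimal sketch of that filtered variant, assuming the table is price_rule as in the question and that 'Bob' and 'Bicycles' stand in for the values the application would bind as parameters. Once mismatching rows are filtered out, the -10 penalty is no longer needed:

```sql
SELECT price
FROM price_rule
-- keep only rules that can apply: each column either matches or is a wildcard (NULL)
WHERE ("user" = 'Bob' OR "user" IS NULL)
  AND (category = 'Bicycles' OR category IS NULL)
-- rank the survivors: a user match outweighs a category match
ORDER BY
    CASE WHEN "user" = 'Bob' THEN 2 ELSE 0 END
  + CASE WHEN category = 'Bicycles' THEN 1 ELSE 0 END DESC
LIMIT 1;
```

The double-quoted "user" is PostgreSQL style; in MySQL, use backticks or enable ANSI_QUOTES.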