MySQL SUM returns wrong totals in a query with multiple LEFT JOINs

I have 5 tables that I would like to join together. Tables: visitors, offers, contracts1, contracts2 and contracts3.

QUERY

SELECT 
        count(DISTINCT visitors.ID) as visitors, 
        sum(
        CASE
        WHEN offers.ACTIVE = 1 THEN 1
        ELSE 0
        END) as offers, 
        count(contracts1.ID) as contracts1, sum(contracts1.PRICE) as sum_contracts1, 
        count(contracts2.ID) contracts2, 
        sum(
        CASE
        WHEN contracts2.PAYMENT = 'YEARLY' THEN contracts2.PRICE
        WHEN contracts2.PAYMENT = 'TWICE' THEN contracts2.PRICE*2
        ELSE contracts2.PRICE*4
        END) as sum_contracts2,
        count(contracts3.ID) as contracts3, sum(contracts3.PRICE) as sum_contracts3
        FROM visitors 
        LEFT JOIN offers ON offers.VISITOR_ID = visitors.ID AND (offers.IP > 100 OR offers.IP < 0)
        LEFT JOIN contracts1 ON 
        (offers.ID = contracts1.ID_OFFER)
        LEFT JOIN contracts2 ON 
        (offers.ID = contracts2.ID_OFFER)
        LEFT JOIN contracts3 ON 
        (offers.ID = contracts3.ID_OFFER)
        WHERE  visitors.TIME >= '2017-01-01 00:00:00' AND visitors.TIME <= '2017-05-25 23:59:59'

      

The problem is that contracts1, contracts2 and contracts3 have no common column to join on. So instead of 20 rows for contracts1, 30 for contracts2 and 50 for contracts3, I get every combination of them, because they are all joined through the visitors and offers tables.

A GROUP BY at the end of the query usually solves this kind of problem, but if I group by one of these tables (or all of them), I get MULTIPLE ROWS instead of the single row I want. It also breaks the parts where I count visitors by ID and offers by ID. I can use DISTINCT inside the count() parts of the SELECT, but not inside sum(), because contract prices can be equal even when the IDs differ (for example, 2 chocolates are 2 rows with different IDs but the same price of $10 each).

So my question is:

Is there a way to SUM only the PRICE values of the contracts1, contracts2 and contracts3 rows with DISTINCT IDs, so that duplicates are not added up? And is this possible without creating a VIEW?

I also tried a GROUP BY inside a LEFT JOIN, but once I LEFT JOINed all 3 contract tables together I still ended up with duplicates, even though I had grouped them beforehand.

An example of the expected result:

In the time period that I have outlined above, I would expect: 80 visitors who have 35 offers and 5 contracts1 with the amount of 1000 euros, 12 contracts2 with the amount of 686 euros and 3 contracts3 with the amount of 12 euros. It is ONE ROW with 8 data columns.

Instead of the expected result, I got: 80 visitors, 35 offers, 180 contracts1 (the sum is also wrong), 180 contracts2 (the sum is also wrong), 180 contracts3 (the sum is also wrong).
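
A minimal sketch of the effect, using hypothetical sample data (one offer with 2, 3 and 1 contracts) and only the columns that matter here. The table names mirror the real ones, so it is meant to be run in an empty scratch schema:

```sql
-- Hypothetical data, not taken from the real tables.
CREATE TABLE offers     (ID INT PRIMARY KEY);
CREATE TABLE contracts1 (ID INT PRIMARY KEY, ID_OFFER INT, PRICE DECIMAL(10,2));
CREATE TABLE contracts2 (ID INT PRIMARY KEY, ID_OFFER INT, PRICE DECIMAL(10,2));
CREATE TABLE contracts3 (ID INT PRIMARY KEY, ID_OFFER INT, PRICE DECIMAL(10,2));

INSERT INTO offers     VALUES (1);
INSERT INTO contracts1 VALUES (1, 1, 10), (2, 1, 10);
INSERT INTO contracts2 VALUES (1, 1, 5), (2, 1, 5), (3, 1, 5);
INSERT INTO contracts3 VALUES (1, 1, 7);

-- Each LEFT JOIN multiplies the rows already produced for the offer:
-- 2 x 3 x 1 = 6 joined rows for offer 1.
SELECT count(contracts1.ID)  AS contracts1,
       sum(contracts1.PRICE) AS sum_contracts1,
       count(contracts2.ID)  AS contracts2,
       count(contracts3.ID)  AS contracts3
FROM offers
LEFT JOIN contracts1 ON offers.ID = contracts1.ID_OFFER
LEFT JOIN contracts2 ON offers.ID = contracts2.ID_OFFER
LEFT JOIN contracts3 ON offers.ID = contracts3.ID_OFFER;
-- Every count is 6, and sum_contracts1 is 60.00 instead of 20.00.
```

The same multiplication would explain the numbers above: 5 x 12 x 3 = 180, which is what the joins produce if all of those contracts hang off a single offer.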



2 answers


With CTEs (supported since MariaDB 10.2.1 and MySQL 8.0), I would write something like this:

WITH v AS (
    SELECT ID as VISITOR_ID
    FROM visitors 
    WHERE visitors.TIME >= '2017-01-01 00:00:00'
      AND visitors.TIME <= '2017-05-25 23:59:59'
), o AS (
    SELECT offers.ID as ID_OFFER
    FROM v
    JOIN offers USING(VISITOR_ID)
    WHERE offers.ACTIVE = 1
      AND (offers.IP > 100 OR offers.IP < 0)
), c1 AS (
    SELECT count(*) as contracts1, sum(contracts1.PRICE) as sum_contracts1
    FROM o JOIN contracts1 USING(ID_OFFER)
), c2 AS (
    SELECT
        count(*) contracts2, 
        sum(CASE contracts2.PAYMENT
            WHEN 'YEARLY' THEN contracts2.PRICE
            WHEN 'TWICE'  THEN contracts2.PRICE*2
            ELSE contracts2.PRICE*4
        END) as sum_contracts2
    FROM o JOIN contracts2 USING(ID_OFFER)
), c3 AS (
    SELECT count(*) as contracts3, sum(contracts3.PRICE) as sum_contracts3
    FROM o JOIN contracts3 USING(ID_OFFER)
)
    SELECT c1.*, c2.*, c3.*,
        (SELECT count(*) FROM v) as visitors,
        (SELECT count(*) FROM o) as offers
    FROM c1, c2, c3;

      



Without CTEs, you can rewrite it using temporary tables:

CREATE TEMPORARY TABLE v AS
    SELECT ID as VISITOR_ID
    FROM visitors 
    WHERE visitors.TIME >= '2017-01-01 00:00:00'
      AND visitors.TIME <= '2017-05-25 23:59:59';

CREATE TEMPORARY TABLE o AS
    SELECT offers.ID as ID_OFFER
    FROM v
    JOIN offers USING(VISITOR_ID)
    WHERE offers.ACTIVE = 1
      AND (offers.IP > 100 OR offers.IP < 0);

CREATE TEMPORARY TABLE c1 AS
    SELECT count(*) as contracts1, sum(contracts1.PRICE) as sum_contracts1
    FROM o JOIN contracts1 USING(ID_OFFER);

CREATE TEMPORARY TABLE c2 AS
    SELECT
        count(*) contracts2, 
        sum(CASE contracts2.PAYMENT
            WHEN 'YEARLY' THEN contracts2.PRICE
            WHEN 'TWICE'  THEN contracts2.PRICE*2
            ELSE contracts2.PRICE*4
        END) as sum_contracts2
    FROM o JOIN contracts2 USING(ID_OFFER);

CREATE TEMPORARY TABLE c3 AS
    SELECT count(*) as contracts3, sum(contracts3.PRICE) as sum_contracts3
    FROM o JOIN contracts3 USING(ID_OFFER);

SELECT c1.*, c2.*, c3.*,
    (SELECT count(*) FROM v) as visitors,
    (SELECT count(*) FROM o) as offers
FROM c1, c2, c3;
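
If neither CTEs nor temporary tables are an option, the same idea can probably be expressed as a single statement with derived tables. This is only a sketch against the tables and columns shown in the question. The aliases c1, c2, c3, cnt and total are names I made up. It follows the question's query rather than the versions above, so contracts are counted for every offer that passes the IP filter, not only for active ones. The important part is that each contract table is grouped by ID_OFFER before it is joined, so it contributes at most one row per offer and cannot multiply the others:

```sql
SELECT
    count(DISTINCT visitors.ID)                        AS visitors,
    sum(CASE WHEN offers.ACTIVE = 1 THEN 1 ELSE 0 END) AS offers,
    coalesce(sum(c1.cnt), 0) AS contracts1, sum(c1.total) AS sum_contracts1,
    coalesce(sum(c2.cnt), 0) AS contracts2, sum(c2.total) AS sum_contracts2,
    coalesce(sum(c3.cnt), 0) AS contracts3, sum(c3.total) AS sum_contracts3
FROM visitors
LEFT JOIN offers
       ON offers.VISITOR_ID = visitors.ID
      AND (offers.IP > 100 OR offers.IP < 0)
-- One row per offer: number of contracts1 rows and their total price.
LEFT JOIN (
    SELECT ID_OFFER, count(*) AS cnt, sum(PRICE) AS total
    FROM contracts1
    GROUP BY ID_OFFER
) AS c1 ON c1.ID_OFFER = offers.ID
-- Same for contracts2, with the payment multiplier applied per contract.
LEFT JOIN (
    SELECT ID_OFFER, count(*) AS cnt,
           sum(CASE PAYMENT
                   WHEN 'YEARLY' THEN PRICE
                   WHEN 'TWICE'  THEN PRICE*2
                   ELSE PRICE*4
               END) AS total
    FROM contracts2
    GROUP BY ID_OFFER
) AS c2 ON c2.ID_OFFER = offers.ID
LEFT JOIN (
    SELECT ID_OFFER, count(*) AS cnt, sum(PRICE) AS total
    FROM contracts3
    GROUP BY ID_OFFER
) AS c3 ON c3.ID_OFFER = offers.ID
WHERE visitors.TIME >= '2017-01-01 00:00:00'
  AND visitors.TIME <= '2017-05-25 23:59:59';
```

After these joins there is one row per visitor/offer pair, which is why count(DISTINCT visitors.ID) is still needed for visitors while the contract columns can be summed directly. I have not run this against real data, so treat it as a starting point.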

      



Just a proof of concept in which I don't account for the time and activity filters or the payment type, but couldn't it be something like this?

SELECT
   visitors.ID as VISITOR_ID,
   SUM(CASE WHEN TYPE='contract1' THEN 1 else 0 END) as c1_count,
   SUM(CASE WHEN TYPE='contract1' THEN PRICE else 0 END) as c1_total_price,
   SUM(CASE WHEN TYPE='contract2' THEN 1 else 0 END) as c2_count,
   SUM(CASE WHEN TYPE='contract2' THEN PRICE else 0 END) as c2_total_price,
   SUM(CASE WHEN TYPE='contract3' THEN 1 else 0 END) as c3_count,
   SUM(CASE WHEN TYPE='contract3' THEN PRICE else 0 END) as c3_total_price 
FROM (
    -- Stack the three contract tables; TYPE records where each row came from.
    -- UNION ALL keeps every row and skips the duplicate check of plain UNION.
    (SELECT 'contract1' as TYPE, ID, PRICE, ID_OFFER, PAYMENT FROM contracts1) 
    UNION ALL
    (SELECT 'contract2' as TYPE, ID, PRICE, ID_OFFER, PAYMENT FROM contracts2)
    UNION ALL
    (SELECT 'contract3' as TYPE, ID, PRICE, ID_OFFER, PAYMENT FROM contracts3)
 ) as all_contracts 
 JOIN offers on offers.ID = all_contracts.ID_OFFER
 JOIN visitors on visitors.ID = offers.VISITOR_ID
 GROUP BY visitors.ID

      



The idea is that you first combine the different contracts into one result set, tagging each row with its origin in a column called "TYPE" (that is the purpose of the UNION queries). Once you have a table in which each contract appears exactly once, getting the desired result is quite simple. I only showed how to get the count and the total for each type of contract. The final query will of course be a little more complicated, but the basic idea should be the same.
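
To show where the missing filters and the payment multiplier could go, here is a sketch that returns a single row for the whole period. It assumes only the columns that appear in the question. AMOUNT is a column alias I introduced for the per-contract price after the multiplier, and PAYMENT is read from contracts2 only:

```sql
SELECT
    sum(CASE WHEN all_contracts.TYPE = 'contract1' THEN 1 ELSE 0 END)      AS contracts1,
    sum(CASE WHEN all_contracts.TYPE = 'contract1' THEN AMOUNT ELSE 0 END) AS sum_contracts1,
    sum(CASE WHEN all_contracts.TYPE = 'contract2' THEN 1 ELSE 0 END)      AS contracts2,
    sum(CASE WHEN all_contracts.TYPE = 'contract2' THEN AMOUNT ELSE 0 END) AS sum_contracts2,
    sum(CASE WHEN all_contracts.TYPE = 'contract3' THEN 1 ELSE 0 END)      AS contracts3,
    sum(CASE WHEN all_contracts.TYPE = 'contract3' THEN AMOUNT ELSE 0 END) AS sum_contracts3
FROM (
    -- One row per contract, tagged with its source table.
    SELECT 'contract1' AS TYPE, ID_OFFER, PRICE AS AMOUNT
    FROM contracts1
    UNION ALL
    -- The payment multiplier is applied here, before the rows are stacked.
    SELECT 'contract2', ID_OFFER,
           CASE PAYMENT
               WHEN 'YEARLY' THEN PRICE
               WHEN 'TWICE'  THEN PRICE*2
               ELSE PRICE*4
           END
    FROM contracts2
    UNION ALL
    SELECT 'contract3', ID_OFFER, PRICE
    FROM contracts3
) AS all_contracts
JOIN offers   ON offers.ID = all_contracts.ID_OFFER
             AND (offers.IP > 100 OR offers.IP < 0)
JOIN visitors ON visitors.ID = offers.VISITOR_ID
WHERE visitors.TIME >= '2017-01-01 00:00:00'
  AND visitors.TIME <= '2017-05-25 23:59:59';
```

Because these are inner joins starting from the contracts, visitors and offers without any contract drop out, so the visitor and offer counts would still need their own subqueries. If no contract matches at all, the sums come back as NULL rather than 0.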

But even though you said you don't want to use a view, I would ask you to give it a try. I have a feeling that putting this "all_contracts" set, joined to offers and visitors, into a view would improve performance, if that is your concern rather than keeping the query from getting too ugly. This applies mainly when you want statistics for only one visitor or want to filter further (by time, activity, etc.), since unnecessary rows won't be materialized. But this is just an impression, since I haven't tried the query on a larger dataset; you can play with it.
