MySQL: creating a temporary table and then joining is faster than a LEFT JOIN

I have a LEFT JOIN that is very expensive:

    select X.c1, COUNT(Y.c3) from X LEFT JOIN Y on X.c1=Y.c2 group by X.c1; 

      

After 20+ minutes it still has not finished. But I want all the rows in X, so I really do need the LEFT JOIN.

It looks like I can hack my way around this and get the result set I'm looking for in less than two minutes by using a temporary table. First, I trim table Y so that it contains only the rows that take part in the join.

    CREATE TEMPORARY TABLE IF NOT EXISTS table2 AS 
    (select X.c1 as t, COUNT(Y.c2) as c from X 
    INNER JOIN Y where X.c1=Y.c2 group by X.c1);

    select X.c1, table2.c from X LEFT JOIN table2 on X.c1 = table2.t;

      

This finishes in two minutes.

My questions:

1) Are they equivalent?

2) Why is the second one so much faster? Why doesn't MySQL do this kind of optimization itself, that is, do I really need to resort to this kind of MySQL hack?

EDIT: More info: c1 and c2 are BIGINTs. c1 is unique, but there can be many c2 values that all point to the same c1. As far as I know, I have not indexed the tables. X.c1 is the id column that Y.c2 refers to.



2 answers


Try indexing X.c1 and Y.c2 and run your original query.
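As a rough sketch of what that could look like (the index names `idx_x_c1` and `idx_y_c2` are made up for illustration, and if X.c1 is already the primary key it is indexed and the first statement is not needed):

```sql
-- Hypothetical index names; adjust to your schema.
-- X.c1 is described as unique, so a unique index is assumed to be valid here.
CREATE UNIQUE INDEX idx_x_c1 ON X (c1);

-- Y.c2 can repeat (many rows point to the same c1), so this is a plain index.
CREATE INDEX idx_y_c2 ON Y (c2);

-- Then re-run the original query unchanged.
select X.c1, COUNT(Y.c3) from X LEFT JOIN Y on X.c1=Y.c2 group by X.c1;
```

Building the indexes on large tables takes time of its own, so include that when comparing against the temporary-table version.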



It's hard to tell why your first query is slower without indexes unless you compare the query plans of both queries (you can get the query plan by putting EXPLAIN at the beginning of a query), but I suspect it is because the second table contains many rows that have no corresponding row in the first table.
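To get the plans, something like the following should do. It only prefixes the queries from the question with EXPLAIN; the second statement explains the SELECT that feeds the temporary table rather than the CREATE itself.

```sql
-- Plan for the original LEFT JOIN query.
EXPLAIN
select X.c1, COUNT(Y.c3) from X LEFT JOIN Y on X.c1=Y.c2 group by X.c1;

-- Plan for the INNER JOIN that populates the temporary table.
EXPLAIN
select X.c1 as t, COUNT(Y.c2) as c from X
INNER JOIN Y where X.c1=Y.c2 group by X.c1;
```

In the output, the `type`, `key`, `rows` and `Extra` columns are the ones to compare. With no indexes, expect `type` to show `ALL` (a full table scan) for both tables.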



If X.c1 is unique, I would suggest writing the query as:

    select X.c1,
           (select COUNT(Y.c3)
            from Y
            where  X.c1 = Y.c2
           )
    from X;

      



For this query, you need an index on Y(c2, c3).
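A minimal sketch of that index, with `idx_y_c2_c3` as a made-up name:

```sql
-- Hypothetical index name; the column order (c2 first, then c3) matters.
-- c2 is the lookup column in the subquery's WHERE clause, and including c3
-- lets COUNT(Y.c3) be answered from the index without reading the table rows.
CREATE INDEX idx_y_c2_c3 ON Y (c2, c3);
```

If you run EXPLAIN on the query afterwards, seeing `Using index` in the `Extra` column for Y would suggest the subquery is being served from the index alone.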

The reason the left join may take longer is that many rows do not match. In that case, the group by aggregates over many more rows than is really necessary. And no, MySQL doesn't attempt this type of optimization.







