Why does a query with IN (subquery) take longer than a query with IN (discrete list)

Question

Why does a query with IN (subquery) take longer than a query with IN (discrete list)

It always bothered me why this request

SELECT 
  * 
FROM
  `TABLE` 
WHERE `value` IN 
  (SELECT 
    val 
  FROM
    OTHER_TABLE 
  WHERE `date` < '2014-01-01')

running orders is slower than running both of these queries sequentially

SELECT 
  `val` 
FROM
  OTHER_TABLE 
WHERE `date` < '2014-01-01' 

Result:
+----+
| val |
+-----+
| v1  |
| v2  |
| v3  |
| v7  |
| v12 |
+-----+

and this request:

SELECT 
  * 
FROM
  `TABLE` 
WHERE `value` IN ('v1', 'v2', 'v3', 'v7', 'v12')

+3

mysql subquery correlated-subquery

chiliNUT Apr 25. '15 at 1:00

source to share

1 answer

chiliNUT · Answer 1 · 2015-04-25T01:00:39+0000

From the docs: (emphasis mine)

Optimizing a subquery for is IN

not as efficient as it is for an operator =

or for an operator IN(value_list)

.

A typical case for poor IN

subquery performance is when the subquery returns a small number of rows, but the outer query returns a large number of rows to compare against the result of the subquery.

The problem is that for a statement that uses a subquery IN

, the optimizer rewrites it as a correlated subquery. Consider the following operator, which uses an uncorrelated subquery:

SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);

The optimizer rewrites the operator into a correlated subquery:

SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);

If the inner and outer queries return rows M and N, respectively, then the execution time becomes of order O(M×N)

rather than O(M+N)

as it would be for an uncorrelated subquery.

The implication is that a subquery IN

can be much slower than a query written using an operator IN(value_list)

that displays the same values that the subquery will return.

http://dev.mysql.com/doc/refman/5.7/en/subquery-restrictions.html

Hope this helps anyone who would be curious

Why does a query with IN (subquery) take longer than a query with IN (discrete list)

More articles: