MySQL subselect performance question?

I need advice on subselect performance in MySQL. For the reason that I cannot change, I cannot use JOIN to create a quesry filter, I can only add another AND clause to WHERE.

What is the shape:

select tasks.*
from tasks
where 
  some criteria
  and task.project_id not in (select id from project where project.is_template = 1);

      

compared with:

select tasks.*
from tasks, project
where
  some criteria
  and task.project_id = project.id and project.is_template <> 1;

      

Note that there are relatively few projects whete is_template = 1, and there may be a large number of projects where is_template <> 1.

Is there another way to achieve the same result without subqueries if I can't change and filter anything?

0


source to share


4 answers


I believe the second one is more efficient as it only requires one choice, but of course you have to EXPLAIN each query and check the results.



EXPLAIN select tasks.*
from tasks
where 
  some criteria
  and task.project_id not in (select id from project where project.is_template = 1);

EXPLAIN select tasks.*
from tasks, project
where
  some criteria
  and task.project_id = project.id and project.is_template <> 1;

      

+5


source


How much of a difference between the two can greatly depend on what the "some criteria" are and what opportunities to use the indexes it provides. But keep in mind that they are not equivalent in terms of results if there are tasks that do not have projects. Second equivalent:



select tasks.*
from tasks
where 
  some criteria
  and task.project_id in (select id from project where project.is_template <> 1);

      

+1


source


I think the first one can scale better:

When you do a join, mysql internally creates a kind of temporary table made up of two tables joined according to the specified join conditions. You are not giving a join condition, so it will create a temporary table with all the tasks listed against all projects. I am fairly certain (but check with the explain tool) that it does this before applying any suggestions.

Result: if there are 10, it will have 10 * 10 rows = 100. You can see how this gets larger as the number grows. Then it applies space for this temporary table.

In contrast, a subquery only selects matching rows from each table.

But if scaling isn't a concern, I don't think it really matters.

0


source


Avoid subqueries like the plague in MySQL versions <6.0 and I doubt you are using 6.0 considering it is still in alpha development. AFAIK, the MySQL optimizer doesn't process subqueries at all. Some important work overhauled the optimizer for 6.0, and subqueries now perform much better, but these changes didn't make it into the 5.0 or 5.1 series.

0


source







All Articles