ORBER BY makes the query slow despite a small result set
I have the following tables (unnecessary things removed):
create table Payment (
id int not null auto_increment,
status int not null,
primary key(id)
);
create table Booking (
id int not null auto_increment,
paymentId int not null,
nrOfPassengers int not null,
primary key(id),
key paymentFK (paymentId),
constraint paymentFK foreign key (paymentId) references Payment(id)
);
Booking
contains ~ 456k lines and Payment
contains ~ 331k lines. The following query takes 0.06 s and returns 97 rows:
select * from Booking b
join Payment p on b.paymentId = p.id
where p.status = 3
If I add a suggestion order by
, the request instead takes 4.4s, almost 100 times slower:
select * from Booking b
join Payment p on b.paymentId = p.id
where p.status = 3
order by b.nrOfPassengers
EXPLAIN for the first request:
id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra
1, SIMPLE, p, ALL, PRIMARY, NULL, NULL, NULL, 331299, Using where
1, SIMPLE, b, ref, paymentFK, paymentFK, 9, p.id, 1, Using where
and for the second:
id, select_type, table, type, possible_keys, key, key_len, ref, rows, Extra
1, SIMPLE, p, ALL, PRIMARY, NULL, NULL, NULL, 331299, Using where; Using temporary; Using filesort
1, SIMPLE, b, ref, paymentFK, paymentFK, 9, p.id, 1, Using where
I am using MySQL 5.1.34.
The clause where
used in the query filters out the vast majority of rows from Payment
. My impression is that MySQL is sorting the result set before filtering it with a (highly selective) clause where
. Am I right about that? If so, why is he doing this? I tried to parse both tables, but no change in query plans.
source to share
Make sure you have appropriate indexes on your tables first. Assuming you do this, and it's still slower than expected, you can submit your results to a subquery without ordering them, and then add the ORDER BY clause back:
SELECT *
FROM (
select * from Booking b
join Payment p on b.paymentId = p.id
where p.status = 3
)
ORDER BY nrOfPassengers
I'm not sure how much (or if) this helps, since when I go through the execution plan it adds a line, but it looks like it might be faster.
Good luck.
source to share
I have a suspicion that the problem is that among the unnecessary things that you removed, there is a column TEXT
or BLOB
that makes MySQL go to ondisk to store the intermediate result from the temporary table.
In any case, as seen from the execution plan: For each row from the table, Payment
fetch it from disk, check the condition, if it is true for each corresponding row in Booking
, put the result in a temporary table. Sort the entire table with all data on nrOfPassengers
and output. In case there are fields TEXT
or BLOB
, the staging table is stored and sorted on disk, since MySQL cannot predict the size of the table.
What you can do (as usual) is minimize disk operations. As @ajreal suggested, add an index to the column status
. If it's so selective you won't need any other indices, but if you increase your paymentFK
to (paymentId, nrOfPassengers)
it is even better. Now rewrite the request like this:
SELECT p.*, b.*
FROM (
select p.id as paymentId, b.id as bookingId
from Booking b
join Payment p on b.paymentId = p.id
where p.status = 3
order by b.nrOfPassengers
) as ids
JOIN Payment p ON ids.paymentId = p.id
JOIN Booking b ON ids.bookingId = b.id;
The data will be displayed in the order of the subquery.
source to share