ORBER BY makes the query slow despite a small result set

I have the following tables (unnecessary things removed):

create table Payment (
    id int not null auto_increment,
    status int not null,
    primary key(id)
);
create table Booking (
    id int not null auto_increment,
    paymentId int not null,
    nrOfPassengers int not null,
    primary key(id),
    key paymentFK (paymentId),
    constraint paymentFK foreign key (paymentId) references Payment(id)
);

      

Booking

contains ~ 456k lines and Payment

contains ~ 331k lines. The following query takes 0.06 s and returns 97 rows:

select * from Booking b
join Payment p on b.paymentId = p.id
where p.status = 3

      

If I add a suggestion order by

, the request instead takes 4.4s, almost 100 times slower:

select * from Booking b
join Payment p on b.paymentId = p.id
where p.status = 3
order by b.nrOfPassengers

      

EXPLAIN for the first request:

id, select_type, table, type, possible_keys, key,       key_len, ref,  rows,   Extra
1,  SIMPLE,      p,     ALL,  PRIMARY,       NULL,      NULL,    NULL, 331299, Using where
1,  SIMPLE,      b,     ref,  paymentFK,     paymentFK, 9,       p.id, 1,      Using where

      

and for the second:

id, select_type, table, type, possible_keys, key,       key_len, ref,  rows,   Extra
1,  SIMPLE,      p,     ALL,  PRIMARY,       NULL,      NULL,    NULL, 331299, Using where; Using temporary; Using filesort
1,  SIMPLE,      b,     ref,  paymentFK,     paymentFK, 9,       p.id, 1,      Using where

      

I am using MySQL 5.1.34.

The clause where

used in the query filters out the vast majority of rows from Payment

. My impression is that MySQL is sorting the result set before filtering it with a (highly selective) clause where

. Am I right about that? If so, why is he doing this? I tried to parse both tables, but no change in query plans.

+3


source to share


2 answers


Make sure you have appropriate indexes on your tables first. Assuming you do this, and it's still slower than expected, you can submit your results to a subquery without ordering them, and then add the ORDER BY clause back:

SELECT * 
FROM (
   select * from Booking b
   join Payment p on b.paymentId = p.id
   where p.status = 3
)
ORDER BY nrOfPassengers

      



I'm not sure how much (or if) this helps, since when I go through the execution plan it adds a line, but it looks like it might be faster.

Good luck.

+1


source


I have a suspicion that the problem is that among the unnecessary things that you removed, there is a column TEXT

or BLOB

that makes MySQL go to ondisk to store the intermediate result from the temporary table.

In any case, as seen from the execution plan: For each row from the table, Payment

fetch it from disk, check the condition, if it is true for each corresponding row in Booking

, put the result in a temporary table. Sort the entire table with all data on nrOfPassengers

and output. In case there are fields TEXT

or BLOB

, the staging table is stored and sorted on disk, since MySQL cannot predict the size of the table.

What you can do (as usual) is minimize disk operations. As @ajreal suggested, add an index to the column status

. If it's so selective you won't need any other indices, but if you increase your paymentFK

to (paymentId, nrOfPassengers)

it is even better. Now rewrite the request like this:



SELECT p.*, b.*
FROM (
  select p.id as paymentId, b.id as bookingId
  from Booking b
  join Payment p on b.paymentId = p.id
  where p.status = 3
  order by b.nrOfPassengers
) as ids
JOIN Payment p ON ids.paymentId = p.id
JOIN Booking b ON ids.bookingId = b.id;

      

The data will be displayed in the order of the subquery.

+1


source







All Articles