Selecting a random record from a table: why is the first query slower than the second?

I want to select a random record from a large table. After searching, I finally found two solutions:

a:

select id  from `table` where id = (floor(1 + rand() * 2880000));

      

b:

select id  from `table` where id >= (floor(1 + rand() * 2880000)) limit 1;

      

But the first solution (a) is about 40 times slower than the second (b).

After running it many times, I found a stranger problem: the first solution sometimes returns two records.

select id  from `table` where id = (floor(1 + rand() *  2880000));
+---------+
| id      |
+---------+
| 2484024 |
| 1425029 |
+---------+
2 rows in set (1.06 sec)

      

My questions are:

  • Why is the first solution slower than the second?
  • Why did the first solution return two records?

My MySQL version:

mysql> show variables like "%version%";
+-------------------------+-------------------------+
| Variable_name           | Value                   |
+-------------------------+-------------------------+
| innodb_version          | 5.5.43                  |
| protocol_version        | 10                      |
| slave_type_conversions  |                         |
| version                 | 5.5.43-0ubuntu0.12.04.1 |
| version_comment         | (Ubuntu)                |
| version_compile_machine | x86_64                  |
| version_compile_os      | debian-linux-gnu        |
+-------------------------+-------------------------+
7 rows in set (0.04 sec)

      

Thanks for any help.



2 answers


Answers to both questions:

  • The first solution is slower because RAND() is re-evaluated for every record, so the condition is never a constant and MySQL has to check every record in the table. The second solution evaluates it only for as many records as it takes to find one match, then stops because of LIMIT 1. Its condition (>=) is also much easier to satisfy than an exact match.
  • The first solution can return multiple rows because a new random value is calculated for each record and there is no LIMIT clause. By the same logic, it can also return 0 rows.
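
To make the per-row evaluation concrete, here is a small Python simulation. It is only a model of the behaviour described above, not MySQL itself: the function names are made up for illustration, the table is assumed to hold ids 1..n_rows with no gaps, and rows are assumed to be scanned in id order.

```python
import math
import random


def simulate_per_row_equality(n_rows, rng):
    """Model of `WHERE id = FLOOR(1 + RAND() * n_rows)` (hypothetical helper).

    A fresh random target is drawn for every row, and every row is checked.
    Returns the matching ids and the number of rows evaluated.
    """
    matches = []
    evaluations = 0
    for row_id in range(1, n_rows + 1):
        evaluations += 1
        target = math.floor(1 + rng.random() * n_rows)  # new value per row
        if row_id == target:
            matches.append(row_id)
    return matches, evaluations


def simulate_range_with_limit(n_rows, rng):
    """Model of `WHERE id >= FLOOR(1 + RAND() * n_rows) LIMIT 1` (hypothetical helper).

    A fresh random target is still drawn per row, but the scan stops at the
    first row that satisfies the condition.
    """
    evaluations = 0
    for row_id in range(1, n_rows + 1):
        evaluations += 1
        target = math.floor(1 + rng.random() * n_rows)  # new value per row
        if row_id >= target:
            return row_id, evaluations  # LIMIT 1: stop at the first match
    return None, evaluations


if __name__ == "__main__":
    rng = random.Random()
    n_rows = 100_000  # assumption: smaller than the 2,880,000 rows in the question
    for _ in range(3):
        matches, scanned_a = simulate_per_row_equality(n_rows, rng)
        row_id, scanned_b = simulate_range_with_limit(n_rows, rng)
        print(f"(a) matched {len(matches)} row(s) after {scanned_a} evaluations; "
              f"(b) returned id {row_id} after {scanned_b} evaluations")
```

In this model the equality query always evaluates every row, and since each row matches independently with probability 1/n_rows, getting 0, 1, or 2 rows back are all common outcomes. The LIMIT 1 version stops early, but it also favours low ids, because row i satisfies the condition with probability i/n_rows.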


Check out this answer for a better solution.





SELECT 
    a.id
FROM
    tableA a
        INNER JOIN
    (SELECT 
        (ROUND((RAND() * (MAX(id) - MIN(id))) + MIN(id)) - 1) r
    FROM
        tableA) x
WHERE
    a.id > x.r
LIMIT 1;



      


