MySQL: what's the most efficient way to select multiple random rows

I have a fairly large dataset and a query that requires two joins, so the efficiency of the query is very important to me. I need to get 3 random rows from the database that satisfy a condition based on the result of the join. The most obvious solution is described as inefficient here because

[these solutions] need to scan the entire table sequentially (because you need to calculate the random value associated with each row to determine the smallest one), which can be quite slow for medium sized tables.

However, the method suggested by the author there,

SELECT * FROM table WHERE num_value >= RAND() * (SELECT MAX(num_value) FROM table) LIMIT 1

(where num_value is the ID), does not work for me because some IDs may be missing (rows can be deleted by users).

So what would be the most efficient way to extract 3 random rows in my situation?

EDIT: The solution doesn't have to be pure SQL. I am using PHP as well.


2 answers


Adding your RAND() call to the ORDER BY clause should allow you to ignore the gaps in the IDs. Try the following:

SELECT * FROM table WHERE ... ORDER BY RAND() LIMIT 3;

      

Now that the performance issues have been pointed out, your best bet might be something along these lines (using PHP):



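// $pdo is assumed to be an existing PDO connection.
$result = $pdo->query('SELECT MAX(id) FROM table');
$max    = (int) $result->fetchColumn();
$ids    = array();
$rows   = 5; // ask for more than 3 to allow for deleted IDs

for ($i = 0; $i < $rows; $i++) {
    $ids[] = rand(1, $max);
}

// Use one placeholder per ID; a single bound parameter cannot hold a list.
$placeholders = implode(', ', array_fill(0, count($ids), '?'));
$query        = $pdo->prepare("SELECT * FROM table WHERE id IN ($placeholders)");
$query->execute($ids);
$results      = $query->fetchAll(PDO::FETCH_ASSOC);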

      

At this stage, you will be able to select the first 3 results. The only problem with this approach is deleted rows: you may need to either increase $rows or add some logic to run another query if you don't get at least 3 results back.
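The "run another query" logic could be sketched roughly as follows. This is only an illustration under assumptions: pickRandomRows and the $fetchByIds callback are hypothetical names, not part of the answer above. The callback is assumed to run the WHERE id IN (...) query and return the matching rows, each with an 'id' column.

```php
<?php
// Hypothetical helper: keeps drawing batches of random IDs until $want
// distinct rows are found or $maxAttempts batches have been tried.
// $fetchByIds is an assumed callback: it receives an array of IDs and
// returns the matching rows, each an array containing an 'id' key.
function pickRandomRows(int $maxId, int $want, callable $fetchByIds, int $batch = 5, int $maxAttempts = 20): array
{
    $found = array(); // rows collected so far, keyed by id to avoid duplicates

    for ($attempt = 0; $attempt < $maxAttempts && count($found) < $want; $attempt++) {
        $ids = array();
        for ($i = 0; $i < $batch; $i++) {
            $ids[] = mt_rand(1, $maxId);
        }
        // Drop repeated draws and reindex so the list suits positional binding.
        foreach ($fetchByIds(array_values(array_unique($ids))) as $row) {
            $found[$row['id']] = $row;
        }
    }

    // Shuffle before trimming so the result does not depend on fetch order.
    $rows = array_values($found);
    shuffle($rows);

    // May return fewer than $want rows if too few IDs exist.
    return array_slice($rows, 0, $want);
}
```

With PDO, the callback would wrap the prepare/execute/fetchAll sequence. The attempt cap is there so that a very sparse ID range cannot loop forever.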



Since you don't want many results, there are some interesting options using LIMIT and OFFSET.

I'm going to assume a column id that is unique and sortable.

The first step is to execute COUNT(id) and then select 3 random numbers from 0 to COUNT(id) - 1 in PHP. (How to do this is a separate question, and the best approach depends on the number of rows and how many you want.)
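For a small number of offsets, one possible way to pick them is sketched below. randomOffsets is a hypothetical helper name, and the count of 2235 is only an illustrative value. In practice the count would come from the COUNT(id) query, run with the same conditions as the main query.

```php
<?php
// Hypothetical helper: returns $n distinct random offsets in [0, $count - 1].
// Reasonable when $n is small relative to $count; for a large $n, shuffling
// range(0, $count - 1) and slicing may be simpler.
function randomOffsets(int $count, int $n): array
{
    $n = min($n, $count); // cannot pick more distinct offsets than there are rows
    $picked = array();

    while (count($picked) < $n) {
        // Array keys are unique, so a repeated draw simply overwrites itself.
        $picked[mt_rand(0, $count - 1)] = true;
    }

    return array_keys($picked);
}

// Illustrative value only; use the real COUNT(id) result here.
$offsets = randomOffsets(2235, 3);
```

The resulting $offsets array has the same shape as the array(0, 15, 2234) example used in the second step.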

The second step has two options. Suppose you selected the random numbers 0, 15, and 2234. Either use a loop in PHP:



// $offsets = array(0, 15, 2234);
foreach ($offsets as $offset) {
    $rows[] = execute_sql('SELECT ... ORDER BY id LIMIT 1 OFFSET ?', $offset);
}

      

or build a UNION. Note: this requires a subselect because we are using ORDER BY.

// $offsets = array(0, 15, 2234);
$query = '';
foreach ($offsets as $index => $offset) {
    if ($query) $query .= ' UNION ';
    $query .= 'SELECT * FROM (SELECT ... ORDER BY id LIMIT 1 OFFSET ?) Sub'.$index;
}
$rows = execute_sql($query, $offsets);

      
