How do I select N random rows using pure SQL?

Question

How do I combine "How to request a random row in SQL?" and "Multiple random values in SQL Server 2005" to select N random rows using a single pure-SQL query? Ideally, I would like to avoid stored procedures. Is this even possible?

CLARIFICATIONS

"Pure SQL" means as close to the ANSI/ISO standard as possible.
The solution must be "reasonably efficient". The suggested ORDER BY RAND() may work, but as others have pointed out, it is not feasible for medium-sized tables.

+8

sql random

Gili Dec 29. '09 at 1:20

5 answers

The answer to your question is in the second link:

SELECT * FROM table ORDER BY RAND() LIMIT 1

Just change the limit and/or rewrite it for SQL Server:

SELECT TOP 1 * FROM table ORDER BY newid()

Now this strictly answers your question, but you really shouldn't use this solution. Just try it on a big table and you will see what I mean.

If your key space is sequential, with no holes or very few, and you are not too concerned that the holes give some rows a slightly higher chance of being picked than others, then you can use a variation: calculate a random key between 1 and the highest key in your table, then retrieve the first row whose key is equal to or greater than that number. You only need the "greater than" part if there are holes in your key space.

This SQL is left as an exercise for the reader.
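One way the random-key lookup described above might look, as a minimal sketch rather than the answerer's own SQL. It runs against an in-memory SQLite database through Python so it can be tried directly. The table `items`, its columns, and the helper `pick_random_rows` are hypothetical names chosen for illustration, and the `random()` arithmetic is SQLite-specific; other engines need their own random function.

```python
import sqlite3

# Hypothetical table for illustration: items(id INTEGER PRIMARY KEY, name TEXT).
# One probe: draw a key in [1, max(id)] and take the first row at or above it.
# SQLite's random() returns a signed 64-bit integer, hence the abs(... % ...).
PROBE_SQL = """
    SELECT id, name
    FROM items
    WHERE id >= (SELECT 1 + abs(random() % max(id)) FROM items)
    ORDER BY id
    LIMIT 1
"""

def pick_random_rows(conn, n, max_probes=1000):
    """Collect up to n distinct rows by repeating the single-row probe."""
    picked = {}
    for _ in range(max_probes):
        if len(picked) >= n:
            break
        row = conn.execute(PROBE_SQL).fetchone()
        if row is not None:  # None only when the table is empty
            picked[row[0]] = row  # keyed by id, so repeated hits are ignored
    return list(picked.values())

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
# Ids 1..99 with every tenth id missing, to leave a few holes in the key space.
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 101) if i % 10 != 0],
)
rows = pick_random_rows(conn, 5)
print(rows)
```

A row that follows a hole is picked about twice as often as its neighbors, which is the bias the answer warns about. Repeating the probe from client code is used here only to keep the SQL short; it is not a single-query solution.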

Edit: Note that a comment on another answer here mentions that "pure SQL" may mean the ANSI SQL standard. If so, then there is no way, as there is no standardized random function, and not every database engine treats the random number function the same way. At least one engine I've seen "optimizes" the call by calling it once and simply repeating the computed value for all rows.

+5

Lasse Vågsæther Karlsen Dec 29. '09 at 1:25

Here's a potential solution that lets you balance the risk of getting fewer than N rows against a sampling bias toward the "front" of the table. This assumes that N is small compared to the size of the table:

-- N * 1.0 forces floating-point division; integer division would make the fraction 0
select * from table where random() < (N * 1.0 / (select count(1) from table)) limit N;

This will generally sample from most of the table, but may return fewer than N rows. If some bias is acceptable, the numerator can be changed from N to 1.5 * N or 2 * N so that it is very likely that N rows will be returned. Also, if you need to randomize the order of the rows, rather than just pick an arbitrary subset:

select * from (select * from table
                where random() < (N * 1.0 / (select count(1) from table)) limit N) as sample
 order by mod(tableid,1111);

The downside to this solution is that, at least in PostgreSQL, it uses a sequential table scan. A larger numerator will speed up the query, since the LIMIT is reached earlier in the scan.
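To make the trade-off concrete, here is a rough sketch of the same filter adapted to SQLite and run from Python against an in-memory database. The table `items`, the helper `sample_ids`, and the `factor` parameter (the multiplier applied to N in the numerator) are assumptions made for illustration, not part of the answer above. SQLite's `random()` returns an integer, so it is scaled into [0, 1) here, whereas PostgreSQL's `random()` already returns a float.

```python
import sqlite3

# SQLite adaptation (assumption): scale the integer random() into [0, 1).
# :factor * :n * 1.0 keeps the division in floating point.
SAMPLE_SQL = """
    SELECT id FROM items
    WHERE abs(random() % 1000000) / 1000000.0
          < (:factor * :n * 1.0 / (SELECT count(*) FROM items))
    LIMIT :n
"""

def sample_ids(conn, n, factor):
    """Return at most n ids; each row passes the filter independently."""
    params = {"n": n, "factor": factor}
    return [row[0] for row in conn.execute(SAMPLE_SQL, params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(i,) for i in range(1, 10001)])

# Numerator N: about N rows pass the filter, so fewer than N often come back.
exact = sample_ids(conn, 10, factor=1)
# Numerator 2 * N: a shortfall is much less likely, but LIMIT stops the scan
# early, so rows near the start of the scan are favored.
padded = sample_ids(conn, 10, factor=2)
print(len(exact), len(padded))
```

Running it a few times shows `exact` frequently falling short of 10 rows while `padded` rarely does, at the cost of the returned ids clustering toward the beginning of the table.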

+1

mtillberg 20 Mar At 15:50

This might help you:

SELECT TOP 3 * FROM TABLE ORDER BY NEWID()

-1

dipi evil Apr 17 13 at 19:18

Using the code below you can achieve what you are looking for.

select top 1 * from student1 order by newid()

Replace the 1 after TOP with N to get that many random rows.

-2

Mrityunjay Malliya 11 Apr '14 at 9:16

user12861 · Accepted Answer · 2008-12-30T18:56:39+0000

I don't know about pure ANSI, and it's not simple, but you can check my answer to a similar question here: "Simple random samples from Sql database"
