Fastest way to count the total and then list the recordset in MySQL

I have a SQL statement to select results from a table. I need to know the total number of records found and then list a subset of them (pagination).

Usually I would make 2 calls to SQL:

  • one to count the total number of records (using COUNT),
  • another to return a subset (using LIMIT).
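Concretely, the two calls might look like this (table and column names are made up for illustration):

```sql
-- Call 1: count all matching records
SELECT COUNT(*) FROM articles WHERE status = 'published';

-- Call 2: fetch one page of 20 records
SELECT id, title, created_at
FROM articles
WHERE status = 'published'      -- the same WHERE clause again
ORDER BY created_at DESC
LIMIT 20 OFFSET 40;             -- page 3 at 20 rows per page
```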

But this duplicates work in MySQL: the WHERE clause is the same in both calls.

Is there a faster way that avoids running the same selection twice in MySQL?

+2




7 replies


The first query will pull the data into the cache, so the second query should run fast. I wouldn't worry too much about it.



+2




You have to run both SQL queries; a COUNT with no WHERE clause is very fast. Cache the data as needed.



+1




You should run COUNT once and cache the result somewhere. Then you only need to run the pagination query itself.

0




If you really don't want to run the COUNT() query - and, as others have argued, it isn't something that slows things down noticeably - then you need to decide your chunk size (i.e. the LIMIT number) up front. That saves you the COUNT() query, but you may end up with awkward pagination results (e.g. 2 pages where the second page has only 1 result).

So the trade-off is: a quick COUNT() followed by a well-chosen LIMIT, or no COUNT() and an arbitrary LIMIT, which can increase the number of more expensive queries you have to execute.

0




You can try selecting just one field (say, the ids) and see if that helps, but I doubt it will - the biggest overhead is probably MySQL finding the matching rows in the first place.

If you just want to count the total number of rows in the whole table (i.e. with no WHERE clause), then I find SELECT COUNT(*) FROM table efficient enough.

Otherwise, if you need to show the total, the only solution is to count all matching rows. You can, however, cache that count in another table. If you are selecting from a category, for example, store the category's UID together with its total row count. Then, whenever you add or remove rows, update the cached total.
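A sketch of such a cache table, assuming a hypothetical items table with a category_id column (all names are illustrative):

```sql
-- Cached per-category totals
CREATE TABLE category_totals (
    category_id INT PRIMARY KEY,
    total_rows  INT NOT NULL
);

-- After inserting/deleting rows in a category, refresh its total:
REPLACE INTO category_totals (category_id, total_rows)
SELECT 42, COUNT(*) FROM items WHERE category_id = 42;

-- Pagination then reads the total cheaply:
SELECT total_rows FROM category_totals WHERE category_id = 42;
```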

Another option - although it might sacrifice usability a bit - is to select only the rows needed for the current page plus its neighbors. If any rows exist on the next page, show the Next link. Do the same for the previous page. If you have 20 rows per page, you select no more than 60 rows on each page load, and you never have to count all matching rows.
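For example, with 20 rows per page, the current page and both neighbors can be fetched in one query (hypothetical table; current page here is the third one):

```sql
-- Rows 20..79: previous page (20-39), current page (40-59), next page (60-79)
SELECT id, title
FROM articles
WHERE status = 'published'
ORDER BY created_at DESC
LIMIT 60 OFFSET 20;
-- If more than 40 rows come back, a next page exists;
-- if fewer than 21, there is no current page at this offset.
```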

0




If you write your query so that one column contains the count (repeated in every row), alongside the rest of the columns you would have fetched in the second query, you can:

  • avoid the second database round trip (which is probably more expensive than the query itself),
  • increase the likelihood that MySQL generates an execution plan that reuses the underlying scan,
  • make the operation atomic.

Unfortunately, it also introduces some redundancy, returning more data than you strictly need. But I would still expect it to be more efficient overall. This is the strategy many ORM products use when they eagerly load objects from tables related through one-to-many or many-to-many relationships.
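On MySQL 8.0+ this can be done with a window function, which attaches the full match count to every row of the page (table names are illustrative):

```sql
-- The window COUNT(*) is computed over the whole matching set,
-- before LIMIT is applied, so total_rows is the full count.
SELECT id, title,
       COUNT(*) OVER () AS total_rows
FROM articles
WHERE status = 'published'
ORDER BY created_at DESC
LIMIT 20 OFFSET 40;
```

Older code often used SQL_CALC_FOUND_ROWS together with FOUND_ROWS() for the same purpose, but that mechanism is deprecated as of MySQL 8.0.17.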

0




As others have pointed out, you probably don't need to worry too much in this case - as long as the filtered field is indexed, both selects will be very fast.

If you have (for whatever reason) a situation where this is not enough, you can create a memory-based temporary table (i.e. a temporary table using the MEMORY storage engine) and select your matching records into it. Then you can run your selects against the temporary table and be reasonably confident they will be fast. This can use a lot of memory (it forces all the matching data to stay in memory for the duration), so it's fairly unfriendly unless you're sure that:

  1. The amount of data is really small;
  2. You have so much memory that it doesn't matter; or
  3. The machine will be almost idle anyway.

The main time this comes in handy is when you have a really expensive select that can't avoid scanning an entire large table (or several), but yields only a small amount of data.
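A sketch of this approach (table and column names are made up; the filter stands in for whatever expensive condition applies):

```sql
-- Materialize the expensive result once, in memory
CREATE TEMPORARY TABLE tmp_results ENGINE=MEMORY AS
SELECT id, title, created_at
FROM articles
WHERE status = 'published';   -- placeholder for the expensive condition

-- Both follow-up queries now hit the small in-memory table
SELECT COUNT(*) FROM tmp_results;
SELECT * FROM tmp_results ORDER BY created_at DESC LIMIT 20 OFFSET 40;
```

Note that the MEMORY engine does not support TEXT or BLOB columns, so very wide rows may need to be trimmed to their id and sort columns first.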

0








