Optimizing MySQL Queries
I have the following SQL query:
select expr1, operator, expr2, count(*) as c
from log_keyword_fulltext
group by expr1, operator, expr2
order by c desc limit 2000;
Problem: the count(*) in my ORDER BY kills my app, possibly because it doesn't use an index. Is there a way to do this faster, for example a SELECT inside another SELECT, or something like that?
Here is the EXPLAIN output for my SELECT:
+----+-------------+----------------------+-------+---------------+-------+---------+------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------------+-------+---------------+-------+---------+------+--------+----------------------------------------------+
| 1 | SIMPLE | log_keyword_fulltext | index | NULL | expr1 | 208 | NULL | 110000 | Using index; Using temporary; Using filesort |
+----+-------------+----------------------+-------+---------------+-------+---------+------+--------+----------------------------------------------+
UPDATED:
I tried making a subquery like this:
select * from (select b.expr1,b.operator,b.expr2,count(*) as c
from log_keyword_fulltext b group by b.expr1,b.operator,b.expr2) x
order by x.c desc limit 2000;
It works, but it's not faster. Here is the EXPLAIN:
+----+-------------+------------+-------+---------------+-------+---------+------+--------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+-------+---------------+-------+---------+------+--------+----------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 38398 | Using filesort |
| 2 | DERIVED | b | index | NULL | expr1 | 208 | NULL | 110000 | Using index |
+----+-------------+------------+-------+---------------+-------+---------+------+--------+----------------+
This version avoids "Using temporary", but performance is still the same. Any recommendations?
Try counting a specific column instead of count(*): count(*) has to consider every column of every row, so it takes longer. For example:
select expr1, operator, expr2, count(expr1) as c
from log_keyword_fulltext
group by expr1, operator, expr2
order by c desc limit 2000;
What do you mean by "kills your app"? What's the context? How often do you run this query? What else happens in the database while this query is running? Does this particular result need to be real-time? What are the conditions (inserts/s, selects/s, db size, etc.)?
Here's what you could do:
1. Store the count in a separate table that you update with triggers on insert/delete.
2. If you can't get MySQL to do it in a single table scan with some magic, try a stored procedure that does something like this (pseudocode):

   CREATE TEMP TABLE t (e1 EXPR_T, op OP_T, e2 EXPR_T, count INTEGER)
   ADD AN INDEX ON count
   FOR EACH ROW (exp1, operator, exp2) OF SELECT expr1, operator, expr2 FROM log_blah DO
       UPDATE t SET count = count + 1 WHERE e1 = exp1 AND op = operator AND e2 = exp2
       IF NO ROW WAS UPDATED THEN
           INSERT INTO t VALUES (exp1, operator, exp2, 1)
   DONE
   SELECT * FROM t ORDER BY count DESC LIMIT 2000

Option 1 is probably what you want. And forget the indexes: this query has to scan the whole table anyway.
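Option 1 could be sketched in MySQL like this. This is only a sketch: the summary table name keyword_counts, the trigger names, and the column types are assumptions (the question doesn't give the schema), so adjust them to your table.

```sql
-- Hypothetical summary table: one row per (expr1, operator, expr2).
-- Column types are guesses based on the key_len in the EXPLAIN output.
CREATE TABLE keyword_counts (
    expr1    VARCHAR(100) NOT NULL,
    operator VARCHAR(10)  NOT NULL,
    expr2    VARCHAR(100) NOT NULL,
    c        INT          NOT NULL DEFAULT 0,
    PRIMARY KEY (expr1, operator, expr2),
    KEY idx_c (c)
);

DELIMITER //

-- Bump the count on every insert into the log table.
CREATE TRIGGER trg_log_insert AFTER INSERT ON log_keyword_fulltext
FOR EACH ROW
BEGIN
    INSERT INTO keyword_counts (expr1, operator, expr2, c)
    VALUES (NEW.expr1, NEW.operator, NEW.expr2, 1)
    ON DUPLICATE KEY UPDATE c = c + 1;
END//

-- Decrement on delete.
CREATE TRIGGER trg_log_delete AFTER DELETE ON log_keyword_fulltext
FOR EACH ROW
BEGIN
    UPDATE keyword_counts SET c = c - 1
    WHERE expr1 = OLD.expr1 AND operator = OLD.operator AND expr2 = OLD.expr2;
END//

DELIMITER ;

-- The heavy GROUP BY then becomes a cheap indexed read:
SELECT expr1, operator, expr2, c
FROM keyword_counts
ORDER BY c DESC LIMIT 2000;
```

The trade-off is exactly the one described above: every insert/delete on the log table pays a small extra cost so the read becomes nearly free.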
The best way to prevent table scans is to add a covering index for the fields you access regularly. There is a one-time cost to create the index, plus extra overhead on INSERT and DELETE operations so the index can be kept up to date.
Covering indexes save the database from reading the entire record into memory just to get at the few fields you care about: the whole query can be answered from the index alone.
ALTER TABLE log_keyword_fulltext
ADD INDEX idx_name
(expr1, operator, expr2)
If these are not actual columns but expressions over them, such as LEFT(foo, 20), you can index just the prefix of the column that your future SELECT or WHERE clauses will use.
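For instance, a prefix index on the first 20 characters would be declared like this (the length 20 and the index name are illustrative assumptions, not taken from the question):

```sql
-- Index only the first 20 characters of expr1 (prefix length is an assumption).
ALTER TABLE log_keyword_fulltext
  ADD INDEX idx_expr1_prefix (expr1(20));
```

Note that a prefix index cannot serve as a covering index, since MySQL still has to read the row to get the full column value.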
See this page for other optimization ideas: http://dev.mysql.com/doc/refman/5.0/en/query-speed.html
Counting and sorting by the count in one query is going to be a killer. I would suggest creating a temporary table with the counts and then running the SELECT ... ORDER BY against that.
Not sure if this works in MySQL, but in PostgreSQL or Oracle it would be:
create foo as
select expr1, operator, expr2, count(*) as c
from log_keyword_fulltext
group by expr1, operator, expr2;
select * from foo order by c desc limit 2000;
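The same idea does work in MySQL with a temporary table. A sketch, assuming the schema from the question (the index name idx_c is hypothetical):

```sql
-- Materialize the counts once, session-local.
CREATE TEMPORARY TABLE foo AS
SELECT expr1, operator, expr2, COUNT(*) AS c
FROM log_keyword_fulltext
GROUP BY expr1, operator, expr2;

-- Index the count so the final ORDER BY ... LIMIT avoids a filesort.
ALTER TABLE foo ADD INDEX idx_c (c);

SELECT * FROM foo ORDER BY c DESC LIMIT 2000;
```

The temporary table is dropped automatically when the connection closes, so this is only worthwhile if you read the result more than once per session.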
Also, all of the counts have to be computed before the sort, so the LIMIT clause doesn't save you from doing any of that work.