Is it possible to merge these (long) MySQL queries?
Okay, my dilemma is this: I have an admin page that I use to get a summary of daily views on one of my sites. This summary page retrieves data from several tables: some store the actual media data, others store URLs, IDs and website owners, and another table stores daily statistics used to calculate in/out ratios for each domain.
This admin page contains a basic query that pulls in the actual media, which looks like this (simplified):
SELECT
content.con_catid,
content.con_posttime,
content.con_url,
content.con_id,
content.con_title,
categories.name,
users.username,
users.user_id
FROM content
LEFT JOIN categories
ON (content.con_catid = categories.id)
LEFT JOIN users
ON (content.con_poster = users.user_id)
WHERE con_status = 0 ORDER BY con_posttime
Pretty straightforward. Then this is where it gets messy: every time the loop runs mysql_fetch_array() (to output each content item), I also run these two queries:
SELECT count(con_id) as pending FROM content WHERE con_poster = '$this_userid' AND con_status = 0
SELECT count(con_id) as approved FROM content WHERE con_poster = '$this_userid' AND con_status = 2
They get the number of pending and approved items for the user who owns each view (these counts are listed next to each view).
And then there is the killer that adds a lot of time to the page's execution time, since it also has to be repeated for every row of the original query:
SELECT website_url,
website_id,
website_shady,
COALESCE(sum(dstats_hits),0)+website_in_ballast as total_in,
COALESCE(sum(dstats_hits_out),0)+website_out_ballast as total_out,
sum(dstats_hits_out_paid) as total_paid,
website_in_ballast,
website_out_ballast
FROM websites
LEFT JOIN domain_stats ON (websites.website_id = domain_stats.dstats_domid)
WHERE website_id IN (SELECT website_id FROM websites WHERE website_userid = $this_userid)
GROUP BY website_url
Is there any way to combine the last three queries into the first one? Because right now, if I have 100 items in the queue, that's 301 queries (401 if you count the subquery), which takes a while to generate the page.
Here's another attempt at doing it in one giant query. I don't know how fast it will be, though...
Edit: OK, third try. It should actually be pretty fast if you set up the right indexes.
SELECT
content.con_catid,
content.con_posttime,
content.con_url,
content.con_id,
content.con_title,
categories.name,
users.username,
users.user_id,
stats1.website_url,
websites.website_id,
websites.website_shady,
websites.website_in_ballast,
websites.website_out_ballast,
(SELECT COALESCE(sum(dstats_hits),0)+websites.website_in_ballast FROM domain_stats WHERE websites.website_id = domain_stats.dstats_domid) as total_in,
(SELECT COALESCE(sum(dstats_hits_out),0)+website_out_ballast FROM domain_stats WHERE websites.website_id = domain_stats.dstats_domid) as total_out,
(SELECT sum(dstats_hits_out_paid) FROM domain_stats WHERE websites.website_id = domain_stats.dstats_domid) as total_paid,
(SELECT count(c2.con_id) FROM content c2 WHERE c2.con_poster = users.user_id AND c2.con_status = 0) as pending,
(SELECT count(c2.con_id) FROM content c2 WHERE c2.con_poster = users.user_id AND c2.con_status = 2) as approved
FROM
content
LEFT JOIN categories ON (content.con_catid = categories.id)
LEFT JOIN users ON (content.con_poster = users.user_id)
LEFT JOIN websites ON (website_userid=users.user_id)
WHERE
con_status = 0
ORDER BY
con_posttime
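The idea behind the query above — moving each per-user statistic into a correlated scalar subquery in the SELECT list — can be sketched with Python's sqlite3 module. This is a toy fixture, not the asker's schema: the table and column names come from the question, but the data is made up, and only the pending/approved subqueries are shown.

```python
import sqlite3

# In-memory toy fixture (hypothetical data; names from the question).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content (con_id INTEGER, con_poster INTEGER, con_status INTEGER)"
)
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?)",
    [(1, 7, 0), (2, 7, 2), (3, 7, 2), (4, 8, 0)],  # user 7: 1 pending, 2 approved
)

# Each statistic is a correlated scalar subquery in the SELECT list, so the
# whole page needs one round trip instead of 2 extra queries per row.
rows = conn.execute(
    """
    SELECT con_id,
           (SELECT count(*) FROM content c2
             WHERE c2.con_poster = content.con_poster AND c2.con_status = 0) AS pending,
           (SELECT count(*) FROM content c2
             WHERE c2.con_poster = content.con_poster AND c2.con_status = 2) AS approved
    FROM content
    WHERE con_status = 0
    ORDER BY con_id
    """
).fetchall()
print(rows)  # [(1, 1, 2), (4, 1, 0)]
```

The same pattern extends to the website-stats subqueries: each one is evaluated once per output row inside the single query, rather than as a separate client round trip.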
Hmm... the first query just fetches content items, while the other queries fetch statistics about them. The first query does not use any aggregate functions, but all the others do. So the only way to join them is with subqueries. This won't make the execution plan much better, but it will save you a couple hundred queries and result-set generations, so it should be somewhat faster.
One tip: the middle two queries can be combined like this:
SELECT
count(case when con_status=0 then 1 else null end) as pending,
count(case when con_status=2 then 1 else null end) as approved
FROM
content
WHERE
con_poster = '$this_userid'
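This conditional-aggregation trick relies on COUNT() ignoring NULLs. Here is a minimal runnable sketch using Python's sqlite3 with a toy fixture (table and column names from the question, data invented):

```python
import sqlite3

# Toy "content" table with just the columns the combined query needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content (con_id INTEGER, con_poster INTEGER, con_status INTEGER)"
)
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?)",
    [(1, 7, 0), (2, 7, 0), (3, 7, 2), (4, 8, 0)],  # user 7: 2 pending, 1 approved
)

# A CASE with no ELSE yields NULL for non-matching rows, and COUNT()
# skips NULLs, so both per-status counts come from a single table scan.
row = conn.execute(
    """
    SELECT
      count(CASE WHEN con_status = 0 THEN 1 END) AS pending,
      count(CASE WHEN con_status = 2 THEN 1 END) AS approved
    FROM content
    WHERE con_poster = ?
    """,
    (7,),
).fetchone()
print(row)  # (2, 1)
```

One query per user instead of two; combined with the subquery approach above, it disappears into the main query entirely.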
Joining the content table again on the same user_id lets you compute the counts as well.
SELECT
content.con_catid,
content.con_posttime,
content.con_url,
content.con_id,
content.con_title,
categories.name,
users.username,
users.user_id,
IFNULL(SUM(CASE WHEN c2.con_status = 0 THEN 1 ELSE 0 END), 0) as pending,
IFNULL(SUM(CASE WHEN c2.con_status = 2 THEN 1 ELSE 0 END), 0) as approved
FROM content
LEFT JOIN categories ON (content.con_catid = categories.id)
LEFT JOIN users ON (content.con_poster = users.user_id)
LEFT JOIN content as c2 ON (users.user_id = c2.con_poster)
WHERE content.con_status = 0
GROUP BY content.con_id
ORDER BY con_posttime
This has not been tested. You might also be able to double-join the content table and use COUNT(DISTINCT c2.con_id) AS pending and COUNT(DISTINCT c3.con_id) AS approved; if the CASE expression slows things down, move c2.con_status = 0 into the join condition (for the pending rows).
The IFNULL is there just in case a user has no rows at all (which will never actually happen here, since the main content row always exists; you could exclude the current row in the JOIN condition if you don't want it counted).
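The self-join-plus-GROUP-BY variant can be checked against a toy fixture with Python's sqlite3 (schema names from the question, data invented; the categories join is dropped to keep the sketch short):

```python
import sqlite3

# Toy fixture: two users, three items for user 7, one for user 8.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content (con_id INTEGER, con_poster INTEGER, con_status INTEGER)"
)
conn.execute("CREATE TABLE users (user_id INTEGER, username TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(7, "alice"), (8, "bob")])
conn.executemany(
    "INSERT INTO content VALUES (?, ?, ?)",
    [(1, 7, 0), (2, 7, 0), (3, 7, 2), (4, 8, 0)],
)

# Each pending row joins to ALL of its poster's content rows (c2); grouping
# by the main row's id makes SUM(CASE ...) count c2 rows per status.
rows = conn.execute(
    """
    SELECT content.con_id, users.username,
           IFNULL(SUM(CASE WHEN c2.con_status = 0 THEN 1 ELSE 0 END), 0) AS pending,
           IFNULL(SUM(CASE WHEN c2.con_status = 2 THEN 1 ELSE 0 END), 0) AS approved
    FROM content
    LEFT JOIN users ON (content.con_poster = users.user_id)
    LEFT JOIN content AS c2 ON (users.user_id = c2.con_poster)
    WHERE content.con_status = 0
    GROUP BY content.con_id
    ORDER BY content.con_id
    """
).fetchall()
print(rows)  # [(1, 'alice', 2, 1), (2, 'alice', 2, 1), (4, 'bob', 1, 0)]
```

Note that the WHERE clause must qualify content.con_status once the table appears twice, or the column reference is ambiguous.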
For the last query, I would fetch the information for all the users you process in one go: collect all the IDs, then query with user_id IN (...) or a subquery (depending on how you build your selection).
Then sort and process the data in code. This minimizes the number of queries.
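A minimal sketch of that batching approach with Python's sqlite3, again on a toy fixture (table and column names from the question, data invented): one IN (...) query replaces the per-row stats query, and the rows are regrouped per user in code.

```python
import sqlite3
from collections import defaultdict

# Toy fixture for the per-website stats tables.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE websites (website_id INTEGER, website_userid INTEGER, website_url TEXT)"
)
conn.execute(
    "CREATE TABLE domain_stats (dstats_domid INTEGER, dstats_hits INTEGER, dstats_hits_out INTEGER)"
)
conn.executemany(
    "INSERT INTO websites VALUES (?, ?, ?)",
    [(1, 7, "a.example"), (2, 7, "b.example"), (3, 8, "c.example")],
)
conn.executemany(
    "INSERT INTO domain_stats VALUES (?, ?, ?)",
    [(1, 10, 5), (1, 20, 5), (3, 1, 1)],
)

user_ids = [7, 8]  # collected while looping over the main result set

# One round trip for every user's website stats instead of one per row.
placeholders = ",".join("?" * len(user_ids))
rows = conn.execute(
    f"""
    SELECT website_userid, website_url,
           COALESCE(SUM(dstats_hits), 0)     AS total_in,
           COALESCE(SUM(dstats_hits_out), 0) AS total_out
    FROM websites
    LEFT JOIN domain_stats ON (websites.website_id = domain_stats.dstats_domid)
    WHERE website_userid IN ({placeholders})
    GROUP BY website_userid, website_url
    """,
    user_ids,
).fetchall()

# Regroup in code so each user's stats can be looked up while rendering.
stats_by_user = defaultdict(list)
for userid, url, total_in, total_out in rows:
    stats_by_user[userid].append((url, total_in, total_out))
```

With this, the page does two or three queries total, regardless of how many items are in the queue.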