Drill-down search with a total for each item, in PHP with MySQL

What's the best way to build a "drill-down" system like the tag search on Stack Overflow? If I click on "php", for example, it shows the number of matching items for every other tag, and it does so very quickly. How can I do this in PHP using MySQL?

+2




2 answers


This can be done with a query that looks like this:

SELECT T2.Tag, COUNT(*)
FROM SO_Posts P
JOIN Post_Tags T1 ON P.PostId = T1.PostId
JOIN Post_Tags T2 ON P.PostId = T2.PostId
WHERE T1.Tag = 'PHP'
GROUP BY T2.Tag
ORDER BY COUNT(*) DESC

      

This query makes the plausible assumption that posts (questions) on SO are stored across two tables:
 SO_Posts, containing one row per post, with information such as PostId (primary key), the question itself, date, title, etc.
and
 Post_Tags, which associates a given post (by its PostId) with a tag (or rather a TagId, since tags should be normalized, but that's a detail). For a given post, there are as many rows in Post_Tags as there are tags attached to the post.
Note: The real SO database structure is more complex, with different tables to store comments, answers, etc. As far as the post-to-tag relationship is concerned, though, this two-table layout (or more likely a three-table layout, so that Post_Tags holds a TagId rather than the tag itself) captures the gist of how these filtered aggregate counts can be displayed easily and quickly, given the correct indexes.



The idea is to find all PostIds associated with the target tag (here "PHP", looked up through "T1"), then collect every tag attached to those posts (through "T2") and count the posts per tag.

Note that the main SO_Posts table is not strictly needed here, since Post_Tags could be joined to itself directly, but it will probably be part of the query anyway, for example to add extra criteria such as post status (not closed, etc.).
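
To make the self-join concrete, here is a minimal runnable sketch. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for PHP and MySQL; the SQL itself is the part that carries over. The sample rows, the index name idx_tag_post and the helper related_tag_counts are made up for illustration, and the sketch joins Post_Tags to itself directly, leaving SO_Posts out.

```python
import sqlite3

def related_tag_counts(conn, tag):
    """For every tag, count the posts that also carry `tag`."""
    sql = """
        SELECT T2.Tag, COUNT(*) AS PostCount
        FROM Post_Tags T1
        JOIN Post_Tags T2 ON T2.PostId = T1.PostId
        WHERE T1.Tag = ?
        GROUP BY T2.Tag
        ORDER BY PostCount DESC, T2.Tag
    """
    # The tag is bound as a parameter rather than pasted into the SQL string.
    return conn.execute(sql, (tag,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Post_Tags (PostId INTEGER, Tag TEXT, PRIMARY KEY (PostId, Tag))"
)
# Hypothetical index: lets the T1 lookup by tag avoid a full table scan.
conn.execute("CREATE INDEX idx_tag_post ON Post_Tags (Tag, PostId)")
conn.executemany(
    "INSERT INTO Post_Tags VALUES (?, ?)",
    [
        (1, "PHP"), (1, "mysql"),
        (2, "PHP"), (2, "mysql"), (2, "pdo"),
        (3, "PHP"),
        (4, "python"), (4, "mysql"),
    ],
)

counts = related_tag_counts(conn, "PHP")
for tag_name, post_count in counts:
    print(tag_name, post_count)
```

Post 4 is tagged "mysql" but not "PHP", so it does not contribute to the "mysql" total. The selected tag itself also appears in the result with its own total; add AND T2.Tag <> T1.Tag to the WHERE clause if you only want the other tags.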

+3




I would guess they are just running a simple select count(*) from questions where tag = $tagname

whose result is cached in memcached. (The caching is the important part.)



As the commenter pointed out, they could also just keep track of the number of posts per tag in a separate table. But we can't be sure; all we can do is guess. Either approach works, but the key is that you have to benchmark your own application to see which one works best for you. For all we know, the tag counts are not real time and come from a table that a cron job updates hourly or so.
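
As a rough sketch of the "separate table" idea, again using Python's built-in sqlite3 as a stand-in for PHP and MySQL: a summary table is rebuilt by a function that a cron job could call, and the page only ever does a primary-key lookup. The table name Tag_Counts and the helpers refresh_tag_counts and tag_count are hypothetical, and nothing here is known to match what Stack Overflow actually does.

```python
import sqlite3

def refresh_tag_counts(conn):
    """Rebuild the summary table; a cron job could call this periodically."""
    with conn:  # delete and re-insert are committed together as one transaction
        conn.execute("DELETE FROM Tag_Counts")
        conn.execute(
            "INSERT INTO Tag_Counts (Tag, PostCount) "
            "SELECT Tag, COUNT(*) FROM Post_Tags GROUP BY Tag"
        )

def tag_count(conn, tag):
    """Cheap read path: a single primary-key lookup, no aggregation."""
    row = conn.execute(
        "SELECT PostCount FROM Tag_Counts WHERE Tag = ?", (tag,)
    ).fetchone()
    return row[0] if row else 0

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Post_Tags (PostId INTEGER, Tag TEXT, PRIMARY KEY (PostId, Tag))"
)
conn.execute(
    "CREATE TABLE Tag_Counts (Tag TEXT PRIMARY KEY, PostCount INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO Post_Tags VALUES (?, ?)",
    [(1, "PHP"), (1, "mysql"), (2, "PHP"), (3, "PHP"), (4, "mysql")],
)

refresh_tag_counts(conn)
php_before = tag_count(conn, "PHP")

# A new post arrives; the summary stays stale until the next refresh.
conn.execute("INSERT INTO Post_Tags VALUES (5, 'PHP')")
php_stale = tag_count(conn, "PHP")

refresh_tag_counts(conn)
php_after = tag_count(conn, "PHP")
print(php_before, php_stale, php_after)
```

The trade-off is the one described above: reads become trivial, but the counts are only as fresh as the last refresh. Caching the query result in memcached with an expiry time gives the same kind of staleness without the extra table.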

0

