SQL: most common meaning for a single identifier

My table named GameEvent is displayed like this:

+---------------+-----------+
| sessionNumber | colorName |
+---------------+-----------+
|             1 | Green     |
|             1 | Red       |
|             1 | Green     |
|             1 | Blue      |
|             2 | Blue      |
+---------------+-----------+

      

I want to make a request that will return sessionNumber and the most common color like sessionNumer 1 -> Green, sessionNumber 2 -> Blue. How to do it?

+3


source to share


2 answers


You start with a basic aggregation query to get the counts for sessions and colors:

select sessionNumber, colorName, count(*) as cnt
from GameEvent ge
group by sessionNumber, colorName;

      

Then you really want MySQL to support the ANSI window functionality - the only main database that doesn't row_number()

. But alas, this functionality is not available. So, you have three options:

  • String aggregation tricks.
  • Variables.
  • Complex SQL.


The first method is shown below:

select sessionNumber,
       substring_index(group_concat(colorName order by cnt desc), ',', 1) as MostCommonColor
from (select sessionNumber, colorName, count(*) as cnt
      from GameEvent ge
      group by sessionNumber, colorName
     ) sc
group by sessionNumber;

      

Please note that there are limitations for this request. Intermediate results from group_concat()

have a maximum length (controlled by a parameter that can be changed). Also, if the color names contain commas, you would like to use a different character.

+5


source


What you can do is select the number of each pair using a sentence GROUP BY

and a function COUNT()

like this:

SELECT sessionNumber, colorName, COUNT(*) AS numOccurrences
FROM myTable
GROUP BY sessionNumber, colorName;

      

After that, you will need to find the largest number of occurrences for each session number. This can be achieved by reusing aggregation:

SELECT sessionNumber, MAX(numOccurrences) AS largestCount
FROM(
  SELECT sessionNumber, colorName, COUNT(*) AS numOccurrences
  FROM myTable
  GROUP BY sessionNumber, colorName) tmp
GROUP BY sessionNumber;

      



The last thing you need to do is join the first subquery, assuming the session number and counter are the same, and you will get the required session number and color:

SELECT t1.sessionNumber, t1.colorName
FROM(
  SELECT sessionNumber, colorName, COUNT(*) AS numOccurrences
  FROM myTable
  GROUP BY sessionNumber, colorName) t1
JOIN(
  SELECT sessionNumber, MAX(numOccurrences) AS largestCount
  FROM(
    SELECT sessionNumber, colorName, COUNT(*) AS numOccurrences
    FROM myTable
    GROUP BY sessionNumber, colorName) tmp
  GROUP BY sessionNumber) t2
ON t2.sessionNumber = t1.sessionNumber AND t2.largestCount = t1.numOccurrences;

      

Here is a SQL Fiddle example that shows all three steps. It is important to note that you cannot use LIMIT here, because this clause limits your entire result set, not each group separately.

+3


source







All Articles