Get the top and bottom 25th percentile

I have a table listing employees and the number of units they sold.

I want to get 25 percent Avg units sold, and 25 percent Avg units sold and sold.

I created my data view SLQ Fiddle

I really don't know how to start with this? All examples I see are for SQL Server, not MySQL. That's what I think.

I want 25 percent and can't limit 25 points. This will mainly include:

1) #_of_employees = The number of total employees. 

2) #_of_employees_in_25_percentile =  #_of_employees*0.25 

3) Calculate the sum of the units sold by the top/bottom 25 percentile (limit #_of_employees_in_25_percentile)

4) Divide the sum by #_of_employees_in_25_percentile to get the average.

      

How can all of this be done efficiently in MySQL?

+1


source to share


3 answers


This is a solution that uses a tricky trick I learned from this question .

SELECT id, unit_sold, n * 100 / @total AS percentile
FROM (
  SELECT id, unit_sold, @total := @total + unit_sold AS n
  FROM mydata, (SELECT @total := 0) AS total
  ORDER BY unit_sold ASC
) AS t

      



SQL Fiddle.

+2


source


How about going with this logic:

  • Select all, order by percentile (DESC), limit to 25
  • Select all, order by percentile (ASC), limit to 25

Is this the type of logic you are looking for?



Examples of requests:

$q1 = mysql_query(SELECT * FROM table_name ORDER BY percentile DESC LIMIT 25)
$q2 = mysql_query(SELECT * FROM table_name ORDER BY percentile ASC LIMIT 25)

      

0


source


How about this?

SELECT
  SUM(unit_sold) AS sum_tot, SUM(unit_sold)/count(id) AS average,
  SUM(CASE WHEN percentile<25 THEN unit_sold ELSE 0 END) AS sum_top25,
  SUM(CASE WHEN percentile<25 THEN 1 ELSE 0 END) AS count_top25,
  SUM(CASE WHEN percentile<25 THEN unit_sold ELSE 0 END)/SUM(CASE WHEN percentile<25 THEN 1 ELSE 0 END) AS average_top25,
  SUM(CASE WHEN percentile>75 THEN unit_sold ELSE 0 END) AS sum_bottom25,
  SUM(CASE WHEN percentile>75 THEN 1 ELSE 0 END) AS count_bottom25,
  SUM(CASE WHEN percentile>75 THEN unit_sold ELSE 0 END)/SUM(CASE WHEN percentile>75 THEN 1 ELSE 0 END) AS average_bottom25
FROM
(SELECT
  id, unit_sold, c * 100 / @counter AS percentile
FROM
   (SELECT
      m.*, @counter:=@counter+1 AS c 
   FROM
     (SELECT @counter:=0) AS initvar, mydata AS m
   ORDER BY unit_sold desc
   ) AS t
WHERE
  c <= (25/100 * @counter)
  OR c >= (75/100 * @counter)
) AS t2

      

Output:

SUM_TOT AVERAGE     SUM_TOP25   COUNT_TOP25 AVERAGE_TOP25 SUM_BOTTOM25  COUNT_BOTTOM25  AVERAGE_BOTTOM25
850     283.3333    500         1            500          350           2               175

      

See SQL Fiddle .

The idea is to use MySQL: LIMIT on a percentage of the number of records? to get the percentiles. Based on this (and pdw's answer ), we create an output where we only show 25% and bottom 75%.

Finally, we count and sum to get the requested values.


Note that this runs on top of the command:

SELECT
  id, unit_sold, c * 100 / @counter AS percentile
FROM
   (SELECT
      m.*, @counter:=@counter+1 AS c 
   FROM
     (SELECT @counter:=0) AS initvar, mydata AS m
   ORDER BY unit_sold desc
   ) AS t
WHERE
  c <= (25/100 * @counter)
  OR c >= (75/100 * @counter)

      

Whose output:

ID  UNIT_SOLD   PERCENTILE
d   500         20
a   250         80
e   100         100

      

0


source







All Articles