Weighted average

I have an existing web application that allows users to "grade" items based on their difficulty. (0 to 15). Currently I am just taking the average for each user opinion and presenting the average from MySQL. However, it is becoming clear to me (and my users) that weighting the numbers would be more appropriate.

Oddly enough, the few hours I spent at Google didn't grow much. I found two articles that showed the site's rating systems based on "Bayesian filters" (which I partially understand). Here's one example:

Formula:

WR = (V / (V + M)) * R + (M / (V + M)) * C

Where:

* WR=Weighted Rating (The new rating)
* R=Average Rating (arithmetic mean) so far
* V=Number of ratings given
* M=Minimum number of ratings needed
* C=Arithmetic mean rating across the whole site

      

I love the idea of ​​increasing the weighting based on the total votes for an item ... however, since the difficulty levels on my site can vary dramatically from item to item, assuming "C" (site-wide average) is not valid.

So, to repeat my question:

Using MySQL, PHP, or both, I am trying to get the arithmetic mean:

(5 + 5 + 4)/3 = 4.67 (rounded)

      

... to the weighted average:

rating  / weight
5 / 2 (since it was given 2 times)
5 / 2
4 / 1

(sum[(rate * weight)])/(sum of weights)
(5 * 2) + (5 * 2) + (4 * 1) / (2 + 2 + 1)
(24)/(5)
= 4.8

      

+1


source to share


2 answers


This is a simple example of how to do this directly in MySQL. You will, of course, need to add a condition to the subquery to get only votes for the item in question, not all votes.



mysql> create table votes (vote int);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into votes values ​​(5), (5), (4);
Query OK, 3 row affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0

mysql> select * from votes;
+ ------ +
| vote |
+ ------ +
| 5 |
| 5 |
| 4 |
+ ------ +
3 rows in set (0.00 sec)

mysql> select vote, count (vote), vote * count (vote) from votes group by vote;
+ ------ + ------------- + ------------------ +
| vote | count (vote) | vote * count (vote) |
+ ------ + ------------- + ------------------ +
| 4 | 1 | 4 |
| 5 | 4 | 20 |
+ ------ + ------------- + ------------------ +
2 rows in set (0.00 sec)

mysql> select sum (vt) / sum (cnt) FROM (select 
count (vote) * count (vote) as cnt, vote * count (vote) * count (vote) 
as vt from votes group by vote) a;
+ ------------------ +
| sum (vt) / sum (cnt) |
+ ------------------ +
| 4.8000 |
+ ------------------ +
1 row in set (0.00 sec)


+4


source


What made it clear that weighing would be more appropriate? What do you see in the arithmetic value that is not useful to you? I am curious because it seems that the answer you are looking for may not always suit your needs. (Also, a 16-point scale tends to be much larger than what most people need; people rarely distinguish so many points and tend to group their answers around a select group of responses.)

The concept you are associated with pulls the average towards the site average; your meaning is simply pulling itself towards the most frequent answer. Typically, if you are using an average and want you to rate the responses, you would do so based on something about the respondents (giving more weight to the responses from more knowledgeable people, people who frequent the site, or other such things).

You might also consider using calculations other than average scores, maybe the top-N-box percentage (percentage of respondents giving N difficulty ratings).

Otherwise the formula for your average means sum (answer * counter * invoice) / sum (counter * invoice) ...



select sum(response*ct*ct)/sum(ct*ct) from
( select response, count(response) as ct from your_table group by response) data

      

Apologies if the syntax is not accurate, I don't have MySQL at work.

Note that you may need to convert sums from ints to floats; not sure exactly how this works in MySQL. In SQL Server, you must specify one of the sums so that it understands that you do not need a cumulative average.

0


source







All Articles