Using the kappa coefficient to evaluate crowdsourced labelling results

I have 4 sets of data manually tagged with 0 and 1 by 4 different people. I need to produce a single final set of 0/1 labels from these 4 manually tagged datasets. I calculated the pairwise agreement between annotators as AB: 0.3276, AC: 0.3263, AD: 0.4917, BC: 0.2896, BD: 0.4052, CD: 0.3540.

I don't know how to use these values to produce the final combined dataset. Please help.



1 answer


The kappa coefficient only works for a pair of annotators. For more than two, you need an extension. One popular option is the extension proposed by Light in 1971, which computes kappa for every pair of annotators and averages the results; another is to use the average expected agreement over all annotator pairs, as suggested by Davies and Fleiss in 1982. I don't know of any readily available calculator that will compute these for you, so you may have to implement the code yourself.
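For illustration, here is a minimal Python sketch of the averaging approach: compute Cohen's kappa for every annotator pair and take the mean. The function names and the 0/1 label lists are hypothetical placeholders, not your actual data:

```python
from itertools import combinations

def cohen_kappa(a, b):
    """Cohen's kappa for two annotators with binary (0/1) labels."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n      # observed agreement
    pa1 = sum(a) / n                                # P(annotator 1 labels 1)
    pb1 = sum(b) / n                                # P(annotator 2 labels 1)
    pe = pa1 * pb1 + (1 - pa1) * (1 - pb1)          # chance agreement
    return (po - pe) / (1 - pe)

def light_kappa(annotations):
    """Light's (1971) multi-annotator kappa: mean of all pairwise kappas."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)

# Hypothetical example: 4 annotators (A, B, C, D) labelling 8 items
A = [0, 1, 1, 0, 1, 0, 0, 1]
B = [0, 1, 0, 0, 1, 0, 1, 1]
C = [1, 1, 1, 0, 1, 0, 0, 0]
D = [0, 1, 1, 0, 1, 1, 0, 1]
print(light_kappa([A, B, C, D]))
```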

There is a Wikipedia page on Fleiss' kappa that you might find helpful.
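If you go the Fleiss route, the formula on that page is straightforward to implement. Below is a rough sketch; the input format (one row per item, one column per category, holding the count of annotators who chose that category) is an assumption for the example, and the counts shown are made up:

```python
def fleiss_kappa(counts):
    """
    Fleiss' kappa.
    counts[i][j] = number of annotators who assigned item i to category j.
    Every item must be rated by the same number of annotators.
    """
    N = len(counts)        # number of items
    n = sum(counts[0])     # annotators per item
    k = len(counts[0])     # number of categories

    # Overall proportion of assignments to each category
    p = [sum(row[j] for row in counts) / (N * n) for j in range(k)]

    # Per-item agreement
    P_i = [(sum(c * c for c in row) - n) / (n * (n - 1)) for row in counts]

    P_bar = sum(P_i) / N               # mean observed agreement
    P_e = sum(pj * pj for pj in p)     # expected agreement by chance
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical example: 4 annotators, binary labels (columns = [count of 0s, count of 1s])
counts = [
    [4, 0],   # all four annotators said 0
    [1, 3],   # one said 0, three said 1
    [2, 2],
    [0, 4],
    [3, 1],
]
print(fleiss_kappa(counts))
```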



These methods can only be used for nominal variables. If your data is not on a nominal scale, use another measure, such as the intraclass correlation coefficient.
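As a rough illustration of that alternative, here is a one-way random-effects ICC, ICC(1,1), computed from the usual ANOVA mean squares; the ratings matrix below is hypothetical:

```python
def icc_oneway(ratings):
    """
    One-way random-effects intraclass correlation, ICC(1,1).
    ratings[i][j] = score given to item i by rater j (same number of raters per item).
    """
    n = len(ratings)        # number of items
    k = len(ratings[0])     # raters per item
    grand = sum(sum(row) for row in ratings) / (n * k)
    means = [sum(row) / k for row in ratings]

    # Between-item and within-item mean squares
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for row, m in zip(ratings, means) for x in row) / (n * (k - 1))

    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical example: 4 raters scoring 5 items on a 1-5 scale
ratings = [
    [4, 4, 3, 4],
    [2, 1, 2, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 1],
]
print(icc_oneway(ratings))
```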


