How to Calculate Average Precision and Ranking for a CBIR System

I have implemented a basic CBIR system using RGB histograms. Now I am trying to generate average precision and ranking curves. Is my formula for average precision correct, and how do I calculate the average ranking?

Code:
% Dir: parent directory location for images folder c1, c2, c3
% inputImage: \c1\1.ppm
% For example, to get the P-R curve, run: demoCBIR('D:\visionImages','\c2\1.ppm');
function [  ] = demoCBIR( Dir,inputImage)
% Dir='D:\visionImages';
% inputImage='\c3\1.ppm';
tic;
S=strcat(Dir,inputImage);
Inp1=imread(S);
num_red_bins = 8;
num_green_bins = 8;
num_blue_bins = 8;
num_bins = num_red_bins*num_green_bins*num_blue_bins;

A = imcolourhist(Inp1, num_red_bins, num_green_bins, num_blue_bins);%input image histogram
srcFiles = dir(strcat(Dir,'\*.jpg'));  
B = zeros(num_bins, 100); % histograms of the other 100 images in category 1
ptr=1;
for i = 1 : length(srcFiles)
    filename = strcat(Dir,'\',srcFiles(i).name);
    I = imread(filename);% read image
    B(:,ptr) = imcolourhist(I, num_red_bins, num_green_bins, num_blue_bins); 
    ptr=ptr+1;                                                   
end

%normal histogram intersection
a = size(A,2); b = size(B,2); 
K = zeros(a, b);
for i = 1:a
  Va = repmat(A(:,i),1,b);
  K(i,:) = 0.5*sum(Va + B - abs(Va - B));
end


  sims=K;
  relevant_IDs = 1:100; % IDs of the 100 relevant images for dir 1

 num_relevant_images = numel(relevant_IDs);

 [sorted_sims, locs] = sort(sims, 'descend');
 locations_final = arrayfun(@(x) find(locs == x, 1), relevant_IDs);
 locations_sorted = sort(locations_final);
 precision = (1:num_relevant_images) ./ locations_sorted;
 recall = (1:num_relevant_images) / num_relevant_images;
 % generate Avg precision
 avgprec=sum(precision)/num_relevant_images;% avg precision formula
 plot(avgprec, 'b.-');
 xlabel('Category ID');
 ylabel('Average Precision');
 title('Average Precision Plot');
 axis([0 10 0 1.05]);
end 

      



2 answers


Yes, that's right. You simply add up all of your precision values and average them. This is the very definition of average precision.
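As a small worked example of that definition, here is a sketch that assumes a made-up ranked list of six results, three of which are relevant. The variable names and the data are illustrative only; the arithmetic mirrors the `precision` and `avgprec` lines in the question.

```matlab
% Hypothetical ranked result list: 1 = relevant, 0 = not relevant.
is_relevant = [1 0 1 0 0 1];

% Ranks at which the relevant images were retrieved: 1, 3, 6.
ranks = find(is_relevant);
num_relevant = numel(ranks);

% Precision at each relevant rank: 1/1, 2/3, 3/6.
precision = (1:num_relevant) ./ ranks;

% Average precision is the mean of those precision values.
avg_precision = sum(precision) / num_relevant;
fprintf('Average precision: %.4f\n', avg_precision);   % prints 0.7222
```

This assumes every relevant image appears somewhere in the ranked list, which holds in the question because all database images are ranked.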

Average precision is a single number (usually given as a percentage) that summarizes the overall performance of an image retrieval system. The higher the value, the better the performance. Precision-recall plots give you more detailed insight into how the system behaves, but average precision is useful when you are comparing many image retrieval systems. Instead of producing many P-R plots to compare them, you can simply have a table that lists every system with one number describing its performance, namely the average precision.

Also, it doesn't make sense to plot the average precision. When average precision is reported in scientific papers, there is usually no plot, just a single value. The only way I could see you plotting it is as a bar graph, where the y axis indicates the average precision and the x axis indicates which retrieval system you are comparing. The higher the bar, the better the precision. However, a table listing all of the retrieval systems, each with its average precision, is more than adequate. This is what is usually done in most CBIR research papers.




To address your other question, you calculate the average rank by using the average precision. Calculate the average precision for every retrieval system you are testing, then sort the systems by that value. Systems with a higher average precision are ranked higher.
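The ranking step can be sketched as follows. The system names and average precision values are invented for illustration, and averaging over several queries per system is an assumption on my part rather than something the question specifies.

```matlab
% Hypothetical data: rows = retrieval systems, columns = queries.
system_names = {'RGB histogram', 'System B', 'System C'};
ap_per_query = [0.60 0.70 0.50;
                0.80 0.75 0.85;
                0.40 0.55 0.45];

% Mean average precision of each system over its queries.
mean_ap = mean(ap_per_query, 2);

% Sort the systems from best to worst.
[sorted_ap, order] = sort(mean_ap, 'descend');

for r = 1:numel(order)
    fprintf('Rank %d: %s (mean AP = %.2f)\n', r, system_names{order(r)}, sorted_ap(r));
end
```

With these made-up numbers, 'System B' is ranked first, 'RGB histogram' second, and 'System C' third.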



This is what we use to calculate the average precision. The randomization step is needed because discrete scores can produce ties: if your ground-truth images happen to come first in the list, tied images keep that order after sorting and the average precision is inflated.



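function ap = computeAP(label, score, gt)
    % label: class label of each database image
    % score: similarity score of each database image (higher is better)
    % gt:    label of the relevant (ground-truth) class
    % Shuffle first so that tied scores are not kept in their original order.
    rand_index = randperm(length(label));
    label2 = label(rand_index);
    score = score(rand_index);
    % Rank the images by descending score.
    [~, sids] = sort(score, 'descend');
    label2 = label2(sids);
    % Ranks at which the relevant images appear.
    ids = find(label2 == gt);
    % Average the precision measured at each relevant rank.
    ap = 0;
    for j = 1:length(ids)
        ap  = ap + j / (ids(j) * length(ids));
    end
    fprintf('%f \n', ap);
end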
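To illustrate why the shuffle in `computeAP` matters, here is a minimal sketch of the tie problem with made-up data in which every image has the same score and the relevant images are listed first. It deliberately skips the `randperm` step, and the variable `ap_no_shuffle` is introduced only for this example.

```matlab
% Hypothetical worst case: all scores are tied and the ground truth comes first.
label = [1 1 0 0];
score = [0.5 0.5 0.5 0.5];
gt = 1;

% sort keeps tied elements in their original order.
[~, sids] = sort(score, 'descend');

% The relevant images therefore stay at ranks 1 and 2.
ids = find(label(sids) == gt);

% Same average precision formula as in computeAP, written in vectorized form.
ap_no_shuffle = sum((1:numel(ids)) ./ ids) / numel(ids);
fprintf('AP without shuffling: %.2f\n', ap_no_shuffle);   % prints 1.00
```

The scores carry no information here, yet the result is a perfect average precision. Shuffling before the sort removes that dependence on the input order.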

      
