Finding a faster way to work with cells and vectors

I have a cell array in which each cell holds a different number of indices into a vector. For example,

C = {[1 2 3], [4 5], [6], [1 8 9 12 20]}

      

This is just an example; in the real case C has between 10^4 and 10^6 cells, and each cell contains a vector of 1 to 1000 elements. I need to use each cell as a set of indices and pick out the corresponding elements of the vector. Currently I use a loop to compute the mean of the vector elements selected by each cell:

 for n=1:numel(C)
   x = mean(X(C{n}));
   % store x somewhere
 end

      

Here X is a large vector of 10,000 elements. The loop works, but is there a way to do the same thing without a loop? I ask because this code has to run many times, and the loop is currently quite slow.



1 answer


Approach # 1

C_num = char(C{:})-0; %// 2D numeric array built from C, one row per cell; shorter
             %// cells are padded with 32, the ASCII code of a space
             %// (caveat: a genuine index of 32 in C cannot be told apart
             %// from this padding)

mask = C_num==32; %// get mask for the padded positions
C_num(mask)=1; %// replace the padding with ones, so that we can index
                %// into X without an out-of-range error

X_array = X(C_num); %// 2D array obtained after indexing into X with C_num
X_array(mask) = nan; %// set the padded positions to NaN
x = nanmean(X_array,2); %// mean of each row, ignoring the NaNs
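
To make the padding step concrete, here is a small sketch that applies the first lines of Approach # 1 to the example C from the question. It only inspects the padded index matrix and the mask; no values of X are needed.

```matlab
% Example cell array from the question
C = {[1 2 3], [4 5], [6], [1 8 9 12 20]};

% One row per cell; char() pads shorter rows with spaces (code 32)
C_num = char(C{:}) - 0;
disp(C_num)
%      1     2     3    32    32
%      4     5    32    32    32
%      6    32    32    32    32
%      1     8     9    12    20

mask = C_num == 32;    % true at the padded positions
disp(sum(~mask, 2).')  % valid entries per row: 3 2 1 5
```

Because the padding value is 32, a cell that really contains the index 32 would be masked out as well, so this approach is only safe when 32 never occurs as an index.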

      


Approach # 2

lens = cellfun('length',C); %// Length of each cell in C
maxlens = max(lens); %// Max of those lengths

%// Create a mask with maxlens rows and one column per cell. In each column the
%// numbers from the corresponding cell are placed from the top down, so the
%// mask is true at the positions that receive a number and false elsewhere.
%// (This assumes C is a 1-by-N cell array whose cells are row vectors.)
mask = bsxfun(@le,(1:maxlens)',lens);

C_num = ones(maxlens,numel(lens)); %// Array that will hold the numbers from C

C_num(mask) = [C{:}]; %// Put the numbers from C into C_num
  %// NOTE: For performance you can also try out: double(sprintf('%s',C{:}))
X_array = X(C_num); %// Get the corresponding X elements
X_array(~mask) = nan; %// Set the invalid locations to NaN
x = nanmean(X_array); %// Mean of each column, i.e. of each cell, ignoring NaNs
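
As an illustration of how the mask and the padded index array in Approach # 2 fit together, the following sketch runs those steps on the example C from the question and prints the result. It stops before indexing into X, so no assumptions about X are involved.

```matlab
C = {[1 2 3], [4 5], [6], [1 8 9 12 20]};   % example from the question

lens = cellfun('length', C);                % [3 2 1 5]
maxlens = max(lens);                        % 5

% Column k of mask is true in its first lens(k) rows
mask = bsxfun(@le, (1:maxlens)', lens);

C_num = ones(maxlens, numel(lens));         % padding index 1 is always valid
C_num(mask) = [C{:}];                       % fill column by column
disp(C_num)
%      1     4     6     1
%      2     5     1     8
%      3     1     1     9
%      1     1     1    12
%      1     1     1    20
```

Each column holds one cell's indices followed by padding ones; the mask records which entries are real, so the padding can be discarded after indexing.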

      




Approach # 3

This is much the same as Approach # 2, but with a change at the end to avoid nanmean.

Replace the last two lines of Approach # 2 with:

X_array(~mask) = 0;
x = sum(X_array)./lens;
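
Putting the pieces together, a minimal self-contained sketch of Approach # 3 might look like the following. The vector X used here is made-up test data, not from the original post, and the loop at the end is only there to check the vectorized result against the straightforward version.

```matlab
% Assumed test data (not from the original post)
X = (1:20) * 10;                            % hypothetical vector, X(k) = 10*k
C = {[1 2 3], [4 5], [6], [1 8 9 12 20]};   % example from the question

lens = cellfun('length', C);
maxlens = max(lens);
mask = bsxfun(@le, (1:maxlens)', lens);

C_num = ones(maxlens, numel(lens));
C_num(mask) = [C{:}];

X_array = X(C_num);                         % same shape as C_num
X_array(~mask) = 0;                         % padded entries contribute nothing
x = sum(X_array) ./ lens;                   % per-cell means

% Reference result from a plain loop
x_loop = zeros(1, numel(C));
for n = 1:numel(C)
    x_loop(n) = mean(X(C{n}));
end

disp(max(abs(x - x_loop)))                  % should be 0 up to rounding
```

Dividing the column sums by lens rather than by the number of rows is what makes zero a safe padding value here.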

      
