How do I vectorize an array reformat?
I have a CSV file with data in each line in the format (x,y,z,t,f), where f is the value of some function at a location (x,y,z) at a point in time t. Thus, each new line in the .csv gives a new set of coordinates (x,y,z,t) with an accompanying value f. The CSV is not sorted.
I want to use imagesc to create a video of this data in the xy-plane over time. The way I did it was by reformatting M into something easier to use with imagesc. I am using three nested loops, something like this:
M = csvread('file.csv');
uniqueX = unique(M(:,1));
uniqueY = unique(M(:,2));
uniqueT = unique(M(:,4));
M_reformatted = zeros(length(uniqueX), length(uniqueY), length(uniqueT));
for i = 1:length(uniqueX)
    for j = 1:length(uniqueY)
        for k = 1:length(uniqueT)
            M_reformatted(i,j,k) = M( ...
                M(:,1)==uniqueX(i) & ...
                M(:,2)==uniqueY(j) & ...
                M(:,4)==uniqueT(k), ...
                5 ...
            );
        end
    end
end
Once I have M_reformatted, I can loop through the timesteps k and use imagesc on M_reformatted(:,:,k). But the execution of the above nested loops is very slow. Can the above be vectorized? If so, an outline of the approach would be very helpful.
Edit: as noted in the answers and comments below, I made a mistake: there are several possible z values, which I didn't take into account. With only one z value, the code above would be fine.
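For reference, the playback step described in the question might look like the minimal sketch below. The array A is a hypothetical random stand-in for M_reformatted, and the fixed color limits and the 0.1 s pause are assumptions added for illustration, not part of the original post.

```matlab
% Hypothetical stand-in for M_reformatted (4 x 5 grid, 3 time steps).
A = rand(4, 5, 3);

% Use the same color limits for every frame so colors are comparable over time.
clims = [min(A(:)) max(A(:))];

for k = 1:size(A, 3)
    imagesc(A(:,:,k), clims);          % draw time slice k
    colorbar;
    title(sprintf('time index %d', k));
    drawnow;                           % flush graphics so the frame is shown
    pause(0.1);                        % assumed frame delay
end
```

Fixing the color limits once, outside the loop, keeps imagesc from rescaling each frame independently, which would otherwise make frames hard to compare.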
This vectorized solution allows negative values of x and y, and is many times faster than the non-vectorized solution (almost 20 times faster for the test case below).
The idea is to sort the values of x, y and t in lexicographic order with sortrows, and then use reshape to obtain the time slices of M_reformatted.
Code:
idx = find(M(:,3)==0); %// find rows where z==0
M2 = M(idx,:); %// M2 has only the rows where z==0
M2(:,3) = []; %// delete z coordinate in M2
M2(:,[1 2 3]) = M2(:,[3 1 2]); %// change from (x,y,t,f) to (t,x,y,f)
M2 = sortrows(M2); %// sort rows by t, then x, then y
numT = numel(unique(M2(:,1))); %// number of unique t values
numX = numel(unique(M2(:,2))); %// number of unique x values
numY = numel(unique(M2(:,3))); %// number of unique y values
%// fill the time slice matrix with data
M_reformatted = reshape(M2(:,4), numY, numX, numT);
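Note: because y varies fastest after sorting, the result is indexed as M_reformatted(iy, ix, it), so y runs along the first dimension (the rows of the image) and x along the second (the columns). If you want them flipped, so that the layout is (x, y, t) as in the question, use M_reformatted = permute(M_reformatted,[2 1 3]) at the end of the code.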
The test case I used for M (to compare the results with the other solutions) has a (2N+1)x(2N+1)x(2N+1) spatial grid with T time slices:
N = 10;
T = 10;
[x,y,z] = meshgrid(-N:N,-N:N,-N:N);
numPoints = numel(x);
x=x(:); y=y(:); z=z(:);
s = repmat([x,y,z],T,1);
t = repmat(1:T,numPoints,1);
M = [s, t(:), rand(numPoints*T,1)];
M = M( randperm(size(M,1)), : );
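The reshape step above relies on the z == 0 plane containing exactly one row for every (x, y, t) combination. If that may not hold, a different technique can be sketched with the index output of unique together with sub2ind. The tiny matrix M below is hypothetical example data, and filling missing combinations with NaN is an assumption made for illustration.

```matlab
% Hypothetical unsorted data; rows are (x, y, z, t, f).
M = [ 2 -1 0 1 10;
     -3 -1 0 1 20;
      2  4 0 2 30;
     -3  4 0 1 40;
      2 -1 5 1 99];            % z ~= 0, dropped below

M2 = M(M(:,3) == 0, :);        % keep only the z == 0 plane

% The third output of unique gives, for each row, the position of its
% value in the sorted list of unique values.
[ux, ~, ix] = unique(M2(:,1));
[uy, ~, iy] = unique(M2(:,2));
[ut, ~, it] = unique(M2(:,4));

% Preallocate with NaN so that missing (x, y, t) combinations stay visible.
A = nan(numel(ux), numel(uy), numel(ut));

% Write every f value to its (x, y, t) slot in a single indexed assignment.
A(sub2ind(size(A), ix, iy, it)) = M2(:,5);
```

Here A is laid out as (x, y, t), matching the question, and ux, uy and ut hold the coordinate values that belong to each index. If the same (x, y, t) appears more than once, the value that ends up stored is whichever assignment is applied last.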
I don't think you need to vectorize. I think you need to change your algorithm.
You only need one loop to step through the lines of the CSV file. For each row, you have (x, y, z, t, f), so just store f in M_reformatted where it belongs. Something like this:
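% Assumes x, y and t are positive integers, so they can be used directly as indices.
M_reformatted = zeros(max(M(:,1)), max(M(:,2)), max(M(:,4)));
for row = 1:size(M,1)            % loop over the rows of M
    z = M(row, 3);
    if z ~= 0, continue; end     % only keep the z == 0 plane
    x = M(row, 1);
    y = M(row, 2);
    t = M(row, 4);
    f = M(row, 5);
    M_reformatted(x, y, t) = f;  % store the value where it belongs
end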
Also note that preallocating M_reformatted is a very good idea, but your code may have been computing the wrong size (depending on the data). I think using max, as I did, will do the right thing as long as x, y and t are positive integers.