GCC C vector extension: How to shift the contents of a vector left by one element?
I'm new to the GCC C vector extensions. I'm considering using them in my project, but their usefulness depends (to some extent) on being able to efficiently shift all elements of a vector one position to the left and store the result in a new vector. How can I do this efficiently (e.g. with SIMD acceleration)?
So basically:
- OriginalVector = {1, 2, 3, 4, 5, 6, 7, 8}
- ShiftedVector = {2, 3, 4, 5, 6, 7, 8, X} (where X can be anything.)
Background information (you can skip this): The purpose of this transformation relates to matrices in which each row is represented by a vector. In particular, it allows ShiftedVector to be treated as the up-left diagonal of the row below, so that all values can be compared in a single SIMD operation. If there is another way to compare a vector with another vector offset by one element, that would solve the problem as well. I am not claiming that the most efficient way to perform this comparison is to shift all elements left and then do a 1:1 comparison.
General stipulations:
- The original vector must not be damaged in the process
- OK if I need to use an x86 intrinsic, but I don't know which one or how
- OK if I lose the leftmost element of the vector and get garbage in the rightmost
- OK if the most efficient method is an unaligned load of the original vector from its second position to end + 1, but I would still like to know how best to code this
The bottleneck seems to be the lack of general information about how to use intrinsics. It seems people either use assembly (in which I'm not an expert) or auto-vectorization (which doesn't work here), so vector types are the most logical choice.
Thanks!
Scanning deep in the manual, I discovered this bit of tomfoolery:
typedef int v8si __attribute__ ((vector_size (32)));
v8si OriginalVector = {1, 2, 3, 4, 5, 6, 7, 8};
v8si masker = {1, 2, 3, 4, 5, 6, 7, 0};  /* ShiftedVector[i] = OriginalVector[masker[i]] */
v8si ShiftedVector = __builtin_shuffle(OriginalVector, masker);
I put 0 at the end of "masker" for no particular reason (any index from 0 to 7 works). The shuffle simply picks the elements of the original at the positions given in the mask and stores them in the result.
But while this is an answer, it may not be the "best" answer. I suspect there is a better way than creating a new mask vector, occupying a register with it, and then having each element extracted from its place and put into another arbitrary place before the result is saved.
Yes, the mask can be cached outside the loop instead of being created every time, but I'm guessing there is some simple "shift left" instruction that could just shift it...
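As a rough sketch of how the shuffle could be wrapped up for the comparison described in the question: the helper names shift_left_one and count_diagonal_matches are made up for illustration, and the code assumes GCC (__builtin_shuffle and element-wise == on vector types are GCC extensions). Because the mask is a compile-time constant, the compiler can typically keep it out of the hot loop on its own.

```c
#include <stdint.h>

typedef int32_t v8si __attribute__((vector_size(32)));

/* Hypothetical helper: result[i] = v[i + 1] for i in 0..6.
   The last lane reuses index 0, which is arbitrary ("don't care"). */
static inline v8si shift_left_one(v8si v)
{
    const v8si mask = {1, 2, 3, 4, 5, 6, 7, 0};
    return __builtin_shuffle(v, mask);
}

/* Hypothetical helper: counts lanes i in 0..6 where upper[i] == lower[i + 1].
   The vector == yields -1 in lanes that are equal and 0 elsewhere. */
static inline int count_diagonal_matches(v8si upper, v8si lower)
{
    v8si eq = (upper == shift_left_one(lower));
    int n = 0;
    for (int i = 0; i < 7; i++)   /* lane 7 holds the wrapped element, skip it */
        n -= eq[i];
    return n;
}
```

The argument is passed by value, so the original vector is left untouched. Building with AVX2 enabled (for example -mavx2) lets GCC keep each 8 x 32-bit vector in a single register; without it the code should still compile, but the vector is split into narrower operations.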
The fastest shift is no shift at all (i.e. no move, no copying):
int Data[16] = {
1, 2, 3, 4, 5, 6, 7, 8,
0, 0, 0, 0, 0, 0, 0, 0,
};
int* Ptr = Data;
// first shift
Ptr++;
// second shift
Ptr++;
// and so on.
If the algorithm allows it (that is, the number of shifts is limited and known in advance), you can reserve enough space up front and perform the "shifts" simply by advancing the pointer.
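To connect the pointer trick back to vector types, here is a minimal sketch that loads a vector starting at an arbitrary element of a padded array. The names load_v8si and load_shifted_left_one are made up for illustration, and the row is assumed to have at least one spare element after the eight real ones. memcpy is used because it makes no alignment assumption; compilers usually turn a fixed-size memcpy like this into a single unaligned vector load.

```c
#include <string.h>

typedef int v8si __attribute__((vector_size(32)));

/* Hypothetical helper: loads eight consecutive ints starting at p.
   p does not need to be 32-byte aligned. */
static inline v8si load_v8si(const int *p)
{
    v8si v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the "shifted" view of a row.
   Requires row[0..8] to be readable (eight elements plus one pad element);
   the last lane of the result is whatever the pad element holds. */
static inline v8si load_shifted_left_one(const int *row)
{
    return load_v8si(row + 1);
}
```

For a row declared as int row[9], load_v8si(row) gives the original vector and load_shifted_left_one(row) gives the same data moved left by one element, without modifying the array.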