Loading an array of integers into an AVX register
I am currently looking at AVX intrinsics to parallelize my code. As a first step, I would like to write a benchmark to see how much speedup I can get.
#include <immintrin.h>
#include <stdint.h>

void randomtable(uint32_t crypto[4][64])
{
    int k = 1;
    for (int i = 0; i < 4; i++)
    {
        k++;
        for (int j = 0; j < 64; j++)
        { crypto[i][j] = (k + j) % 64; }
    }
}

int main(void)
{
    uint32_t crypt0[4][64];
    randomtable(crypt0);
    __m256i ymm0 = _m256_load_si256(&crypt0[0][0]); /* this line fails to compile */
}
My question is: how do I load the first 8 elements of the array into ymm0?
I am compiling with gcc -mavx -march=native -g -O0 -std=c99
Compilation error: error: incompatible types when initializing type '__m256i' using type 'int'
1 answer
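The error message arises because gcc treats the misspelled, undeclared name as a function returning int. The offending line contains a typo (_m256 instead of _mm256) and is missing a cast: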
__m256i ymm0 = _m256_load_si256(&crypt0[0][0]);
It should be:
__m256i ymm0 = _mm256_load_si256((__m256i *)&crypt0[0][0]);
Note that you will probably need AVX2 if you want to do anything else with the data (e.g. 256-bit integer arithmetic), so you should compile with -mavx2.
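One further point to watch: _mm256_load_si256 requires its address to be 32-byte aligned, and a plain uint32_t array is not guaranteed to meet that. The following is a minimal sketch, not a definitive implementation, that uses the unaligned variant _mm256_loadu_si256 instead. The helper name load_first8 is hypothetical, and the target("avx") attribute is a GCC/Clang-specific assumption that lets the function build even without -mavx on the command line; it can be dropped if you already compile with -mavx.

```c
#include <immintrin.h>
#include <stdint.h>

/* Same fill pattern as randomtable() in the question. */
static void randomtable(uint32_t crypto[4][64])
{
    int k = 1;
    for (int i = 0; i < 4; i++) {
        k++;
        for (int j = 0; j < 64; j++)
            crypto[i][j] = (uint32_t)((k + j) % 64);
    }
}

/* Hypothetical helper: move 8 consecutive uint32_t values through a
 * 256-bit register. The unaligned load/store intrinsics are used so
 * that src and out need no particular alignment. */
__attribute__((target("avx")))
static void load_first8(const uint32_t *src, uint32_t out[8])
{
    __m256i ymm0 = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)out, ymm0);
}
```

Calling randomtable(crypt0) followed by load_first8(&crypt0[0][0], out) leaves out holding the first eight entries of row 0, which for this fill pattern are 2 through 9. If you prefer the aligned load, the array itself has to be declared with 32-byte alignment.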