Loading an array of integers into an AVX register

I am currently looking at AVX intrinsics to parallelize my code. As a first step, I would like to write a benchmark to see how much speedup I can get.

#include <immintrin.h>
#include <stdint.h>

void randomtable (uint32_t crypto[4][64])
{
    int k = 1;
    for (int i=0;i<4;i++)
    {
        k++;
        for (int j=0;j<64;j++)
        { crypto[i][j]= (k+j)%64; }
    }
}

int main (void)
{
    uint32_t crypt0[4][64];
    randomtable(crypt0);
    __m256i ymm0 = _m256_load_si256(&crypt0[0][0]);
}

      

My question is: how do I load the first 8 elements of the array into ymm0?

I am compiling with gcc -mavx -march=native -g -O0 -std=c99

This fails to compile with: error: incompatible types when initializing type '__m256i' using type 'int'



1 answer


This line contains a typo and is missing a cast:

__m256i ymm0 = _m256_load_si256(&crypt0[0][0]);

      

It should be:



__m256i ymm0 = _mm256_load_si256((__m256i *)&crypt0[0][0]);
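
One thing to watch with the corrected line: _mm256_load_si256 expects a 32-byte aligned address, and a plain uint32_t array on the stack is not guaranteed to be aligned that way. You can either align the array (for example with GCC's __attribute__((aligned(32)))) or use the unaligned load _mm256_loadu_si256. Here is a minimal sketch of the unaligned variant. The helper name load_first8 is made up for illustration, and the target attribute is a GCC/Clang extension that is only there so the snippet builds without -mavx:

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Same fill pattern as randomtable() in the question. */
static void randomtable(uint32_t crypto[4][64])
{
    int k = 1;
    for (int i = 0; i < 4; i++) {
        k++;
        for (int j = 0; j < 64; j++)
            crypto[i][j] = (uint32_t)((k + j) % 64);
    }
}

/* Hypothetical helper (not from the question): loads 8 consecutive
 * uint32_t values (256 bits) from src into a YMM register using the
 * unaligned load, then stores them to out so they can be inspected.
 * The target attribute is a GCC/Clang extension; it is redundant
 * when compiling with -mavx. */
__attribute__((target("avx")))
static void load_first8(const uint32_t *src, uint32_t out[8])
{
    assert(src != NULL && out != NULL);
    __m256i ymm0 = _mm256_loadu_si256((const __m256i *)src);
    _mm256_storeu_si256((__m256i *)out, ymm0);
}
```

Calling load_first8(&crypt0[0][0], out) after randomtable(crypt0) should leave the first row's first 8 values in out.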

      

Note that you will probably need AVX2 if you want to do anything more with the data (e.g. integer arithmetic), in which case you must compile with -mavx2 (-march=native already enables it if your CPU supports AVX2).
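
To illustrate the AVX2 point: 256-bit integer addition such as _mm256_add_epi32 is an AVX2 instruction, not plain AVX. Below is a rough sketch of adding two groups of 8 32-bit integers. The helper name add8_u32 is hypothetical, and the target attribute (GCC/Clang) stands in for -mavx2 on the command line:

```c
#include <assert.h>
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: adds 8 packed 32-bit integers from a and b
 * lane by lane and writes the 8 sums to sum. Addition wraps modulo
 * 2^32, so the same instruction serves signed and unsigned data.
 * Unaligned loads/stores are used so the caller's arrays need no
 * special alignment. */
__attribute__((target("avx2")))
static void add8_u32(const uint32_t *a, const uint32_t *b, uint32_t *sum)
{
    assert(a != NULL && b != NULL && sum != NULL);
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i vs = _mm256_add_epi32(va, vb);   /* requires AVX2 */
    _mm256_storeu_si256((__m256i *)sum, vs);
}
```

If the CPU lacks AVX2, calling this will fault with an illegal instruction, so on unknown hardware it is worth guarding the call, for instance with GCC's __builtin_cpu_supports("avx2").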
