How to get an array of sums from an array

I am new to OpenCL.

I know how to sum a 1D array, but my question is how to get an array of running sums from one 1D array in OpenCL.

    int a[1000];
    int b[1000];
    ....             // save data to a
    for (int i = 0; i < 1000; i++) {
        int sum = 0;
        for (int j = 0; j < i; j++) {
            sum += a[j];
        }
        b[i] = sum;
    }

      

Any suggestion is greatly appreciated.



2 answers


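As others have said, you want a parallel prefix sum (scan). Note that the loop in the question, with j < i, computes the exclusive variant (b[0] is 0); the inclusive variant would also add a[i]. If you are allowed to use OpenCL 2.0, it has work-group functions for this (work_group_scan_inclusive_add and work_group_scan_exclusive_add). They should have been there from the start given how often scans are used; as it is, everyone ends up implementing their own, often poorly.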

See http://http.developer.nvidia.com/GPUGems3/gpugems3_ch39.html for the algorithms typically used to teach this.
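To make the idea concrete, here is a minimal sketch of the simple (Hillis-Steele style) scan that such tutorials usually start with, simulated on the CPU in plain C. It is not an OpenCL kernel: the inner loop only stands in for the work-items that would run concurrently, and the function names and the caller-supplied scratch buffer are my own assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: Hillis-Steele inclusive scan, simulated on the CPU.
   Each pass of the outer loop corresponds to one parallel step; the inner
   loop stands in for the work-items that would run at the same time.
   tmp is scratch space for n ints supplied by the caller. */
void hillis_steele_inclusive(const int *in, int *out, int *tmp, size_t n)
{
    assert(in != NULL && out != NULL && tmp != NULL);
    memcpy(out, in, n * sizeof *in);
    for (size_t offset = 1; offset < n; offset *= 2) {
        /* Read from out, write to tmp: double buffering avoids a race. */
        for (size_t i = 0; i < n; ++i)
            tmp[i] = (i >= offset) ? out[i] + out[i - offset] : out[i];
        memcpy(out, tmp, n * sizeof *tmp);
    }
}

/* Shift an inclusive result right by one to get the exclusive form that
   the question's loop produces (first element is 0). */
void inclusive_to_exclusive(const int *inc, int *exc, size_t n)
{
    assert(inc != NULL && exc != NULL);
    for (size_t i = 0; i < n; ++i)
        exc[i] = (i == 0) ? 0 : inc[i - 1];
}
```

The outer loop runs about log2(n) times, which is where the parallel version gets its speed; the total number of additions is higher than in the plain sequential loop.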



The size you mention is too small to make good use of multiple compute units, which means you will be attacking it with a single compute unit. So just repeat the loop twice or so: with 64-256 work-items you will get the sums of that many elements very quickly. Building on the work-group functions to get general scan functions for any size is left as an exercise for the reader.
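One way to read "repeat the loop" is that a single work-group sweeps the array tile by tile and carries the running total from one tile to the next. The sketch below simulates that in plain C under that assumption; TILE and the function name are hypothetical, and the inner loop is only a stand-in for whatever parallel scan the work-group would actually perform on each tile.

```c
#include <assert.h>
#include <stddef.h>

#define TILE 256  /* assumed work-group size; the answer suggests 64-256 */

/* Hypothetical CPU stand-in for one work-group sweeping the array tile by
   tile. Within a tile, a real kernel would use a parallel scan (for
   example the OpenCL 2.0 work_group_scan_exclusive_add built-in); here a
   plain loop plays that role. */
void tiled_exclusive_scan(const int *a, int *b, size_t n)
{
    assert(a != NULL && b != NULL);
    int carry = 0;                       /* sum of all earlier tiles */
    for (size_t base = 0; base < n; base += TILE) {
        size_t end = (n - base < TILE) ? n : base + TILE;
        int running = 0;                 /* exclusive sum inside this tile */
        for (size_t i = base; i < end; ++i) {
            b[i] = carry + running;
            running += a[i];
        }
        carry += running;                /* hand the tile total onward */
    }
}
```

Calling tiled_exclusive_scan(a, b, 1000) gives the same b as the loop in the question; the tile loop only shows where the carry between passes comes from.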



This is a sequential problem. Expressed another way:

b[1] = a[0]
b[2] = b[1] + a[1]
b[3] = b[2] + a[2]
...
b[999] = b[998] + a[998]

      



Hence, having multiple threads won't help you at all. The best way to do this is on a single CPU, not with OpenCL / CUDA / OpenMP ...

This problem is completely different from a reduction, where each step can be split into two smaller steps that can be run in parallel.
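For reference, the recurrence above can be written as a single pass on the CPU. This is only a sketch in plain C; the function name is hypothetical, and it starts from b[0] = 0 to match the loop in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: single-threaded running sum using the recurrence
   b[i] = b[i-1] + a[i-1], with b[0] = 0. One pass, n - 1 additions. */
void running_sum(const int *a, int *b, size_t n)
{
    assert(a != NULL && b != NULL);
    if (n == 0)
        return;
    b[0] = 0;
    for (size_t i = 1; i < n; ++i)
        b[i] = b[i - 1] + a[i - 1];
}
```

Compared with the nested loop in the question, this does about n additions in total instead of roughly n*n/2.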
