Transposed Matrix Multiplication (A^T * B) with cuBLAS

The problem is simple: I have two matrices, A and B, that are both M by N, where M >> N. I want to take the transpose of A and multiply it by B (A^T * B), storing the result in C, which is N by N. I have everything set up for A and B, but how do I call cublasSgemm correctly so that it does not return the wrong answer?

I understand that cuBLAS has a cublasOperation_t enum for transposing operands before the multiplication, but somehow I'm not using it quite correctly. My matrices A and B are in row-major order, i.e. [row1][row2][row3]... in device memory. This means that for A to be interpreted as A-transposed, BLAS needs to treat my A as being in column-major order. My current code looks like this:

float *A, *B, *C;
// initialize A, B, C as device arrays, fill them with values
// initialize m = num_row_A, n = num_row_B, and k = num_col_A;
// set lda = m, ldb = k, ldc = m;
// alpha = 1, beta = 0;
// set up cuBlas handle ...

cublasSgemm(handle, CUBLAS_OP_T, CUBLAS_OP_N, m, n, k, &alpha, A, lda, B, ldb, &beta, C, ldc);

      

My questions:

Have I configured m, k, n correctly?

How about lda, ldb, ldc?

Thanks!


1 answer


cuBLAS always assumes that matrices are stored in column-major order. You can either transpose your matrices into column-major order first with cublas<t>geam() (cublasSgeam() for float), or

you can treat your matrix A, stored in row-major order, as a new matrix AT stored in column-major order. Read this way, AT is exactly the transpose of A. Do the same for B to get BT. Then you can compute the matrix C, stored in column-major order, as C = AT * BT^T

float* AT = A;
float* BT = B;
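To see why aliasing the pointers like this needs no data movement, here is a minimal CPU-side sketch in plain C (no cuBLAS involved). The helper names row_major_at, col_major_at and same_buffer_is_transpose are made up for illustration; the sketch only checks that element (i, j) of row-major A and element (j, i) of column-major AT live at the same offset in the buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Element (i, j) of a row-major matrix that has `cols` columns. */
float row_major_at(const float *p, size_t cols, size_t i, size_t j) {
    return p[i * cols + j];
}

/* Element (i, j) of a column-major matrix with leading dimension `ld`. */
float col_major_at(const float *p, size_t ld, size_t i, size_t j) {
    assert(i < ld);  /* the row index has to fit inside one column */
    return p[i + j * ld];
}

/* Returns 1 if the buffer read as row-major A (M x N) is the transpose of
   the same buffer read as column-major AT (N x M, leading dimension N). */
int same_buffer_is_transpose(const float *buf, size_t M, size_t N) {
    for (size_t i = 0; i < M; ++i)
        for (size_t j = 0; j < N; ++j)
            if (row_major_at(buf, N, i, j) != col_major_at(buf, N, j, i))
                return 0;
    return 1;
}
```

Both accessors reduce to the same offset, i * N + j, which is why the check succeeds for any contents and why lda ends up being N rather than M.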

      

The leading dimension is a storage-related parameter; it does not change whether or not you use the transpose flag CUBLAS_OP_T.

lda = num_col_A = num_row_AT = N;
ldb = num_col_B = num_row_BT = N;
ldc = num_row_C = N;

      

m and n in the cuBLAS GEMM routine are the number of rows and columns of the result matrix C:

m = num_row_C = num_row_AT = num_col_A = N;
n = num_col_C = num_row_BT = num_col_B = N;

      



k is the common (inner) dimension of A^T and B:

k = num_col_AT = num_row_B = M;

      

Then you can call the GEMM routine with

cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, m, n, k, &alpha, AT, lda, BT, ldb, &beta, C, ldc);
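If you want to sanity-check these arguments without a GPU, a CPU stand-in can help. The sketch below is an assumption-laden illustration, not cuBLAS itself: ref_sgemm is a hypothetical reference routine that mimics the column-major semantics of cublasSgemm, with plain int flags (0 for CUBLAS_OP_N, nonzero for CUBLAS_OP_T) in place of cublasOperation_t and no handle. at_times_b then passes the same m, n, k, lda, ldb and ldc as the call above.

```c
#include <assert.h>

/* Hypothetical CPU reference with cuBLAS-like column-major semantics:
   C = alpha * op(A) * op(B) + beta * C, where op(X) is X or X^T. */
void ref_sgemm(int ta, int tb, int m, int n, int k, float alpha,
               const float *A, int lda, const float *B, int ldb,
               float beta, float *C, int ldc) {
    assert(ldc >= m);  /* a column of C must hold m rows */
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                float a = ta ? A[p + i * lda] : A[i + p * lda]; /* op(A)(i,p) */
                float b = tb ? B[j + p * ldb] : B[p + j * ldb]; /* op(B)(p,j) */
                acc += a * b;
            }
            /* Like BLAS, do not read C when beta is zero. */
            C[i + j * ldc] = (beta == 0.0f) ? alpha * acc
                                            : alpha * acc + beta * C[i + j * ldc];
        }
    }
}

/* Column-major C (N x N) = A^T * B, for row-major A and B (both M x N). */
void at_times_b(const float *A, const float *B, float *C, int M, int N) {
    const float *AT = A;  /* same memory, read as column-major N x M */
    const float *BT = B;
    /* OP_N on AT, OP_T on BT; m = n = N, k = M, lda = ldb = ldc = N. */
    ref_sgemm(0, 1, N, N, M, 1.0f, AT, N, BT, N, 0.0f, C, N);
}
```

For a 3 x 2 example with A = [[1,2],[3,4],[5,6]] and B = [[1,0],[0,1],[2,3]], A^T * B is [[11,18],[14,22]], so the column-major buffer C should come out as 11, 14, 18, 22.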

      

If you want the matrix C to be stored in row-major order, you can instead compute CT, stored in column-major order, using the formula CT = BT * AT^T with

cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_T, n, m, k, &alpha, BT, ldb, AT, lda, &beta, CT, ldc);

      

Note that you do not need to swap m and n here, since C is a square matrix in this case.
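To show where the swap would matter, here is a hedged CPU sketch of the row-major-output variant for a slightly more general case than the question's: B is allowed to have P columns (M x P), so C is N x P and not necessarily square. That generalization, and the names gemm_nt and at_times_b_row_major, are my own assumptions for illustration; gemm_nt is a plain C stand-in for a column-major GEMM called with (CUBLAS_OP_N, CUBLAS_OP_T), alpha = 1 and beta = 0.

```c
#include <assert.h>

/* Hypothetical CPU stand-in for a column-major GEMM with (OP_N, OP_T):
   C (m x n) = A (m x k) * B^T, where B is stored as n x k. */
void gemm_nt(int m, int n, int k, const float *A, int lda,
             const float *B, int ldb, float *C, int ldc) {
    assert(lda >= m && ldb >= n && ldc >= m);
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < m; ++i) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += A[i + p * lda] * B[j + p * ldb];
            C[i + j * ldc] = acc;
        }
}

/* Row-major C (N x P) = A^T * B, for row-major A (M x N) and B (M x P).
   Computed as column-major CT (P x N) = BT * AT^T. */
void at_times_b_row_major(const float *A, const float *B, float *C,
                          int M, int N, int P) {
    const float *AT = A;  /* column-major N x M, leading dimension N */
    const float *BT = B;  /* column-major P x M, leading dimension P */
    float *CT = C;        /* column-major P x N, leading dimension P */
    /* m = rows of CT = P, n = cols of CT = N, k = M. */
    gemm_nt(P, N, M, BT, P, AT, N, CT, P);
}
```

With P equal to N this reduces to the call above, where m, n and all three leading dimensions are N; with P different from N, m and n (and ldb, ldc) take the swapped roles shown in the comments.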
