CUDA third-party library CULA: a usage example

Posted by lizecn on 2010-05-11

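CULA is, put simply, a CUDA version of LAPACK. It implements most of the LAPACK routines, and its function names and arguments closely follow LAPACK's. If you are not familiar with LAPACK, look it up on Wikipedia!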

A free version of the library is currently available for download. It provides only six functions, all in single precision, listed below:

| Type | Description | Real | Complex |
|---|---|---|---|
| General | Solves a general system of linear equations AX=B. | SGESV | CGESV |
| | Computes an LU factorization of a general matrix, using partial pivoting with row interchanges. | SGETRF | CGETRF |
| | Computes a QR factorization of a general rectangular matrix. | SGEQRF | CGEQRF |
| | Computes the least squares solution to an over-determined system of linear equations, AX=B, A^T X=B, or A^H X=B, or the minimum norm solution of an under-determined system, where A is a general rectangular matrix of full rank, using a QR or LQ factorization. | SGELS | CGELS |
| | Solves the LSE (Constrained Linear Least Squares Problem) using the GRQ (Generalized RQ) factorization. | SGGLSE | CGGLSE |
| | Computes the singular value decomposition (SVD) of a general rectangular matrix. | SGESVD | CGESVD |

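The code below shows how to use one of these functions, SGESVD, which computes the singular value decomposition (SVD), a widely used factorization. It factors an arbitrary m x n matrix A into a product of three matrices, A = U*S*VT, where U is m x m, S is an m x n diagonal matrix whose diagonal holds the singular values in descending order, and VT (the transpose of V) is n x n.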
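As a small worked example (the matrix is chosen here for illustration and does not come from the CULA documentation), here is a 2 x 2 case whose factors can be checked by hand:

```latex
A = \begin{pmatrix} 3 & 0 \\ 4 & 5 \end{pmatrix}
  = \underbrace{\frac{1}{\sqrt{10}}\begin{pmatrix} 1 & 3 \\ 3 & -1 \end{pmatrix}}_{U}
    \underbrace{\begin{pmatrix} 3\sqrt{5} & 0 \\ 0 & \sqrt{5} \end{pmatrix}}_{S}
    \underbrace{\frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}}_{V^{T}}
```

The singular values 3√5 ≈ 6.708 and √5 ≈ 2.236 are the square roots of the eigenvalues (45 and 5) of A^T A, sorted in descending order. The signs of a column of U and the matching row of V^T can flip together, so a library may return a differently signed but equally valid factorization.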

CULA's implementation of this function already achieves a speedup of more than 20x over MATLAB's corresponding SVD function. Let's see how to use it!

Header files:

#include <stdio.h>    /* printf */
#include <stdlib.h>   /* malloc, free */
#include <cula.h>     /* CULA library header; the header names were lost from this listing, so cula.h is an assumption */


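/* Setup SVD Parameters */
    int LDA;              /* leading dimension of A */
    int LDU;              /* leading dimension of U */
    int LDVT;             /* leading dimension of VT */
    int i,j;
    int* dev=NULL;
    float* A = NULL;      /* input matrix, m x n, column-major */
    float* S = NULL;      /* singular values, min(m,n) entries */
    float* U = NULL;      /* left singular vectors */
    float* VT = NULL;     /* right singular vectors, transposed */
    char jobu = 'A';      /* compute all m columns of U */
    char jobvt = 'A';     /* compute all n rows of VT */
    culaStatus status;    /* return code of the CULA calls below */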

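   /* Initialize parameters */

/* LDA, LDU and LDVT are leading dimensions: the distance in memory between two adjacent elements of the same row. CULA and LAPACK store matrices in column-major order, so each leading dimension is the number of elements in one column of the stored array, i.e. its number of rows. */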

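    LDA = m;   // m is the number of rows of A, n is the number of columns of A
    LDU = m;
    LDVT = n;
    A = (float*)malloc(m*n*sizeof(float));
    S = (float*)malloc(imin(m,n)*sizeof(float));   // imin returns the smaller of m and n (helper not shown in this excerpt)
    U = (float*)malloc(LDU*m*sizeof(float));       // m x m, because jobu = 'A'
    VT = (float*)malloc(LDVT*n*sizeof(float));     // n x n, because jobvt = 'A'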
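Because C arrays are usually filled row by row, the column-major layout is an easy place to make mistakes. The sketch below shows how element (i, j) maps to a position in a column-major buffer with a given leading dimension, and how a row-major matrix could be copied into that layout before it is handed to the library. The helper names `cm_index` and `row_to_col_major` are hypothetical and are not part of CULA.

```c
#include <assert.h>
#include <stddef.h>

/* Position of element (row i, column j) in a column-major array whose
   leading dimension is lda (lda must be at least the number of rows). */
size_t cm_index(int i, int j, int lda)
{
    assert(i >= 0 && j >= 0 && lda > i);
    return (size_t)i + (size_t)j * (size_t)lda;
}

/* Copy a row-major m-by-n matrix (the usual C layout) into a
   column-major buffer with leading dimension lda. */
void row_to_col_major(const float *src, float *dst, int m, int n, int lda)
{
    int i, j;
    assert(m >= 0 && n >= 0 && lda >= m);
    for (j = 0; j < n; ++j) {
        for (i = 0; i < m; ++i) {
            dst[cm_index(i, j, lda)] = src[(size_t)i * (size_t)n + (size_t)j];
        }
    }
}
```

With LDA = m as in the listing above, the columns are packed back to back; a larger leading dimension would leave unused padding at the end of each column.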


/* Initialize the CULA library */

/* Description of the culaSgesvd parameters */

/*Parameters
• jobu
– Type: char
– Direction: Input
Specifies options for computing all or part of the matrix U:
= ‘A’: all M columns of U are returned in array U;
= ‘S’: the first min(m,n) columns of U (the left singular vectors) are returned in the array U;
= ‘O’: the first min(m,n) columns of U (the left singular vectors) are overwritten on the array A;
= ‘N’: no columns of U (no left singular vectors) are computed.
• jobvt
– Type: char
– Direction: Input
Specifies options for computing all or part of the matrix VT :
= ‘A’: all N rows of VT are returned in the array VT;
= ‘S’: the first min(m,n) rows of VT (the right singular vectors) are returned in the array VT;
= ‘O’: the first min(m,n) rows of VT (the right singular vectors) are overwritten on the array A;
= ‘N’: no rows of VT (no right singular vectors) are computed.
JOBVT and JOBU cannot both be ‘O’.
• m
– Type: int
– Direction: Input
The number of rows of the input matrix A. M >= 0.
• n
– Type: int
– Direction: Input
The number of columns of the input matrix A. N >= 0.
• a
– Type: S/D/C/Z Pointer
– Direction: Input/Output
– Dimension: (LDA,N)
On entry, the M-by-N matrix A.
On exit,
if JOBU = ‘O’, A is overwritten with the first min(m,n) columns of U (the left singular vectors, stored
columnwise);
if JOBVT = ‘O’, A is overwritten with the first min(m,n) rows of VT (the right singular vectors,
stored rowwise);
if JOBU != ‘O’ and JOBVT != ‘O’, the contents of A are destroyed.
• lda
– Type: int
– Direction: Input
The leading dimension of the array A. LDA >= max(1,M).
• s
– Type: S/D Pointer
– Direction: Output
– Dimension: (min(M,N))
The singular values of A, sorted so that S(i) >= S(i+1).
• u
– Type: S/D/C/Z Pointer
– Direction: Output
– Dimension: (LDU,UCOL)
(LDU,M) if JOBU = ‘A’ or (LDU,min(M,N)) if JOBU = ‘S’. If JOBU = ‘A’, U contains the M-by-M
orthogonal/unitary matrix U; if JOBU = ‘S’, U contains the first min(m,n) columns of U (the left
singular vectors, stored columnwise); if JOBU = ‘N’ or ‘O’, U is not referenced.
• ldu
– Type: int
– Direction: Input
The leading dimension of the array U. LDU >= 1; if JOBU = ‘S’ or ‘A’, LDU >= M.
• vt
– Type: S/D/C/Z Pointer
– Direction: Output
– Dimension: (LDVT,N)
If JOBVT = ‘A’, VT contains the N-by-N orthogonal/unitary matrix VT ; if JOBVT = ‘S’, VT contains
the first min(m,n) rows of VT (the right singular vectors, stored rowwise); if JOBVT = ‘N’ or ‘O’,
VT is not referenced.
• ldvt
– Type: int
– Direction: Input
The leading dimension of the array VT. LDVT >= 1; if JOBVT = ‘A’, LDVT >= N; if JOBVT = ‘S’,
LDVT >= min(M,N)

*/
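The dimension rules in the parameter description determine how large the U and VT buffers have to be for each job option. The sketch below turns those rules into two small helpers, assuming the tightest allowed leading dimensions (LDU = m; LDVT = n for 'A' and min(m,n) for 'S'). The names `svd_u_elems` and `svd_vt_elems` are hypothetical and are not CULA functions.

```c
#include <assert.h>
#include <stddef.h>

static int min_int(int a, int b) { return a < b ? a : b; }

/* Number of elements needed for U, assuming LDU = m.
   'A': U is (LDU, M); 'S': U is (LDU, min(M,N));
   'N' or 'O': U is not referenced, so no storage is needed. */
size_t svd_u_elems(char jobu, int m, int n)
{
    assert(m >= 0 && n >= 0);
    switch (jobu) {
    case 'A': return (size_t)m * (size_t)m;
    case 'S': return (size_t)m * (size_t)min_int(m, n);
    default:  return 0;
    }
}

/* Number of elements needed for VT, which has dimension (LDVT, N).
   'A': LDVT = n; 'S': LDVT = min(M,N);
   'N' or 'O': VT is not referenced, so no storage is needed. */
size_t svd_vt_elems(char jobvt, int m, int n)
{
    assert(m >= 0 && n >= 0);
    switch (jobvt) {
    case 'A': return (size_t)n * (size_t)n;
    case 'S': return (size_t)min_int(m, n) * (size_t)n;
    default:  return 0;
    }
}
```

For a tall matrix (m much larger than n), choosing 'S' instead of 'A' for jobu shrinks U from m*m to m*n elements, which can matter a great deal on a GPU with limited memory.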

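status = culaSelectDevice(2);  // select the GPU that CULA will run on (device 2 here; adjust for your machine)

status = culaInitialize();  // initialize the library

status = culaSgesvd(jobu, jobvt, m, n, A, LDA, S, U, LDU, VT, LDVT); // run the decomposition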
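One way to sanity-check the result on the CPU is to rebuild A from the returned factors and look at the largest deviation. Since the parameter description states that the contents of A are destroyed when neither job option is 'O', this assumes a copy of the original matrix was kept before the call. The function below is a plain-C sketch under that assumption; `svd_max_residual` is a hypothetical name, not part of CULA.

```c
#include <assert.h>

/* Largest absolute entry of A - U * diag(S) * VT.
   All matrices are column-major; A is m-by-n and must be a copy of
   the input taken before the SVD call overwrote it. Only the first
   k = min(m,n) columns of U and rows of VT contribute. */
float svd_max_residual(const float *A, int lda, const float *S,
                       const float *U, int ldu,
                       const float *VT, int ldvt, int m, int n)
{
    int k = m < n ? m : n;
    int i, j, p;
    float worst = 0.0f;

    assert(lda >= m && ldu >= m && ldvt >= k);
    for (j = 0; j < n; ++j) {
        for (i = 0; i < m; ++i) {
            float sum = 0.0f;
            float diff;
            for (p = 0; p < k; ++p) {
                sum += U[i + p * ldu] * S[p] * VT[p + j * ldvt];
            }
            diff = A[i + j * lda] - sum;
            if (diff < 0.0f) {
                diff = -diff;
            }
            if (diff > worst) {
                worst = diff;
            }
        }
    }
    return worst;
}
```

In single precision the residual should be small compared with the largest singular value; a large value usually points to a layout or leading-dimension mistake rather than to the library.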

 

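This example only demonstrates one function of the CULA library. If you need source code and sample programs, download the free version from the official CULA website; it includes a small SDK that you can use as a reference: http://www.culatools.com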

 

   

 

Source: "ITPUB blog", link: http://blog.itpub.net/20259129/viewspace-662452/. Please credit the source when reposting; otherwise legal action may be taken.
