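CUDA third-party library CULA: a usage example
CULA is, simply put, the CUDA version of LAPACK. It implements most of LAPACK's functions, and its function names and arguments closely mirror LAPACK's. If you are not familiar with LAPACK, look it up on Wikipedia.
A free version of the library is currently available for download. It provides only 6 functions, in single precision only (real and complex), as listed below: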
| Type | Description | Real | Complex |
|---|---|---|---|
| General | Solves a general system of linear equations AX=B. | SGESV | CGESV |
| | Computes an LU factorization of a general matrix, using partial pivoting with row interchanges. | SGETRF | CGETRF |
| | Computes a QR factorization of a general rectangular matrix. | SGEQRF | CGEQRF |
| | Computes the least squares solution to an over-determined system of linear equations, AX=B, A^T X=B, or A^H X=B, or the minimum norm solution of an under-determined system, where A is a general rectangular matrix of full rank, using a QR or LQ factorization. | SGELS | CGELS |
| | Solves the LSE (Constrained Linear Least Squares Problem) using the GRQ (Generalized RQ) factorization. | SGGLSE | CGGLSE |
| | Computes the singular value decomposition (SVD) of a general rectangular matrix. | SGESVD | CGESVD |
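The code below shows how to use one of these functions, SGESVD, which computes the singular value decomposition (SVD), a widely used factorization. It factors an arbitrary m x n matrix A into a product of three matrices, A = U*S*VT, where U is m x m, S is an m x n diagonal matrix whose diagonal holds the singular values sorted from largest to smallest, and VT (the transpose of V) is n x n.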
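Written out with dimensions for the real single-precision case, the factorization reads as follows. The symbols $\Sigma$ and $\sigma_i$ are notation introduced here for the matrix S and its diagonal entries; LAPACK-style routines such as SGESVD return $V^{T}$ rather than $V$.

$$
A = U \,\Sigma\, V^{T}, \qquad
U \in \mathbb{R}^{m \times m},\quad
\Sigma \in \mathbb{R}^{m \times n},\quad
V^{T} \in \mathbb{R}^{n \times n},
$$

$$
\Sigma_{ii} = \sigma_i, \qquad
\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_{\min(m,n)} \ge 0, \qquad
\Sigma_{ij} = 0 \ \text{for } i \ne j .
$$

Only the $\min(m,n)$ values $\sigma_i$ need to be stored, which is why the array S in the listing has $\min(m,n)$ entries.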
CULA's version of this function already runs more than 20 times faster than MATLAB's corresponding SVD function, so let's see how to use it.
Header files:
#include
#include
#include
#include
#include
#include
#include
#include
#include
#include
/* Setup SVD Parameters */
int LDA;
int LDU;
int LDVT;
int i,j;
int* dev=NULL;
float* A = NULL;
float* S = NULL;
float* U = NULL;
float* VT = NULL;
char jobu = 'A';
char jobvt = 'A';
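/* Initialize parameters */
/* LDA, LDU and LDVT are leading dimensions: the storage distance between two adjacent elements in the same row of a matrix. CULA and LAPACK store matrices in column-major order, so here each of them equals the number of elements in one column, i.e. the number of rows. */
LDA = m; // m: number of rows of A; n: number of columns of A
LDU = m;
LDVT = n;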
A = (float*)malloc(m*n*sizeof(float));
S = (float*)malloc(imin(m,n)*sizeof(float));
U = (float*)malloc(LDU*m*sizeof(float));
VT = (float*)malloc(LDVT*n*sizeof(float));
/* Initialize the CULA library */
/* Description of the culaSgesvd parameters */
/*Parameters
• jobu
– Type: char
– Direction: Input
Specifies options for computing all or part of the matrix U:
= ‘A’: all M columns of U are returned in array U;
= ‘S’: the first min(m,n) columns of U (the left singular vectors) are returned in the array U;
= ‘O’: the first min(m,n) columns of U (the left singular vectors) are overwritten on the array A;
= ‘N’: no columns of U (no left singular vectors) are computed.
• jobvt
– Type: char
– Direction: Input
Specifies options for computing all or part of the matrix VT :
= ‘A’: all N rows of VT are returned in the array VT;
= ‘S’: the first min(m,n) rows of VT (the right singular vectors) are returned in the array VT;
= ‘O’: the first min(m,n) rows of VT (the right singular vectors) are overwritten on the array A;
= ‘N’: no rows of VT (no right singular vectors) are computed.
JOBVT and JOBU cannot both be ‘O’.
• m
– Type: int
– Direction: Input
The number of rows of the input matrix A. M >= 0.
• n
– Type: int
– Direction: Input
The number of columns of the input matrix A. N >= 0.
• a
– Type: S/D/C/Z Pointer
– Direction: Input/Output
– Dimension: (LDA,N)
On entry, the M-by-N matrix A.
On exit,
if JOBU = ‘O’, A is overwritten with the first min(m,n) columns of U (the left singular vectors, stored
columnwise);
if JOBVT = ‘O’, A is overwritten with the first min(m,n) rows of VT (the right singular vectors,
stored rowwise);
if JOBU != ‘O’ and JOBVT != ‘O’, the contents of A are destroyed.
• lda
– Type: int
– Direction: Input
The leading dimension of the array A. LDA >= max(1,M).
• s
– Type: S/D Pointer
– Direction: Output
– Dimension: (min(M,N))
The singular values of A, sorted so that S(i) >= S(i+1).
• u
– Type: S/D/C/Z Pointer
– Direction: Output
– Dimension: (LDU,UCOL)
(LDU,M) if JOBU = ‘A’ or (LDU,min(M,N)) if JOBU = ‘S’. If JOBU = ‘A’, U contains the M-by-M
orthogonal/unitary matrix U; if JOBU = ‘S’, U contains the first min(m,n) columns of U (the left
singular vectors, stored columnwise); if JOBU = ‘N’ or ‘O’, U is not referenced.
• ldu
– Type: int
– Direction: Input
The leading dimension of the array U. LDU >= 1; if JOBU = ‘S’ or ‘A’, LDU >= M.
• vt
– Type: S/D/C/Z Pointer
– Direction: Output
– Dimension: (LDVT,N)
If JOBVT = ‘A’, VT contains the N-by-N orthogonal/unitary matrix VT ; if JOBVT = ‘S’, VT contains
the first min(m,n) rows of VT (the right singular vectors, stored rowwise); if JOBVT = ‘N’ or ‘O’,
VT is not referenced.
• ldvt
– Type: int
– Direction: Input
The leading dimension of the array VT. LDVT >= 1; if JOBVT = ‘A’, LDVT >= N; if JOBVT = ‘S’,
LDVT >= min(M,N)
*/
status = culaSelectDevice(2); // select the GPU (device 2) that CULA will run on
status = culaInitialize(); // initialize CULA
status = culaSgesvd(jobu, jobvt, m, n, A, LDA, S, U, LDU, VT, LDVT); // run the computation
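This example only demonstrates the use of a single CULA function. If you need source code and sample programs, download the free version from the official CULA website; it includes a small SDK that you can use as a reference: http://www.culatools.com
From "ITPUB Blog", link: http://blog.itpub.net/20259129/viewspace-662452/. If you repost this article, please credit the source; otherwise legal liability will be pursued.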