CUDA third-party library CULA: a usage example
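CULA is, put simply, the CUDA version of LAPACK. It implements most of LAPACK's routines, and its function names and arguments closely follow LAPACK's. If you are not familiar with LAPACK, look it up on Wikipedia.
A free edition of the library is currently available for download. It provides only six functions, all in single precision (real and complex), listed below: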
| Type | Description | Real | Complex |
|---|---|---|---|
| General | Solves a general system of linear equations AX=B. | SGESV | CGESV |
| General | Computes an LU factorization of a general matrix, using partial pivoting with row interchanges. | SGETRF | CGETRF |
| General | Computes a QR factorization of a general rectangular matrix. | SGEQRF | CGEQRF |
| General | Computes the least squares solution to an over-determined system of linear equations, AX=B, A^T X=B, or A^H X=B, or the minimum norm solution of an under-determined system, where A is a general rectangular matrix of full rank, using a QR or LQ factorization. | SGELS | CGELS |
| General | Solves the LSE (Constrained Linear Least Squares Problem) using the GRQ (Generalized RQ) factorization. | SGGLSE | CGGLSE |
| General | Computes the singular value decomposition (SVD) of a general rectangular matrix. | SGESVD | CGESVD |
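The code below shows how to use one of these functions, SGESVD, which computes the singular value decomposition (SVD). The SVD is widely used: it factors any m x n matrix A into a product of three matrices, A = U*S*VT, where U is m x m, S is an m x n diagonal matrix whose diagonal holds the singular values in descending order, and VT (the transpose of V) is n x n.
This CULA function reportedly runs more than 20 times faster than MATLAB's corresponding SVD function. Let's see how to use it.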
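To make the shapes in this factorization concrete, here is a minimal sketch in plain C (no CULA calls) that multiplies the three factors back together. It assumes the column-major layout used by LAPACK and CULA; the function name `svd_reconstruct` is hypothetical and not part of CULA.

```c
#include <assert.h>
#include <stddef.h>

/* Rebuild the m x n matrix A from its SVD factors.
 * All matrices are column-major: element (i, j) lives at index i + j * ld.
 *   U  : m x m, leading dimension ldu
 *   S  : min(m, n) singular values (the diagonal of the m x n matrix S)
 *   VT : n x n, leading dimension ldvt
 * Computes A(i, j) = sum over k of U(i, k) * S(k) * VT(k, j). */
static void svd_reconstruct(int m, int n,
                            const float *U, int ldu,
                            const float *S,
                            const float *VT, int ldvt,
                            float *A, int lda)
{
    int k_max = (m < n) ? m : n;   /* only min(m, n) diagonal entries exist */
    assert(ldu >= m && ldvt >= n && lda >= m);

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            float sum = 0.0f;
            for (int k = 0; k < k_max; ++k) {
                sum += U[i + (size_t)k * ldu] * S[k] * VT[k + (size_t)j * ldvt];
            }
            A[i + (size_t)j * lda] = sum;
        }
    }
}
```

For example, with U = [[0,1],[1,0]], S = (3, 2) and VT equal to the 2 x 2 identity, the function fills A with [[0,2],[3,0]]. Running it on the output of an SVD routine and comparing against the original matrix is a simple sanity check.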
Header files:
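/* Headers needed by the code below. cula.h is the CULA header name assumed here; it differs between CULA releases, so check your SDK. */
#include <stdio.h>
#include <stdlib.h>
#include <cula.h>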
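/* Setup SVD Parameters */
#define imin(a, b) (((a) < (b)) ? (a) : (b)) /* integer minimum, used when sizing S */
int LDA;
int LDU;
int LDVT;
int i,j;
int* dev=NULL;
float* A = NULL;
float* S = NULL;
float* U = NULL;
float* VT = NULL;
char jobu = 'A';
char jobvt = 'A';
/* Initialize the parameters. m and n (the row and column counts of A) are assumed to be set earlier. */
/* LDA, LDU and LDVT give the physical storage distance between two adjacent elements in the same row of a matrix. Because CULA and LAPACK store matrices column by column, each leading dimension here is the number of elements in one column, i.e. the number of rows. */
LDA = m; // m: number of rows of A; n: number of columns of A
LDU = m;
LDVT = n;
A = (float*)malloc(m*n*sizeof(float));
S = (float*)malloc(imin(m,n)*sizeof(float));
U = (float*)malloc(LDU*m*sizeof(float));
VT = (float*)malloc(LDVT*n*sizeof(float));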
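The column-major convention described in the comment above is easy to get wrong when the data starts out in ordinary C (row-major) order. The following is a small sketch, under the assumption that the input is a dense row-major array; `colmajor_index` and `rowmajor_to_colmajor` are hypothetical helper names, not CULA functions.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (row, col) in a column-major array whose
 * leading dimension (distance between consecutive columns) is ld. */
static size_t colmajor_index(int row, int col, int ld)
{
    assert(row >= 0 && col >= 0 && ld > row);
    return (size_t)row + (size_t)col * (size_t)ld;
}

/* Copy an m x n row-major matrix (src, n elements per row) into a
 * column-major buffer dst with leading dimension ld >= m. */
static void rowmajor_to_colmajor(int m, int n,
                                 const float *src,
                                 float *dst, int ld)
{
    assert(ld >= m);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            dst[colmajor_index(i, j, ld)] = src[(size_t)i * n + j];
        }
    }
}
```

For a 2 x 3 matrix stored row-major as {1, 2, 3, 4, 5, 6}, the column-major buffer with ld = 2 becomes {1, 4, 2, 5, 3, 6}: two elements that are neighbors in a row end up ld entries apart.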
/* Initialize the CULA library (the calls follow the parameter description) */
/* Description of the culaSgesvd parameters */
/*Parameters
• jobu
– Type: char
– Direction: Input
Specifies options for computing all or part of the matrix U:
= ‘A’: all M columns of U are returned in array U;
= ‘S’: the first min(m,n) columns of U (the left singular vectors) are returned in the array U;
= ‘O’: the first min(m,n) columns of U (the left singular vectors) are overwritten on the array A;
= ‘N’: no columns of U (no left singular vectors) are computed.
• jobvt
– Type: char
– Direction: Input
Specifies options for computing all or part of the matrix VT :
= ‘A’: all N rows of VT are returned in the array VT;
= ‘S’: the first min(m,n) rows of VT (the right singular vectors) are returned in the array VT;
= ‘O’: the first min(m,n) rows of VT (the right singular vectors) are overwritten on the array A;
= ‘N’: no rows of VT (no right singular vectors) are computed.
JOBVT and JOBU cannot both be ‘O’.
• m
– Type: int
– Direction: Input
The number of rows of the input matrix A. M >= 0.
• n
– Type: int
– Direction: Input
The number of columns of the input matrix A. N >= 0.
• a
– Type: S/D/C/Z Pointer
– Direction: Input/Output
– Dimension: (LDA,N)
On entry, the M-by-N matrix A.
On exit,
if JOBU = ‘O’, A is overwritten with the first min(m,n) columns of U (the left singular vectors, stored
columnwise);
if JOBVT = ‘O’, A is overwritten with the first min(m,n) rows of VT (the right singular vectors,
stored rowwise);
if JOBU != ‘O’ and JOBVT != ‘O’, the contents of A are destroyed.
• lda
– Type: int
– Direction: Input
The leading dimension of the array A. LDA >= max(1,M).
• s
– Type: S/D Pointer
– Direction: Output
– Dimension: (min(M,N))
The singular values of A, sorted so that S(i) >= S(i+1).
• u
– Type: S/D/C/Z Pointer
– Direction: Output
– Dimension: (LDU,UCOL)
(LDU,M) if JOBU = ‘A’ or (LDU,min(M,N)) if JOBU = ‘S’. If JOBU = ‘A’, U contains the M-by-M
orthogonal/unitary matrix U; if JOBU = ‘S’, U contains the first min(m,n) columns of U (the left
singular vectors, stored columnwise); if JOBU = ‘N’ or ‘O’, U is not referenced.
• ldu
– Type: int
– Direction: Input
The leading dimension of the array U. LDU >= 1; if JOBU = ‘S’ or ‘A’, LDU >= M.
• vt
– Type: S/D/C/Z Pointer
– Direction: Output
– Dimension: (LDVT,N)
If JOBVT = ‘A’, VT contains the N-by-N orthogonal/unitary matrix VT ; if JOBVT = ‘S’, VT contains
the first min(m,n) rows of VT (the right singular vectors, stored rowwise); if JOBVT = ‘N’ or ‘O’,
VT is not referenced.
• ldvt
– Type: int
– Direction: Input
The leading dimension of the array VT. LDVT >= 1; if JOBVT = ‘A’, LDVT >= N; if JOBVT = ‘S’,
LDVT >= min(M,N)
*/
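culaStatus status;
status = culaSelectDevice(2); // select the GPU (device ID 2 here) that CULA runs on; call this before culaInitialize
status = culaInitialize(); // initialize CULA
status = culaSgesvd(jobu, jobvt, m, n, A, LDA, S, U, LDU, VT, LDVT); // run the SVD
if (status != culaNoError) printf("CULA error: %s\n", culaGetStatusString(status)); // report a failed call
culaShutdown(); // release CULA resources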
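The parameter description above fixes how many elements U and VT must hold for each jobu/jobvt setting. As a rough sketch of those rules in plain C (the helper names `u_elements` and `vt_elements` are hypothetical, not part of CULA), the buffer sizes could be computed like this:

```c
#include <assert.h>
#include <stddef.h>

static int min_int(int a, int b) { return (a < b) ? a : b; }

/* Number of floats U must hold: (LDU, M) for 'A',
 * (LDU, min(M, N)) for 'S', not referenced for 'N' or 'O'. */
static size_t u_elements(char jobu, int m, int n, int ldu)
{
    switch (jobu) {
    case 'A':
        assert(ldu >= m);
        return (size_t)ldu * (size_t)m;
    case 'S':
        assert(ldu >= m);
        return (size_t)ldu * (size_t)min_int(m, n);
    default:            /* 'N' or 'O': U is not referenced */
        return 0;
    }
}

/* Number of floats VT must hold: dimension (LDVT, N), with
 * LDVT >= N for 'A' and LDVT >= min(M, N) for 'S'. */
static size_t vt_elements(char jobvt, int m, int n, int ldvt)
{
    switch (jobvt) {
    case 'A':
        assert(ldvt >= n);
        return (size_t)ldvt * (size_t)n;
    case 'S':
        assert(ldvt >= min_int(m, n));
        return (size_t)ldvt * (size_t)n;
    default:            /* 'N' or 'O': VT is not referenced */
        return 0;
    }
}
```

With m = 4, n = 3, jobu = 'A' and LDU = 4, `u_elements` returns 16, which matches the `LDU*m` elements allocated for U in the listing. Switching to 'S' reduces U to 12 elements, which matters for tall, thin matrices.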
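This example only demonstrates one function of the CULA library. If you need source code and sample programs, download the free edition from the official CULA website; it includes a small SDK you can use as a reference: http://www.culatools.com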
From the "ITPUB Blog", link: http://blog.itpub.net/20259129/viewspace-662452/. If you repost this article, please credit the source; otherwise legal action may be taken.