Deep Learning in R with H2O

Posted by std1984 on 2014-08-22
Environment: Windows 7, RStudio


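1. Open RStudio and run the install command
install.packages("h2o", repos = c("", getOption("repos")))  # first repos entry: URL of the H2O release repository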

2. Load the package and start a local H2O environment
library(h2o)
Loading required package: rjson
Loading required package: statmod
Loading required package: tools


----------------------------------------------------------------------


Your next step is to start H2O and get a connection object (named
'localH2O', for example):
    > localH2O = h2o.init()


For H2O package documentation, first call init() and then ask for help:
    > localH2O = h2o.init()
    > ??h2o


To stop H2O you must explicitly call shutdown (either from R, as shown
here, or from the Web UI):
    > h2o.shutdown(localH2O)


After starting H2O, you can use the Web UI at 
For more information visit 


----------------------------------------------------------------------




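Attaching package: 'h2o'


The following objects are masked from 'package:base':


    max, min, sum


Warning messages:
1: package 'h2o' was built under R version 3.0.3 
2: package 'rjson' was built under R version 3.0.3 
3: package 'statmod' was built under R version 3.0.3 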


3. Run an example
localH2O = h2o.init(ip = "localhost", port = 54321, startH2O = TRUE, Xmx = '1g')
 
H2O is not running yet, starting it now...
Performing one-time download of h2o.jar from
     
(This could take a few minutes, please be patient...)


Note:  In case of errors look at the following log files:
           C:/TMP/h2o_huangqiang01_started_from_r.out
           C:/TMP/h2o_huangqiang01_started_from_r.err


java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b17)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)


Successfully connected to
R is connected to H2O cluster:
    H2O cluster uptime:        3 seconds 408 milliseconds 
    H2O cluster version:       2.4.3.11 
    H2O cluster name:          H2O_started_from_R 
    H2O cluster total nodes:   1 
    H2O cluster total memory:  0.96 GB 
    H2O cluster total cores:   4 
    H2O cluster healthy:       TRUE 


demo(h2o.glm)

4. Train on the MNIST data

Download the training dataset:
Download the test dataset:

res <- data.frame(Training = NA, Test = NA, Duration = NA)

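# Load the data into H2O
train_h2o <- h2o.importFile(localH2O, path = "C:/Users/jerry/Downloads/mnist_train.csv")
test_h2o <- h2o.importFile(localH2O, path = "C:/Users/jerry/Downloads/mnist_test.csv")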


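# The label is in column 1; pull it into R as a factor
y_train <- as.factor(as.matrix(train_h2o[, 1]))
y_test <- as.factor(as.matrix(test_h2o[, 1]))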
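## Training the model takes a long time: CPU usage on all cores sits near 100% and the fan runs loudly. The progress bar on the last line shows how far training has got.
model <- h2o.deeplearning(x = 2:785,  # column numbers for predictors
                          y = 1,   # column number for label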
                          data = train_h2o,
                          activation = "Tanh",
                          balance_classes = TRUE,
                          hidden = c(100, 100, 100),  ## three hidden layers
                          epochs = 100)
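
To make the settings hidden = c(100, 100, 100) and activation = "Tanh" more concrete, here is a minimal base-R sketch of a network with that shape. It is not H2O's implementation: the 784 inputs (28 x 28 pixels), the softmax output layer, and the random weights are assumptions made only for illustration, and the names sizes, weights, biases and forward are hypothetical.

```r
# Illustration only: layer sizes matching the call above, assuming
# 28 x 28 = 784 pixel inputs and 10 digit classes.
sizes <- c(784, 100, 100, 100, 10)
n_in  <- head(sizes, -1)   # number of inputs to each layer
n_out <- tail(sizes, -1)   # number of units in each layer

# Weights plus biases per layer, summed over all layers
n_params <- sum(n_in * n_out + n_out)

# Hypothetical random parameters (not H2O's initialisation scheme)
set.seed(1)
weights <- Map(function(i, o) matrix(rnorm(o * i, sd = 0.05), nrow = o),
               n_in, n_out)
biases  <- lapply(n_out, numeric)

# Forward pass: Tanh on the hidden layers, softmax on the output layer
forward <- function(x) {
  a <- x
  for (k in seq_along(weights)) {
    z <- weights[[k]] %*% a + biases[[k]]
    if (k < length(weights)) {
      a <- tanh(z)
    } else {
      e <- exp(z - max(z))   # subtract the max for numerical stability
      a <- e / sum(e)
    }
  }
  as.vector(a)
}

probs <- forward(runif(784))   # one fake "image" with pixel values in [0, 1)
print(n_params)                # 99710 trainable parameters
print(length(probs))           # 10 class probabilities
```

Even this small architecture has close to 100,000 parameters, which goes some way to explaining why 100 epochs over the training set keep every core busy.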

# Print the model results
> model
IP Address: localhost 
Port      : 54321 
Parsed Data Key: mnist_train.hex 
Deep Learning Model Key: DeepLearning_9c7831f93efb58b38c3fa08cb17d4e4e


Training classification error: 0
Training mean square error: Inf


Validation classification error: 0
Validation square error: Inf


Confusion matrix:
Reported on mnist_train.hex 
        Predicted
Actual      0    1    2    3    4    5    6    7    8    9 Error
  0      5923    0    0    0    0    0    0    0    0    0     0
  1         0 6742    0    0    0    0    0    0    0    0     0
  2         0    0 5958    0    0    0    0    0    0    0     0
  3         0    0    0 6131    0    0    0    0    0    0     0
  4         0    0    0    0 5842    0    0    0    0    0     0
  5         0    0    0    0    0 5421    0    0    0    0     0
  6         0    0    0    0    0    0 5918    0    0    0     0
  7         0    0    0    0    0    0    0 6265    0    0     0
  8         0    0    0    0    0    0    0    0 5851    0     0
  9         0    0    0    0    0    0    0    0    0 5949     0
  Totals 5923 6742 5958 6131 5842 5421 5918 6265 5851 5949     0
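
The Error column and the Totals row can be recomputed by hand from the counts. The sketch below rebuilds the matrix printed above in base R; the variable names are hypothetical, and the numbers are copied from the output rather than read from the H2O model object.

```r
# Diagonal counts copied from the confusion matrix above
digits <- as.character(0:9)
counts <- c(5923, 6742, 5958, 6131, 5842, 5421, 5918, 6265, 5851, 5949)

# Every training example was classified correctly, so the matrix is diagonal
cm <- diag(counts)
dimnames(cm) <- list(Actual = digits, Predicted = digits)

# Per-class error: share of each actual digit that was misclassified
row_error <- 1 - diag(cm) / rowSums(cm)

# Overall error across all rows
overall_error <- 1 - sum(diag(cm)) / sum(cm)

print(sum(cm))         # 60000 training rows in total
print(row_error)       # 0 for every digit
print(overall_error)   # 0
```

An error of 0 here is measured on mnist_train.hex, the data the model was fitted to, so it says nothing yet about how the model does on the test set.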


> str(model)


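## Evaluate performance
yhat_train <- h2o.predict(model, train_h2o)$predict
yhat_train <- as.factor(as.matrix(yhat_train))
yhat_test <- h2o.predict(model, test_h2o)$predict
yhat_test <- as.factor(as.matrix(yhat_test))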
Compare the first 100 predictions with the actual labels
> y_test[1:100]
  [1] 7 2 1 0 4 1 4 9 5 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2 7 1 2 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4
 [67] 6 4 3 0 7 0 2 9 1 7 3 2 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9
Levels: 0 1 2 3 4 5 6 7 8 9

> yhat_test[1:100]
  [1] 7 2 1 0 4 1 8 9 4 9 0 6 9 0 1 5 9 7 3 4 9 6 6 5 4 0 7 4 0 1 3 1 3 4 7 2 7 1 2 1 1 7 4 2 3 5 1 2 4 4 6 3 5 5 6 0 4 1 9 5 7 8 9 3 7 4
 [67] 6 4 3 0 7 0 2 9 1 7 3 2 9 7 7 6 2 7 8 4 7 3 6 1 3 6 9 3 1 4 1 7 6 9
Levels: 0 1 2 3 4 5 6 7 8 9
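The results are fairly good: 98 of the first 100 predictions match the actual labels, and only positions 7 and 9 differ.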
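
Rather than comparing the two printouts by eye, the same check can be done in a few lines of base R. This sketch uses only the 100 values shown above, typed in by hand, so it runs without H2O; y_first and yhat_first are hypothetical names standing in for y_test[1:100] and yhat_test[1:100].

```r
# First 100 actual test labels, copied from the y_test[1:100] output
y_first <- c(7, 2, 1, 0, 4, 1, 4, 9, 5, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4,
             9, 6, 6, 5, 4, 0, 7, 4, 0, 1, 3, 1, 3, 4, 7, 2, 7, 1, 2, 1,
             1, 7, 4, 2, 3, 5, 1, 2, 4, 4, 6, 3, 5, 5, 6, 0, 4, 1, 9, 5,
             7, 8, 9, 3, 7, 4, 6, 4, 3, 0, 7, 0, 2, 9, 1, 7, 3, 2, 9, 7,
             7, 6, 2, 7, 8, 4, 7, 3, 6, 1, 3, 6, 9, 3, 1, 4, 1, 7, 6, 9)

# The printed predictions are identical except at positions 7 and 9
yhat_first <- y_first
yhat_first[c(7, 9)] <- c(8, 4)

# Which positions disagree, and what share of predictions is correct
mismatch <- which(y_first != yhat_first)
accuracy <- mean(y_first == yhat_first)

print(mismatch)   # 7 9
print(accuracy)   # 0.98

# Cross-tabulate actual against predicted digits
conf_tab <- table(Actual    = factor(y_first, levels = 0:9),
                  Predicted = factor(yhat_first, levels = 0:9))
print(conf_tab)
```

With the full vectors, mean(y_test == yhat_test) would give the test accuracy in the same way.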


## View and store the results
library(caret)
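res[1, 1] <- confusionMatrix(yhat_train, y_train)$overall["Accuracy"]
res[1, 2] <- confusionMatrix(yhat_test, y_test)$overall["Accuracy"]
print(res)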




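(Note: the package 'h2o' was built under R version 3.0.1, so base R should be upgraded to at least that version. Otherwise the following error appears:

> library(h2o)
Error in eval(expr, envir, enclos) : could not find function ".getNamespace"
In addition: Warning message:
package 'h2o' was built under R version 3.0.1 
Error : unable to load R code in package 'h2o'
Error: package or namespace load failed for 'h2o'

Fix: download and install the newer R version, then rebuild the other packages with update.packages(ask = FALSE, checkBuilt = TRUE)
)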
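
A quick way to tell in advance whether this problem applies is to compare the running R version with the one named in the warning. This is a minimal sketch; the threshold "3.0.1" is taken from the message above, and required and needs_upgrade are hypothetical names.

```r
# Version named in the "was built under R version ..." warning
required <- "3.0.1"

# Version of the R session that is currently running
current <- getRversion()

# TRUE when the running R is older than the version the package was built with
needs_upgrade <- current < required

if (needs_upgrade) {
  message("R ", current, " is older than ", required, "; upgrade R first.")
} else {
  message("R ", current, " is new enough; rebuilding packages may still help.")
}
```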



From "ITPUB Blog", link: http://blog.itpub.net/16582684/viewspace-1255976/. If you repost this article, please credit the source; otherwise legal action may be taken.
