Deep Learning with TensorFlow: Basic Data Types and Regression Algorithms in Depth - Coding Technology Advanced Practice

Posted by Kaixin's Tech Community on 2019-01-26

Original URL: https://juejin.im/post/5c4c28a451882524661d4fb7

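The Qin Kaixin tech community's "Coding Technology Advanced Practice" series is coming soon. It covers in-depth language usage and techniques across the mainstream big-data and deep-learning foundations, including Python, Java, Scala, and TensorFlow, so stay tuned. Why write this series? A container-cloud expert once asked me how to implement a thread pool, and I suddenly realized that the Java concurrency-control theory and multithreaded design patterns I had studied were completely forgotten. Annoyed with myself, I decided to put my programming skills on show.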

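Copyright notice: This technical column is the author's (Qin Kaixin) summary and distillation of everyday work. It draws on cases from real commercial environments and offers tuning advice for commercial applications, cluster capacity planning, and related topics. Please keep following this blog series. QQ email: 1120746959@qq.com. Feel free to get in touch for any technical discussion.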

1 Basic TensorFlow Operations

A basic TensorFlow model

  import tensorflow as tf
  a = 3
  # Create a variable.
  w = tf.Variable([[0.5,1.0]])
  x = tf.Variable([[2.0],[1.0]]) 
  
  y = tf.matmul(w, x)  
  
  #variables have to be explicitly initialized before you can run Ops
  init_op = tf.global_variables_initializer()
  with tf.Session() as sess:
      sess.run(init_op)
      print (y.eval())
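
The listing above does not show its output. As a rough cross-check, here is a plain-Python sketch of the matrix product that tf.matmul(w, x) computes. The helper name `matmul` is hypothetical and written only for this illustration; it is not part of TensorFlow.

```python
# Plain-Python sketch of the matrix product computed by tf.matmul(w, x).
# `matmul` is a hypothetical helper for illustration, not a TensorFlow API.

def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]

w = [[0.5, 1.0]]      # shape (1, 2), same values as the tf.Variable w
x = [[2.0], [1.0]]    # shape (2, 1), same values as the tf.Variable x
y = matmul(w, x)      # shape (1, 1)
print(y)              # 0.5 * 2.0 + 1.0 * 1.0 = 2.0
```

The TensorFlow version should evaluate to the same single value, returned as a 1x1 array.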

Basic TensorFlow data types

  # dtype given explicitly as int32 (tf.zeros defaults to float32)
  tf.zeros([3, 4], tf.int32) ==> [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
  
  # 'tensor' is [[1, 2, 3], [4, 5, 6]]
  tf.zeros_like(tensor) ==> [[0, 0, 0], [0, 0, 0]]
  tf.ones([2, 3], tf.int32) ==> [[1, 1, 1], [1, 1, 1]]
  
  # 'tensor' is [[1, 2, 3], [4, 5, 6]]
  tf.ones_like(tensor) ==> [[1, 1, 1], [1, 1, 1]]
  
  # Constant 1-D Tensor populated with value list.
  tensor = tf.constant([1, 2, 3, 4, 5, 6, 7]) => [1 2 3 4 5 6 7]
  
  # Constant 2-D tensor populated with scalar value -1.
  tensor = tf.constant(-1.0, shape=[2, 3]) => [[-1. -1. -1.]
                                                [-1. -1. -1.]]
  
  tf.linspace(10.0, 12.0, 3, name="linspace") => [ 10.0  11.0  12.0]
  
  # 'start' is 3
  # 'limit' is 18
  # 'delta' is 3
  tf.range(start, limit, delta) ==> [3, 6, 9, 12, 15]

The random_shuffle and random_normal operators

  norm = tf.random_normal([2, 3], mean=-1, stddev=4)
  
  # Shuffle the first dimension of a tensor
  c = tf.constant([[1, 2], [3, 4], [5, 6]])
  shuff = tf.random_shuffle(c)
  
  # Each time we run these ops, different results are generated
  sess = tf.Session()
  print (sess.run(norm))
  print (sess.run(shuff))
  
  [[-0.30886292  3.11809683  3.29861784]
   [-7.09597015 -1.89811802  1.75282788]]
  
  [[3 4]
   [5 6]
   [1 2]]

The complexity of a simple operation

  state = tf.Variable(0)
  new_value = tf.add(state, tf.constant(1))
  update = tf.assign(state, new_value)
  
  with tf.Session() as sess:
      sess.run(tf.global_variables_initializer())
      print(sess.run(state))    
      for _ in range(3):
          sess.run(update)
          print(sess.run(state))

Saving and loading a model

  #tf.train.Saver
  w = tf.Variable([[0.5,1.0]])
  x = tf.Variable([[2.0],[1.0]])
  y = tf.matmul(w, x)
  init_op = tf.global_variables_initializer()
  saver = tf.train.Saver()
  with tf.Session() as sess:
      sess.run(init_op)
      # Do some work with the model.
      # Save the variables to disk.
      save_path = saver.save(sess, "C://tensorflow//model//test")
      print ("Model saved in file: ", save_path)

Converting between NumPy and TensorFlow

  import numpy as np
  a = np.zeros((3,3))
  ta = tf.convert_to_tensor(a)
  with tf.Session() as sess:
       print(sess.run(ta))

TensorFlow placeholders

  input1 = tf.placeholder(tf.float32)
  input2 = tf.placeholder(tf.float32)
  # tf.mul was removed in TensorFlow 1.0; tf.multiply is the element-wise product
  output = tf.multiply(input1, input2)
  with tf.Session() as sess:
      print(sess.run([output], feed_dict={input1:[7.], input2:[2.]}))

2 Linear Regression in TensorFlow

Generating a linear dataset with NumPy

  import numpy as np
  import tensorflow as tf
  import matplotlib.pyplot as plt
  
  # Randomly generate 1000 points scattered around the line y = 0.1x + 0.3
  num_points = 1000
  vectors_set = []
  for i in range(num_points):
      x1 = np.random.normal(0.0, 0.55)
      y1 = x1 * 0.1 + 0.3 + np.random.normal(0.0, 0.03)
      vectors_set.append([x1, y1])
  
  # Split the points into x and y samples
  x_data = [v[0] for v in vectors_set]
  y_data = [v[1] for v in vectors_set]
  
  plt.scatter(x_data,y_data,c='r')
  plt.show()

Implementing the linear model in TensorFlow

  # Create a 1-D W tensor with a random initial value between -1 and 1
  W = tf.Variable(tf.random_uniform([1], -1.0, 1.0), name='W')
  # Create a 1-D b tensor initialised to 0
  b = tf.Variable(tf.zeros([1]), name='b')
  # Compute the predicted value y
  y = W * x_data + b
  
  # Loss: mean squared error between the prediction y and the actual y_data
  loss = tf.reduce_mean(tf.square(y - y_data), name='loss')
  # Optimizer: gradient descent (from the train module; the argument is the learning rate)
  optimizer = tf.train.GradientDescentOptimizer(0.5)
  
  # Training: each training step reduces this loss
  train = optimizer.minimize(loss, name='train')
  
  sess = tf.Session()
  
  init = tf.global_variables_initializer()
  sess.run(init)
  
  # Print the initial W and b
  print ("W =", sess.run(W), "b =", sess.run(b), "loss =", sess.run(loss))
  # Run 20 training steps
  for step in range(20):
      sess.run(train)
      # Print W, b and the loss after each step
      print ("W =", sess.run(W), "b =", sess.run(b), "loss =", sess.run(loss))
  # Write the graph to ./tmp (tf.train.SummaryWriter was removed in TensorFlow 1.0)
  writer = tf.summary.FileWriter("./tmp", sess.graph)

TensorFlow iteration results

  W = [ 0.96539688] b = [ 0.] loss = 0.297884
  W = [ 0.71998411] b = [ 0.28193575] loss = 0.112606
  W = [ 0.54009342] b = [ 0.28695393] loss = 0.0572231
  W = [ 0.41235447] b = [ 0.29063231] loss = 0.0292957
  W = [ 0.32164571] b = [ 0.2932443] loss = 0.0152131
  W = [ 0.25723246] b = [ 0.29509908] loss = 0.00811188
  W = [ 0.21149193] b = [ 0.29641619] loss = 0.00453103
  W = [ 0.17901111] b = [ 0.29735151] loss = 0.00272536
  W = [ 0.15594614] b = [ 0.29801565] loss = 0.00181483
  W = [ 0.13956745] b = [ 0.29848731] loss = 0.0013557
  W = [ 0.12793678] b = [ 0.29882219] loss = 0.00112418
  W = [ 0.11967772] b = [ 0.29906002] loss = 0.00100743
  W = [ 0.11381286] b = [ 0.29922891] loss = 0.000948558
  W = [ 0.10964818] b = [ 0.29934883] loss = 0.000918872
  W = [ 0.10669079] b = [ 0.29943398] loss = 0.000903903
  W = [ 0.10459071] b = [ 0.29949448] loss = 0.000896354
  W = [ 0.10309943] b = [ 0.29953739] loss = 0.000892548
  W = [ 0.10204045] b = [ 0.29956791] loss = 0.000890629
  W = [ 0.10128847] b = [ 0.29958954] loss = 0.000889661
  W = [ 0.10075447] b = [ 0.29960492] loss = 0.000889173
  W = [ 0.10037527] b = [ 0.29961586] loss = 0.000888927

  plt.scatter(x_data,y_data,c='r')
  plt.plot(x_data,sess.run(W)*x_data+sess.run(b))
  plt.show()
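
To make the optimizer less of a black box, the following is a minimal plain-Python sketch of what minimising the mean squared error by gradient descent amounts to for this one-variable model. It is not how TensorFlow implements GradientDescentOptimizer internally. The function `fit_line` is hypothetical, the data comes from the standard library's random.gauss rather than NumPy, and the starting point w = 0, b = 0 is an assumption (the TensorFlow listing starts W at a random value).

```python
import random

def fit_line(xs, ys, learning_rate=0.5, steps=20, w=0.0, b=0.0):
    """Minimise the mean squared error of y = w * x + b by gradient descent."""
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of mean((w * x + b - y) ** 2) with respect to w and b.
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

random.seed(0)  # fixed seed so the sketch is repeatable
xs = [random.gauss(0.0, 0.55) for _ in range(1000)]
ys = [0.1 * x + 0.3 + random.gauss(0.0, 0.03) for x in xs]

w, b = fit_line(xs, ys, steps=200)
print("w = %.3f, b = %.3f" % (w, b))  # should land close to 0.1 and 0.3
```

With the same learning rate of 0.5, the fitted values should approach the slope 0.1 and intercept 0.3 used to generate the data, much as W and b do in the iteration log above.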

3 Loading the MNIST Dataset

Loading

  import numpy as np
  import tensorflow as tf
  import matplotlib.pyplot as plt
  #from tensorflow.examples.tutorials.mnist import input_data
  import input_data
  
  print ("packs loaded")
  
  print ("Download and Extract MNIST dataset")
  # Use one-hot (0/1) encoding for the labels
  mnist = input_data.read_data_sets('data/', one_hot=True)
  print (" type of 'mnist' is %s" % (type(mnist)))
  print (" number of train data is %d" % (mnist.train.num_examples))
  print (" number of test data is %d" % (mnist.test.num_examples))
  
  Download and Extract MNIST dataset
  Extracting data/train-images-idx3-ubyte.gz
  Extracting data/train-labels-idx1-ubyte.gz
  Extracting data/t10k-images-idx3-ubyte.gz
  Extracting data/t10k-labels-idx1-ubyte.gz
   type of 'mnist' is <class 'tensorflow.contrib.learn.python.learn.datasets.base.Datasets'>
   number of train data is 55000
   number of test data is 10000

What does the data of MNIST look like?

  print ("What does the data of MNIST look like?")
  trainimg   = mnist.train.images
  trainlabel = mnist.train.labels
  testimg    = mnist.test.images
  testlabel  = mnist.test.labels
  print (" type of 'trainimg' is %s"    % (type(trainimg)))
  print (" type of 'trainlabel' is %s"  % (type(trainlabel)))
  print (" type of 'testimg' is %s"     % (type(testimg)))
  print (" type of 'testlabel' is %s"   % (type(testlabel)))
  print (" shape of 'trainimg' is %s"   % (trainimg.shape,))
  print (" shape of 'trainlabel' is %s" % (trainlabel.shape,))
  print (" shape of 'testimg' is %s"    % (testimg.shape,))
  print (" shape of 'testlabel' is %s"  % (testlabel.shape,))


  What does the data of MNIST look like?
   type of 'trainimg' is <class 'numpy.ndarray'>
   type of 'trainlabel' is <class 'numpy.ndarray'>
   type of 'testimg' is <class 'numpy.ndarray'>
   type of 'testlabel' is <class 'numpy.ndarray'>
   shape of 'trainimg' is (55000, 784)
   shape of 'trainlabel' is (55000, 10)
   shape of 'testimg' is (10000, 784)
   shape of 'testlabel' is (10000, 10)

What does the training data look like?

  # What does the training data look like?
  print ("What does the training data look like?")
  nsample = 5
  randidx = np.random.randint(trainimg.shape[0], size=nsample)
  
  for i in randidx:
      curr_img   = np.reshape(trainimg[i, :], (28, 28)) # 28 by 28 matrix 
      curr_label = np.argmax(trainlabel[i, :] ) # Label
      plt.matshow(curr_img, cmap=plt.get_cmap('gray'))
      plt.title("" + str(i) + "th Training Data " 
                + "Label is " + str(curr_label))
      print ("" + str(i) + "th Training Data " 
             + "Label is " + str(curr_label))
      plt.show()
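
The loop above relies on two conversions: turning a flat 784-value row into a 28 by 28 grid, and turning a one-hot label back into a digit with argmax. The sketch below shows both in plain Python. The helper names `reshape_to_rows`, `one_hot` and `decode_one_hot` are hypothetical, and the all-zero image is only a stand-in for a real row of trainimg.

```python
def reshape_to_rows(flat, rows, cols):
    """Split a flat list of rows * cols values into `rows` lists of length `cols`."""
    assert len(flat) == rows * cols, "size must match rows * cols"
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def one_hot(label, num_classes=10):
    """Encode an integer class as a 0/1 vector with a single 1."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def decode_one_hot(vector):
    """Index of the largest entry, which is what np.argmax returns for one row."""
    return max(range(len(vector)), key=lambda i: vector[i])

flat_image = [0.0] * 784                    # stand-in for one row of trainimg
image = reshape_to_rows(flat_image, 28, 28)
label_vector = one_hot(7)

print(len(image), len(image[0]))            # 28 28
print(label_vector)
print(decode_one_hot(label_vector))         # 7
```

The encoded vector for 7 has its single 1 at index 7, the same layout as the one-hot label rows of shape (55000, 10) reported above.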

Batch Learning?

 print ("Batch Learning? ")
 batch_size = 100
 batch_xs, batch_ys = mnist.train.next_batch(batch_size)
 print ("type of 'batch_xs' is %s" % (type(batch_xs)))
 print ("type of 'batch_ys' is %s" % (type(batch_ys)))
 print ("shape of 'batch_xs' is %s" % (batch_xs.shape,))
 print ("shape of 'batch_ys' is %s" % (batch_ys.shape,))

 Batch Learning? 
 type of 'batch_xs' is <class 'numpy.ndarray'>
 type of 'batch_ys' is <class 'numpy.ndarray'>
 shape of 'batch_xs' is (100, 784)
 shape of 'batch_ys' is (100, 10)
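
As an illustration of the idea behind mini-batches, here is a small plain-Python sketch that shuffles a dataset once and hands it out in fixed-size chunks. `iter_batches` is a hypothetical helper; I am assuming a simple shuffle-then-slice scheme, and the real mnist.train.next_batch may differ, for example in how it handles the end of an epoch.

```python
import random

def iter_batches(images, labels, batch_size, rng):
    """Yield (batch_images, batch_labels) pairs covering one shuffled epoch."""
    order = list(range(len(images)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        # Index images and labels with the same positions so pairs stay aligned.
        yield [images[i] for i in idx], [labels[i] for i in idx]

# Tiny stand-in dataset: 10 "images" of 4 pixels each, with integer labels.
images = [[float(i)] * 4 for i in range(10)]
labels = list(range(10))

batches = list(iter_batches(images, labels, batch_size=4, rng=random.Random(0)))
for xs, ys in batches:
    print(len(xs), ys)
```

With 10 samples and a batch size of 4, the last batch holds only the 2 leftover samples. In the MNIST case, 55000 training samples with a batch size of 100 divide evenly into 550 batches.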

4 Logistic Regression on the MNIST Dataset

TensorFlow's tf.reduce_mean function

  # x is not defined in this snippet; assuming x = tf.constant([[1., 1.], [2., 2.]])
  m1 = tf.reduce_mean(x, axis=0)
  # Result: [1.5, 1.5]
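
The snippet does not say what x is. Assuming x is the 2 by 2 matrix [[1., 1.], [2., 2.]], which is consistent with the result [1.5, 1.5], the effect of the axis argument can be sketched in plain Python as follows. The `reduce_mean` function here is a hypothetical stand-in that handles only 2-D nested lists.

```python
def reduce_mean(matrix, axis=None):
    """Mean of a 2-D nested list: over everything, down columns, or along rows."""
    rows, cols = len(matrix), len(matrix[0])
    if axis is None:
        return sum(sum(row) for row in matrix) / (rows * cols)
    if axis == 0:
        # Collapse the row dimension: one mean per column.
        return [sum(matrix[r][c] for r in range(rows)) / rows for c in range(cols)]
    if axis == 1:
        # Collapse the column dimension: one mean per row.
        return [sum(row) / cols for row in matrix]
    raise ValueError("axis must be None, 0 or 1")

x = [[1.0, 1.0], [2.0, 2.0]]   # assumed input, not given in the original
print(reduce_mean(x))          # 1.5
print(reduce_mean(x, axis=0))  # [1.5, 1.5]
print(reduce_mean(x, axis=1))  # [1.0, 2.0]
```

Axis 0 averages down each column, which is why both entries come out as 1.5.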

TensorFlow's tf.argmax

  sess = tf.InteractiveSession()
  arr = np.array([[31, 23,  4, 24, 27, 34],
                  [18,  3, 25,  0,  6, 35],
                  [28, 14, 33, 22, 20,  8],
                  [13, 30, 21, 19,  7,  9],
                  [16,  1, 26, 32,  2, 29],
                  [17, 12,  5, 11, 10, 15]])
  
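  # Append .eval() to print a value
  # Rank (number of dimensions) of the matrix: 2
  #tf.rank(arr).eval()
  
  # Shape of the matrix (rows, columns): [6, 6]
  #tf.shape(arr).eval()
  
  # Axis 0 scans down each column and returns the row index of its maximum: [0, 3, 2, 4, 0, 1]
  #tf.argmax(arr, 0).eval()
  # 0 -> 31 (arr[0, 0])
  # 3 -> 30 (arr[3, 1])
  # 2 -> 33 (arr[2, 2])
  # Axis 1 scans along each row and returns the column index of its maximum
  tf.argmax(arr, 1).eval()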
  # 5 -> 34 (arr[0, 5])
  # 5 -> 35 (arr[1, 5])
  # 2 -> 33 (arr[2, 2])

  array([5, 5, 2, 1, 3, 0], dtype=int64)
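
The two index lists quoted above can be reproduced without TensorFlow. The following plain-Python sketch uses a hypothetical `argmax` helper for 2-D nested lists; it illustrates the axis semantics and is not the TensorFlow implementation.

```python
def argmax(matrix, axis):
    """Index of the maximum along the given axis of a 2-D nested list."""
    rows, cols = len(matrix), len(matrix[0])
    if axis == 0:
        # For each column, the row index holding the largest value.
        return [max(range(rows), key=lambda r: matrix[r][c]) for c in range(cols)]
    if axis == 1:
        # For each row, the column index holding the largest value.
        return [max(range(cols), key=lambda c: row[c]) for row in matrix]
    raise ValueError("axis must be 0 or 1")

arr = [[31, 23,  4, 24, 27, 34],
       [18,  3, 25,  0,  6, 35],
       [28, 14, 33, 22, 20,  8],
       [13, 30, 21, 19,  7,  9],
       [16,  1, 26, 32,  2, 29],
       [17, 12,  5, 11, 10, 15]]

print(argmax(arr, 0))  # [0, 3, 2, 4, 0, 1]
print(argmax(arr, 1))  # [5, 5, 2, 1, 3, 0]
```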

Loading the dataset

  import numpy as np
  import tensorflow as tf
  import matplotlib.pyplot as plt
  import input_data
  
  mnist      = input_data.read_data_sets('data/', one_hot=True)
  trainimg   = mnist.train.images
  trainlabel = mnist.train.labels
  testimg    = mnist.test.images
  testlabel  = mnist.test.labels
  print ("MNIST loaded")
  
  Extracting data/train-images-idx3-ubyte.gz
  Extracting data/train-labels-idx1-ubyte.gz
  Extracting data/t10k-images-idx3-ubyte.gz
  Extracting data/t10k-labels-idx1-ubyte.gz
  MNIST loaded
  
  print (trainimg.shape)
  print (trainlabel.shape)
  print (testimg.shape)
  print (testlabel.shape)
  #print (trainimg)
  print (trainlabel[0])
  
  (55000, 784)
  (55000, 10)
  (10000, 784)
  (10000, 10)
  [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]

Building the logistic regression model in TF

  # Declare the placeholders first (each row is one sample)
  x = tf.placeholder("float", [None, 784])
  # 10 classes in total, one-hot encoded, e.g. [ 0.  0.  0.  0.  0.  0.  0.  1.  0.  0.]
  y = tf.placeholder("float", [None, 10])  # None leaves the batch size unspecified
  
  # 10-class task: 784 inputs, 10 outputs
  W = tf.Variable(tf.zeros([784, 10]))
  
  # One bias per output class
  b = tf.Variable(tf.zeros([10]))
  
  # LOGISTIC REGRESSION MODEL (10 outputs)
  actv = tf.nn.softmax(tf.matmul(x, W) + b) 
  
  # COST FUNCTION (cross-entropy loss)
  cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(actv), reduction_indices=1)) 
  
  # OPTIMIZER
  learning_rate = 0.01
  optm = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
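
To show what the actv and cost lines compute, here is a minimal plain-Python sketch of softmax followed by the mean cross-entropy over a batch. The helper names are hypothetical. Two details are additions for this sketch and do not appear in the listing: the maximum logit is subtracted before exponentiating for numerical stability, and zero entries of the label are skipped so that log is never evaluated where it contributes nothing.

```python
import math

def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # stability trick, an addition for this sketch
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot_label):
    """-sum(label * log(prob)); only the true class contributes."""
    return -sum(t * math.log(p) for p, t in zip(probs, one_hot_label) if t > 0)

def mean_cost(batch_logits, batch_labels):
    """Average cross-entropy over a batch, mirroring the reduce_mean in the cost."""
    costs = [cross_entropy(softmax(l), t) for l, t in zip(batch_logits, batch_labels)]
    return sum(costs) / len(costs)

# With W and b initialised to zeros, every logit is 0 for any input.
logits = [[0.0] * 10, [0.0] * 10]
labels = [[1.0 if i == 7 else 0.0 for i in range(10)],
          [1.0 if i == 3 else 0.0 for i in range(10)]]

cost = mean_cost(logits, labels)
print(cost)  # about 2.3026, i.e. ln(10)
```

With all-zero weights every class gets probability 0.1, so the cost starts at ln(10), roughly 2.3026, and training should drive it down from there.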

Training the TF model

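  # Number of training epochs
  training_epochs = 50
  # Number of samples per mini-batch
  batch_size      = 100
  display_step    = 5
  # accr and init are used below; the definitions here are assumed
  # PREDICTION: does the most probable class match the one-hot label?
  pred = tf.equal(tf.argmax(actv, 1), tf.argmax(y, 1))
  # ACCURACY: fraction of correct predictions
  accr = tf.reduce_mean(tf.cast(pred, "float"))
  # INITIALIZER
  init = tf.global_variables_initializer()
  # SESSION
  sess = tf.Session()
  sess.run(init)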
  # MINI-BATCH LEARNING
  for epoch in range(training_epochs):
      avg_cost = 0.
      num_batch = int(mnist.train.num_examples/batch_size)
      for i in range(num_batch): 
          batch_xs, batch_ys = mnist.train.next_batch(batch_size)
          sess.run(optm, feed_dict={x: batch_xs, y: batch_ys})
          feeds = {x: batch_xs, y: batch_ys}
          avg_cost += sess.run(cost, feed_dict=feeds)/num_batch
      # DISPLAY
      if epoch % display_step == 0:
          feeds_train = {x: batch_xs, y: batch_ys}
          feeds_test = {x: mnist.test.images, y: mnist.test.labels}
          train_acc = sess.run(accr, feed_dict=feeds_train)
          test_acc = sess.run(accr, feed_dict=feeds_test)
          print ("Epoch: %03d/%03d cost: %.9f train_acc: %.3f test_acc: %.3f" 
                 % (epoch, training_epochs, avg_cost, train_acc, test_acc))
  print ("DONE")  
  
  Epoch: 000/050 cost: 1.177906594 train_acc: 0.840 test_acc: 0.855
  Epoch: 005/050 cost: 0.440515266 train_acc: 0.860 test_acc: 0.895
  Epoch: 010/050 cost: 0.382895913 train_acc: 0.910 test_acc: 0.905
  Epoch: 015/050 cost: 0.356607343 train_acc: 0.870 test_acc: 0.909
  Epoch: 020/050 cost: 0.341326642 train_acc: 0.860 test_acc: 0.912
  Epoch: 025/050 cost: 0.330556413 train_acc: 0.910 test_acc: 0.913
  Epoch: 030/050 cost: 0.321508561 train_acc: 0.840 test_acc: 0.916
  Epoch: 035/050 cost: 0.314936944 train_acc: 0.940 test_acc: 0.917
  Epoch: 040/050 cost: 0.309805418 train_acc: 0.940 test_acc: 0.918
  Epoch: 045/050 cost: 0.305343132 train_acc: 0.960 test_acc: 0.918
  DONE
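
The training loop evaluates an accuracy op named accr. A common way to define it, which I am assuming here, is to compare the argmax of each prediction with the argmax of its one-hot label and average the matches. The plain-Python sketch below shows that calculation on made-up predictions; `argmax` and `accuracy` are hypothetical helpers.

```python
def argmax(vector):
    """Index of the largest entry in a list."""
    return max(range(len(vector)), key=lambda i: vector[i])

def accuracy(predictions, labels):
    """Fraction of rows whose predicted class matches the one-hot label."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(correct) / len(correct)

# Made-up softmax outputs for 4 samples and 3 classes (illustration only).
predictions = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
labels = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 0],
]

acc = accuracy(predictions, labels)
print(acc)  # 0.75, since 3 of the 4 predictions match
```

Note that train_acc in the log is computed on the last mini-batch of 100 samples only, which would explain why it fluctuates more than test_acc.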

5 Summary

The aim of this article is to use simple examples to build a real understanding of the design ideas behind TensorFlow.


Qin Kaixin, Shenzhen, 201812092128
