Code Walkthrough of "T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction"

Posted by 那不太可能 on 2020-08-12

Original URL: https://www.cnblogs.com/missouter/p/13488342.html

Original blog author: Missouter. Blog: https://www.cnblogs.com/missouter/ (comments and discussion are welcome).

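This post walks through the T-GCN code from the paper's GitHub repository. It covers two files, main.py and tgcn.py. When time allows I will add a walkthrough of the remaining files, which serve as baselines (no promises).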

1. main.py

# -*- coding: utf-8 -*-
import pickle as pkl
import tensorflow as tf
import pandas as pd
import numpy as np
import math
import os
import numpy.linalg as la
from input_data import preprocess_data,load_sz_data,load_los_data
from tgcn import tgcnCell
#from gru import GRUCell 

from visualization import plot_result,plot_error
from sklearn.metrics import mean_squared_error,mean_absolute_error
#import matplotlib.pyplot as plt
import time

time_start = time.time()
###### Settings ######
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.001, 'Initial learning rate.')
flags.DEFINE_integer('training_epoch', 1, 'Number of epochs to train.')
flags.DEFINE_integer('gru_units', 64, 'hidden units of gru.')
flags.DEFINE_integer('seq_len',12 , '  time length of inputs.')
flags.DEFINE_integer('pre_len', 3, 'time length of prediction.')
flags.DEFINE_float('train_rate', 0.8, 'rate of training set.')
flags.DEFINE_integer('batch_size', 32, 'batch size.')
flags.DEFINE_string('dataset', 'los', 'sz or los.')
flags.DEFINE_string('model_name', 'tgcn', 'tgcn')
model_name = FLAGS.model_name
data_name = FLAGS.dataset
train_rate =  FLAGS.train_rate
seq_len = FLAGS.seq_len
output_dim = pre_len = FLAGS.pre_len
batch_size = FLAGS.batch_size
lr = FLAGS.learning_rate
training_epoch = FLAGS.training_epoch
gru_units = FLAGS.gru_units

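The opening section sets the basic training parameters. tf.app.flags is used to define each parameter together with its default value and a short description.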

if data_name == 'sz':
    data, adj = load_sz_data('sz')
if data_name == 'los':
    data, adj = load_los_data('los')

time_len = data.shape[0]
num_nodes = data.shape[1]
data1 =np.mat(data,dtype=np.float32)

#### normalization
max_value = np.max(data1)
data1  = data1/max_value
trainX, trainY, testX, testY = preprocess_data(data1, time_len, train_rate, seq_len, pre_len)

totalbatch = int(trainX.shape[0]/batch_size)
training_data_count = len(trainX)

This part loads the dataset and normalizes it by dividing by the maximum value. The loading functions in input_data.py are:

def load_sz_data(dataset):
    sz_adj = pd.read_csv(r'data/sz_adj.csv',header=None)
    adj = np.mat(sz_adj)
    sz_tf = pd.read_csv(r'data/sz_speed.csv')
    return sz_tf, adj

def load_los_data(dataset):
    los_adj = pd.read_csv(r'data/los_adj.csv',header=None)
    adj = np.mat(los_adj)
    los_tf = pd.read_csv(r'data/los_speed.csv')
    return los_tf, adj

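The preprocess_data function splits the dataset into training and test sets according to the ratio set at the top of main.py, then slices each part into sliding windows of seq_len input steps followed by pre_len target steps: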

def preprocess_data(data, time_len, rate, seq_len, pre_len):
    train_size = int(time_len * rate)
    train_data = data[0:train_size]
    test_data = data[train_size:time_len]
    
    trainX, trainY, testX, testY = [], [], [], []
    for i in range(len(train_data) - seq_len - pre_len):
        a = train_data[i: i + seq_len + pre_len]
        trainX.append(a[0 : seq_len])
        trainY.append(a[seq_len : seq_len + pre_len])
    for i in range(len(test_data) - seq_len -pre_len):
        b = test_data[i: i + seq_len + pre_len]
        testX.append(b[0 : seq_len])
        testY.append(b[seq_len : seq_len + pre_len])
      
    trainX1 = np.array(trainX)
    trainY1 = np.array(trainY)
    testX1 = np.array(testX)
    testY1 = np.array(testY)
    return trainX1, trainY1, testX1, testY1
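
The windowing can be checked on toy data. The sketch below is a minimal NumPy version of the same loop; make_windows and the toy array are hypothetical names and values introduced only for illustration.

```python
import numpy as np

def make_windows(series, seq_len, pre_len):
    """Slice a (time, nodes) array into (input, target) window pairs."""
    xs, ys = [], []
    # Same loop bound as preprocess_data: the last possible window is skipped.
    for i in range(len(series) - seq_len - pre_len):
        window = series[i : i + seq_len + pre_len]
        xs.append(window[:seq_len])      # first seq_len steps are the input
        ys.append(window[seq_len:])      # next pre_len steps are the target
    return np.array(xs), np.array(ys)

# Toy data: 20 time steps, 2 nodes (hypothetical values).
toy = np.arange(40, dtype=np.float32).reshape(20, 2)
X, Y = make_windows(toy, seq_len=12, pre_len=3)
print(X.shape, Y.shape)
```

With 20 time steps the loop runs 20 - 12 - 3 = 5 times, so X has shape (5, 12, 2) and Y has shape (5, 3, 2), matching the placeholder shapes [None, seq_len, num_nodes] and [None, pre_len, num_nodes] used later.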

Next, the TGCN function is defined:

def TGCN(_X, _weights, _biases):
    ###
    cell_1 = tgcnCell(gru_units, adj, num_nodes=num_nodes)
    cell = tf.nn.rnn_cell.MultiRNNCell([cell_1], state_is_tuple=True)
    _X = tf.unstack(_X, axis=1)
    outputs, states = tf.nn.static_rnn(cell, _X, dtype=tf.float32)
    m = []
    for i in outputs:
        o = tf.reshape(i,shape=[-1,num_nodes,gru_units])
        o = tf.reshape(o,shape=[-1,gru_units])
        m.append(o)
    last_output = m[-1]
    output = tf.matmul(last_output, _weights['out']) + _biases['out']
    output = tf.reshape(output,shape=[-1,num_nodes,pre_len])
    output = tf.transpose(output, perm=[0,2,1])
    output = tf.reshape(output, shape=[-1,num_nodes])
    return output, m, states

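The function first creates the T-GCN cell (tgcnCell is discussed later) and wraps it in tf.nn.rnn_cell.MultiRNNCell. MultiRNNCell can stack several recurrent layers, although only one cell is passed here. tf.unstack splits the input along the time axis into a list of seq_len tensors, and tf.nn.static_rnn builds the unrolled recurrent network specified by the cell. The output of every time step is then reshaped: first to [-1, num_nodes, gru_units] (in tf.reshape, -1 means that dimension is inferred so that the total size stays the same), then to a 2-D tensor of shape [-1, gru_units], with one row per (sample, node) pair. Only the last time step is used for prediction: tf.matmul multiplies it by the output weight matrix and the bias is added. The result is reshaped to [-1, num_nodes, pre_len], transposed with perm=[0,2,1] to [-1, pre_len, num_nodes], and finally reshaped to [-1, num_nodes], which is the final output.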
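
The chain of reshapes is easier to follow with concrete shapes. The following is a NumPy sketch of the output head only, with toy sizes and random values chosen for illustration; it is not the TensorFlow code itself.

```python
import numpy as np

batch, num_nodes, gru_units, pre_len = 4, 5, 8, 3  # toy sizes (assumed)
rng = np.random.default_rng(0)

# Last time step's cell output: one flat vector of node states per sample.
last = rng.normal(size=(batch, num_nodes * gru_units))
W = rng.normal(size=(gru_units, pre_len))   # plays the role of weights['out']
b = rng.normal(size=(pre_len,))             # plays the role of biases['out']

h = last.reshape(-1, num_nodes, gru_units).reshape(-1, gru_units)  # (batch*nodes, units)
out = h @ W + b                             # (batch*nodes, pre_len)
out = out.reshape(-1, num_nodes, pre_len)   # (batch, nodes, pre_len)
out = out.transpose(0, 2, 1)                # (batch, pre_len, nodes)
out = out.reshape(-1, num_nodes)            # (batch*pre_len, nodes)
print(out.shape)
```

Each row of the result is one predicted time step for all nodes, which is why the labels are later reshaped to [-1, num_nodes] before the loss is computed.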

The next section defines the inputs and labels as placeholders and randomly initializes the output weights and biases:

inputs = tf.placeholder(tf.float32, shape=[None, seq_len, num_nodes])
labels = tf.placeholder(tf.float32, shape=[None, pre_len, num_nodes])

weights = {
    'out': tf.Variable(tf.random_normal([gru_units, pre_len], mean=1.0), name='weight_o')}
biases = {
    'out': tf.Variable(tf.random_normal([pre_len]),name='bias_o')}

Call the TGCN model to obtain the final output, the outputs of every time step, and the final state:

if model_name == 'tgcn':
    pred,ttts,ttto = TGCN(inputs, weights, biases)

y_pred = pred

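Define the L2 regularization term over all trainable variables, and reshape the labels to match the shape of the prediction: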

lambda_loss = 0.0015
Lreg = lambda_loss * sum(tf.nn.l2_loss(tf_var) for tf_var in tf.trainable_variables())
label = tf.reshape(labels, [-1,num_nodes])

Define the loss function:

loss = tf.reduce_mean(tf.nn.l2_loss(y_pred-label) + Lreg)

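This corresponds to the loss in the paper (see the previous post for details): $loss = \lVert Y_t - \hat{Y}_t \rVert + \lambda L_{reg}$, where $L_{reg}$ is the L2 regularization term and $\lambda$ is lambda_loss in the code.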

Define the root-mean-square error (RMSE):

error = tf.sqrt(tf.reduce_mean(tf.square(y_pred-label)))
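
To see what the loss and the RMSE evaluate to, here is a small NumPy sketch. The l2_loss helper mirrors tf.nn.l2_loss, which returns half the sum of squares; the arrays and the single stand-in weight are made-up values for illustration.

```python
import numpy as np

def l2_loss(t):
    # Mirrors tf.nn.l2_loss: half the sum of squares (no square root, no mean).
    return np.sum(np.square(t)) / 2.0

y_pred = np.array([[0.5, 0.2], [0.1, 0.4]])
label = np.array([[0.4, 0.2], [0.3, 0.4]])
weights = [np.array([[1.0, -2.0]])]   # stand-in for the trainable variables
lambda_loss = 0.0015

l_reg = lambda_loss * sum(l2_loss(w) for w in weights)   # regularization term
loss = l2_loss(y_pred - label) + l_reg                   # training loss
rmse = np.sqrt(np.mean(np.square(y_pred - label)))       # reported error
print(loss, rmse)
```

Note that the training loss is a sum over all elements while the RMSE is a mean, so the two are on different scales and the loss grows with the batch size.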

Define the optimizer:

optimizer = tf.train.AdamOptimizer(lr).minimize(loss)

Initialize the training session:

variables = tf.global_variables()
saver = tf.train.Saver(tf.global_variables()) #
#sess = tf.Session()
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
sess.run(tf.global_variables_initializer())
out = 'out/%s'%(model_name)
#out = 'out/%s_%s'%(model_name,'perturbation')
path1 = '%s_%s_lr%r_batch%r_unit%r_seq%r_pre%r_epoch%r'%(model_name,data_name,lr,batch_size,gru_units,seq_len,pre_len,training_epoch)
path = os.path.join(out,path1)
if not os.path.exists(path):
    os.makedirs(path)

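Here tf.global_variables returns the variables in the graph; together with tf.train.Saver it is used to save the trained model parameters for later validation or testing. tf.GPUOptions limits GPU usage: per_process_gpu_memory_fraction=0.333 caps the process at roughly one third of the GPU memory. Why one third was chosen is not clear to me; perhaps it is a training trick. After the model parameters are initialized, the output path and file name are set, which I will not discuss in detail.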

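The evaluation function in this file defines the metrics used in the paper's experiments: root-mean-square error (RMSE), mean absolute error (MAE), accuracy, coefficient of determination (R2), and explained variance score.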

def evaluation(a,b):
    rmse = math.sqrt(mean_squared_error(a,b))
    mae = mean_absolute_error(a, b)
    F_norm = la.norm(a-b,'fro')/la.norm(a,'fro')
    r2 = 1-((a-b)**2).sum()/((a-a.mean())**2).sum()
    var = 1-(np.var(a-b))/np.var(a)
    return rmse, mae, 1-F_norm, r2, var
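
A quick sanity check of these metrics on toy data helps to see how they differ. The sketch below reimplements the same formulas in plain NumPy (swapping out the scikit-learn calls so that it runs without that dependency); evaluation_np and the toy arrays are hypothetical.

```python
import numpy as np

def evaluation_np(a, b):
    """a: ground truth, b: prediction; both 2-D arrays of the same shape."""
    rmse = np.sqrt(np.mean((a - b) ** 2))
    mae = np.mean(np.abs(a - b))
    # Accuracy: 1 minus the relative error in the Frobenius norm.
    acc = 1 - np.linalg.norm(a - b, 'fro') / np.linalg.norm(a, 'fro')
    r2 = 1 - ((a - b) ** 2).sum() / ((a - a.mean()) ** 2).sum()
    var = 1 - np.var(a - b) / np.var(a)
    return rmse, mae, acc, r2, var

truth = np.array([[1.0, 2.0], [3.0, 4.0]])
perfect = evaluation_np(truth, truth.copy())
shifted = evaluation_np(truth, truth + 0.5)   # constant offset of 0.5
print(perfect)
print(shifted)
```

For the shifted prediction, R2 drops to 0.8 while the explained variance stays at 1.0, because the variance of a constant error is zero. The two scores only agree when the prediction error has zero mean.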

Next comes the usual training loop:

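# Accumulators for the per-batch and per-epoch metrics used below.
batch_loss, batch_rmse = [], []
test_loss, test_rmse, test_mae, test_acc, test_r2, test_var, test_pred = [], [], [], [], [], [], []
for epoch in range(training_epoch):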
    for m in range(totalbatch):
        mini_batch = trainX[m * batch_size : (m+1) * batch_size]
        mini_label = trainY[m * batch_size : (m+1) * batch_size]
        _, loss1, rmse1, train_output = sess.run([optimizer, loss, error, y_pred],
                                                 feed_dict = {inputs:mini_batch, labels:mini_label})
        batch_loss.append(loss1)
        batch_rmse.append(rmse1 * max_value)

    # Test completely at every epoch
    loss2, rmse2, test_output = sess.run([loss, error, y_pred],
                                         feed_dict = {inputs:testX, labels:testY})
    test_label = np.reshape(testY,[-1,num_nodes])
    rmse, mae, acc, r2_score, var_score = evaluation(test_label, test_output)
    test_label1 = test_label * max_value  # de-normalize
    test_output1 = test_output * max_value
    test_loss.append(loss2)
    test_rmse.append(rmse * max_value)
    test_mae.append(mae * max_value)
    test_acc.append(acc)
    test_r2.append(r2_score)
    test_var.append(var_score)
    test_pred.append(test_output1)
    
    print('Iter:{}'.format(epoch),
          'train_rmse:{:.4}'.format(batch_rmse[-1]),
          'test_loss:{:.4}'.format(loss2),
          'test_rmse:{:.4}'.format(rmse),
          'test_acc:{:.4}'.format(acc))
    if (epoch % 500 == 0):        
        saver.save(sess, path+'/model_100/TGCN_pre_%r'%epoch, global_step = epoch)
        
time_end = time.time()
print(time_end-time_start,'s')

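Each epoch ends with an evaluation on the full test set and de-normalization of the results. The model is saved every 500 epochs, and the metrics are printed and stored. The end of the script prepares the results for visualization: it averages the training metrics per epoch and writes the test prediction with the lowest RMSE to a CSV file: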

b = int(len(batch_rmse)/totalbatch)
batch_rmse1 = [i for i in batch_rmse]
train_rmse = [(sum(batch_rmse1[i*totalbatch:(i+1)*totalbatch])/totalbatch) for i in range(b)]
batch_loss1 = [i for i in batch_loss]
train_loss = [(sum(batch_loss1[i*totalbatch:(i+1)*totalbatch])/totalbatch) for i in range(b)]

index = test_rmse.index(np.min(test_rmse))
test_result = test_pred[index]
var = pd.DataFrame(test_result)
var.to_csv(path+'/test_result.csv',index = False,header = False)
#plot_result(test_result,test_label1,path)
#plot_error(train_rmse,train_loss,test_rmse,test_acc,test_mae,path)

print('min_rmse:%r'%(np.min(test_rmse)),
      'min_mae:%r'%(test_mae[index]),
      'max_acc:%r'%(test_acc[index]),
      'r2:%r'%(test_r2[index]),
      'var:%r'%test_var[index])
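
The per-epoch averaging and the selection of the best epoch can be traced with small numbers. This is a sketch with made-up metric values, not output from an actual run.

```python
totalbatch = 3                                   # batches per epoch (toy value)
batch_rmse = [4.0, 5.0, 6.0, 1.0, 2.0, 3.0]      # two epochs of per-batch RMSE

# Average every block of totalbatch consecutive values into one epoch value.
n_epochs = len(batch_rmse) // totalbatch
train_rmse = [sum(batch_rmse[i * totalbatch:(i + 1) * totalbatch]) / totalbatch
              for i in range(n_epochs)]

test_rmse = [4.8, 2.1]                           # one test RMSE per epoch
best = test_rmse.index(min(test_rmse))           # epoch with the lowest test RMSE
print(train_rmse, best)
```

Here train_rmse becomes [5.0, 2.0] and best is 1, so the prediction stored for the second epoch would be the one written to test_result.csv.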

That concludes the walkthrough of main.py.

2. tgcn.py

This file defines a single class, the T-GCN cell. The initialization part is not discussed in detail:

# -*- coding: utf-8 -*-

#import numpy as np
import tensorflow as tf
from tensorflow.contrib.rnn import RNNCell
from utils import calculate_laplacian

class tgcnCell(RNNCell):
    """Temporal Graph Convolutional Network """

    def call(self, inputs, **kwargs):
        pass
    def __init__(self, num_units, adj, num_nodes, input_size=None,
                 act=tf.nn.tanh, reuse=None):


        super(tgcnCell, self).__init__(_reuse=reuse)
        self._act = act
        self._nodes = num_nodes
        self._units = num_units
        self._adj = []
        self._adj.append(calculate_laplacian(adj))


    @property
    def state_size(self):
        return self._nodes * self._units

    @property
    def output_size(self):
        return self._units

One of the key parts is the definition of the GRU cell:

    def __call__(self, inputs, state, scope=None):
        with tf.variable_scope(scope or "tgcn"):
            with tf.variable_scope("gates"):  
                value = tf.nn.sigmoid(
                    self._gc(inputs, state, 2 * self._units, bias=1.0, scope=scope))
                r, u = tf.split(value=value, num_or_size_splits=2, axis=1)
            with tf.variable_scope("candidate"):
                r_state = r * state
                c = self._act(self._gc(inputs, r_state, self._units, scope=scope))
            new_h = u * state + (1 - u) * c
        return new_h, new_h

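The code reproduces the computation of the T-GCN cell from the paper (see the previous post for details): the gates are $u_t = \sigma(W_u[f(A, X_t), h_{t-1}] + b_u)$ and $r_t = \sigma(W_r[f(A, X_t), h_{t-1}] + b_r)$, and the new state is $h_t = u_t * h_{t-1} + (1 - u_t) * c_t$, where $c_t$ is the candidate state.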

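The state argument corresponds to the previous hidden state in the paper, $h_{t-1}$. tf.variable_scope puts variables into separate namespaces, so that variables created in different scopes (here "gates" and "candidate") can share the same name. In the code above, tf.nn.sigmoid is the activation function applied to the result of the graph convolution _gc, and tf.split divides the convolved tensor into two halves: the reset gate r controls how much of the previous state is ignored, and the update gate u controls how much of the previous state is carried over to the next state. The c in the candidate part corresponds to the formula $c_t = \tanh(W_c[f(A, X_t), (r_t * h_{t-1})] + b_c)$.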

Finally, the function returns the new state $h_t$, both as the cell output and as the next state.
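
The gating logic can be sketched in NumPy independently of the graph convolution. In the sketch below, gc_gates and gc_cand are hypothetical stand-ins for self._gc: they are plain dense layers rather than graph convolutions, and the shapes are simplified so that the input and the state have the same width.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h, gc_gates, gc_cand):
    """One T-GCN-style update. gc_gates/gc_cand stand in for self._gc."""
    value = sigmoid(gc_gates(x, h))          # (batch, 2 * n)
    r, u = np.split(value, 2, axis=1)        # reset gate, update gate
    c = np.tanh(gc_cand(x, r * h))           # candidate state
    return u * h + (1.0 - u) * c             # new hidden state h_t

rng = np.random.default_rng(0)
batch, n = 2, 6                              # n plays the role of num_nodes * units
Wg = rng.normal(size=(2 * n, 2 * n))
Wc = rng.normal(size=(2 * n, n))
# Hypothetical stand-ins: a dense layer on [x, h] instead of a graph convolution.
gates = lambda x, h: np.concatenate([x, h], axis=1) @ Wg + 1.0   # bias=1.0 as in the code
cand = lambda x, h: np.concatenate([x, h], axis=1) @ Wc

x = rng.normal(size=(batch, n))
h = rng.normal(size=(batch, n))
new_h = gru_step(x, h, gates, cand)
print(new_h.shape)
```

Because the update is a convex combination, a gate value of u close to 1 keeps the old state almost unchanged, and a value close to 0 replaces it with the candidate. The gate bias of 1.0 therefore nudges the cell toward keeping its state early in training.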

The graph convolution is defined last:

    def _gc(self, inputs, state, output_size, bias=0.0, scope=None):
        ## inputs:(-1,num_nodes)
        
        inputs = tf.expand_dims(inputs, 2)
        ## state:(batch,num_node,gru_units)
        state = tf.reshape(state, (-1, self._nodes, self._units))
        ## concat
        x_s = tf.concat([inputs, state], axis=2)
        input_size = x_s.get_shape()[2].value
        ## (num_node,input_size,-1)
        x0 = tf.transpose(x_s, perm=[1, 2, 0])  
        x0 = tf.reshape(x0, shape=[self._nodes, -1])
  
        scope = tf.get_variable_scope()
        with tf.variable_scope(scope):
            for m in self._adj:
                x1 = tf.sparse_tensor_dense_matmul(m, x0)
#                print(x1)
            x = tf.reshape(x1, shape=[self._nodes, input_size,-1])
            x = tf.transpose(x,perm=[2,0,1])
            x = tf.reshape(x, shape=[-1, input_size])
            weights = tf.get_variable(
                'weights', [input_size, output_size], initializer=tf.contrib.layers.xavier_initializer())
            x = tf.matmul(x, weights)  # (batch_size * self._nodes, output_size)
            biases = tf.get_variable(
                "biases", [output_size], initializer=tf.constant_initializer(bias, dtype=tf.float32))
            x = tf.nn.bias_add(x, biases)
            x = tf.reshape(x, shape=[-1, self._nodes, output_size])
            x = tf.reshape(x, shape=[-1, self._nodes * output_size])
        return x

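The function starts by building the feature matrix. expand_dims adds a trailing dimension to the inputs, giving [batch, num_nodes, 1]; the state is reshaped to [batch, num_nodes, gru_units]; and concat joins the two along axis 2 (the feature axis), giving [batch, num_nodes, gru_units + 1]. This tensor is transposed and reshaped into a 2-D matrix with one row per node, so that a single sparse multiplication covers the whole batch. After get_variable_scope retrieves the current scope, the feature matrix is left-multiplied by the matrix stored in self._adj, which calculate_laplacian derives from the adjacency matrix. The result is reshaped back to [-1, input_size], multiplied by the weights, and tf.nn.bias_add adds the bias. The paper defines a two-layer GCN, $f(X, A) = \sigma(\hat{A}\,\mathrm{ReLU}(\hat{A} X W_0) W_1)$; the code shown here performs a single propagation step, and the nonlinearity is applied by the caller in __call__.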

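Finally the output x is returned. The function goes through many tensor reshapes, which together implement the spatial-dependence modeling described in the paper.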
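
The transpose-and-reshape trick in _gc can be verified against a direct per-sample multiplication. The NumPy sketch below uses a dense matrix instead of a sparse tensor, and normalized_adj is an assumption about what calculate_laplacian computes (symmetric normalization with self-loops), since utils.py is not shown in this post.

```python
import numpy as np

def normalized_adj(adj):
    """D^-1/2 (A + I) D^-1/2; an assumed form of calculate_laplacian."""
    a = adj + np.eye(adj.shape[0])               # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))    # degrees are >= 1, so this is safe
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

rng = np.random.default_rng(0)
batch, nodes, feats = 3, 4, 5                    # toy sizes
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)      # a 4-node path graph
a_hat = normalized_adj(adj)
x_s = rng.normal(size=(batch, nodes, feats))     # concatenated [input, state]

# The trick from _gc: move nodes to the front, flatten, multiply once, undo.
x0 = x_s.transpose(1, 2, 0).reshape(nodes, -1)   # (nodes, feats * batch)
x1 = a_hat @ x0
x = x1.reshape(nodes, feats, -1).transpose(2, 0, 1)   # (batch, nodes, feats)

# Reference: apply a_hat to every sample separately.
ref = np.einsum('ij,bjf->bif', a_hat, x_s)
print(np.allclose(x, ref))
```

Folding the batch into the column dimension lets one 2-D sparse-dense product handle all samples, which is presumably why the code takes this route instead of a batched multiplication.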

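That concludes the walkthrough of the T-GCN code from the paper. The modular structure of the code has a lot to teach about organizing experiments, and the T-GCN implementation itself, along with its tensor manipulations, offers plenty worth borrowing.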
