TensorFlow tutorial: tf.contrib.rnn.DropoutWrapper

ChasingdreamLY, published 2018-11-25

tf.contrib.rnn.DropoutWrapper 
Defined in tensorflow/python/ops/rnn_cell_impl.py.

__init__(
    cell,
    input_keep_prob=1.0,
    output_keep_prob=1.0,
    state_keep_prob=1.0,
    variational_recurrent=False,
    input_size=None,
    dtype=None,
    seed=None,
    dropout_state_filter_visitor=None
)


 

Args:

cell:  an RNNCell to which input, output, and/or state dropout will be applied.
input_keep_prob:  unit Tensor or float between 0 and 1, input keep probability; if it is constant and 1, no input dropout will be added.
output_keep_prob:  unit Tensor or float between 0 and 1, output keep probability; if it is constant and 1, no output dropout will be added.
state_keep_prob:  unit Tensor or float between 0 and 1, state keep probability; if it is constant and 1, no state dropout will be added. State dropout is performed on the outgoing states of the cell. Note the state components to which dropout is applied when state_keep_prob is in (0, 1) are also determined by the argument dropout_state_filter_visitor (e.g. by default dropout is never applied to the c component of an LSTMStateTuple).
variational_recurrent:  Python bool. If True, then the same dropout pattern is applied across all time steps per run call. If this parameter is set, input_size must be provided.
input_size:  (optional) (possibly nested tuple of) TensorShape objects containing the depth(s) of the input tensors expected to be passed in to the DropoutWrapper. Required and used iff variational_recurrent = True and input_keep_prob < 1.
dtype:  (optional) The dtype of the input, state, and output tensors. Required and used iff variational_recurrent = True.
seed:  (optional) integer, the randomness seed.
dropout_state_filter_visitor:  (optional), default: (see below). Function that takes any hierarchical level of the state and returns a scalar or depth=1 structure of Python booleans describing which terms in the state should be dropped out. In addition, if the function returns True, dropout is applied across this sublevel. If the function returns False, dropout is not applied across this entire sublevel. Default behavior: perform dropout on all terms except the memory (c) state of LSTMCellState objects, and don't try to apply dropout to TensorArray objects:

def dropout_state_filter_visitor(s):
    if isinstance(s, LSTMCellState):
        # Never perform dropout on the c state.
        return LSTMCellState(c=False, h=True)
    elif isinstance(s, TensorArray):
        return False
    return True
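A minimal usage sketch (TensorFlow 1.x, where tf.contrib.rnn.DropoutWrapper is the same class as tf.nn.rnn_cell.DropoutWrapper); the cell size, keep probabilities, and input depth below are placeholder values chosen for illustration:

import tensorflow as tf

num_units = 128   # placeholder hidden size
keep_prob = 0.8   # placeholder keep probability

# Standard per-time-step dropout on the cell's inputs and outputs.
cell = tf.nn.rnn_cell.DropoutWrapper(
    tf.nn.rnn_cell.LSTMCell(num_units),
    input_keep_prob=keep_prob,
    output_keep_prob=keep_prob)

# Variational recurrent dropout: the same mask is reused at every
# time step, so input_size and dtype must also be provided.
var_cell = tf.nn.rnn_cell.DropoutWrapper(
    tf.nn.rnn_cell.LSTMCell(num_units),
    input_keep_prob=keep_prob,
    state_keep_prob=keep_prob,
    variational_recurrent=True,
    input_size=tf.TensorShape([64]),  # depth of the input tensors (placeholder)
    dtype=tf.float32)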



Dropout means that every time data flows through the network, each unit works normally with a certain probability (keep_prob) and otherwise outputs 0. It is an effective regularization method that helps prevent overfitting. Applying dropout to an RNN is different from applying it to a CNN; the paper "Recurrent Neural Network Regularization" (Zaremba et al.) is recommended reading.
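To see the mechanism in isolation, here is a toy sketch using plain tf.nn.dropout (TF 1.x; this illustrates dropout itself, not DropoutWrapper):

import tensorflow as tf

x = tf.ones([4])                      # a toy activation vector
y = tf.nn.dropout(x, keep_prob=0.5)   # each unit is kept with probability 0.5
# Kept units are scaled by 1/keep_prob so the expected activation is
# unchanged; dropped units become 0. y might evaluate to [2., 0., 2., 0.].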
When applying dropout in an RNN, the recurrent connections are left alone: as the state is passed from time t-1 to time t, no dropout is applied to the memory along the way. Dropout is only applied within a single time step t, as information passes between stacked layers of cells, as shown in the figure below.

[Figure: dropout is applied on the vertical, layer-to-layer connections within each time step; recurrent connections across time steps carry no dropout]
In the figure above, the input x_{t-2} at time t-2 is first fed into the first-layer cell, and dropout is applied in that step; but as the first-layer cell's state travels from time t-2 to the first-layer cells at t-1, t, and t+1, no dropout is applied anywhere in between. When the first-layer cell at time t+1 then passes its output to the cells above it within the same time step, dropout appears again.

When using tf.nn.rnn_cell.DropoutWrapper, there are likewise several parameters, e.g. input_keep_prob and output_keep_prob, which control the dropout probabilities on the inputs and outputs respectively; these are easy to understand.

As the official documentation shows, the wrapper has input_keep_prob and output_keep_prob. That is, once the cell is wrapped in a DropoutWrapper: if you want part of the input information dropped as the input enters this cell, set input_keep_prob, and only part of the input will reach the cell; if you want only part of this cell's output to serve as the input to the next layer's cell, set output_keep_prob.

Note: dropout only happens between layers (input layer to LSTM layer 1, LSTM layer 1 to LSTM layer 2); within the same layer, there is no dropout between time step T and time step T+1. A sketch of this pattern follows below.
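Putting this together, the common TF 1.x pattern is to wrap each layer in its own DropoutWrapper before stacking, so dropout acts only on the vertical connections between layers while each layer's recurrent connections across time remain intact (layer count and sizes below are placeholders):

import tensorflow as tf

num_layers, num_units, input_depth = 2, 128, 64
keep_prob = 0.8

# One DropoutWrapper per layer: output dropout on layer l becomes
# input dropout for layer l+1; nothing is dropped along the time axis.
cells = [tf.nn.rnn_cell.DropoutWrapper(
             tf.nn.rnn_cell.LSTMCell(num_units),
             output_keep_prob=keep_prob)
         for _ in range(num_layers)]
stacked = tf.nn.rnn_cell.MultiRNNCell(cells)

inputs = tf.placeholder(tf.float32, [None, None, input_depth])  # [batch, time, depth]
outputs, final_state = tf.nn.dynamic_rnn(stacked, inputs, dtype=tf.float32)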
 
