TensorFlow Learning

Posted by 段智華 on 2018-07-18

The basic architecture of TensorFlow:

Data format (Protobuf) definitions

tensor.proto source:

syntax = "proto3";

package tensorflow;
option cc_enable_arenas = true;
option java_outer_classname = "TensorProtos";
option java_multiple_files = true;
option java_package = "org.tensorflow.framework";
option go_package = "github.com/tensorflow/tensorflow/tensorflow/go/core/framework";
import "tensorflow/core/framework/resource_handle.proto";
import "tensorflow/core/framework/tensor_shape.proto";
import "tensorflow/core/framework/types.proto";

// Protocol buffer representing a tensor.
message TensorProto {
  DataType dtype = 1;

  // Shape of the tensor.  TODO(touts): sort out the 0-rank issues.
  TensorShapeProto tensor_shape = 2;

  // Only one of the representations below is set, one of "tensor_contents" and
  // the "xxx_val" attributes.  We are not using oneof because as oneofs cannot
  // contain repeated fields it would require another extra set of messages.

  // Version number.
  //
  // In version 0, if the "repeated xxx" representations contain only one
  // element, that element is repeated to fill the shape.  This makes it easy
  // to represent a constant Tensor with a single value.
  int32 version_number = 3;

  // Serialized raw tensor content from either Tensor::AsProtoTensorContent or
  // memcpy in tensorflow::grpc::EncodeTensorToByteBuffer. This representation
  // can be used for all tensor types. The purpose of this representation is to
  // reduce serialization overhead during RPC call by avoiding serialization of
  // many repeated small items.
  bytes tensor_content = 4;

  // Type specific representations that make it easy to create tensor protos in
  // all languages.  Only the representation corresponding to "dtype" can
  // be set.  The values hold the flattened representation of the tensor in
  // row major order.

  // DT_HALF, DT_BFLOAT16. Note that since protobuf has no int16 type, we'll
  // have some pointless zero padding for each value here.
  repeated int32 half_val = 13 [packed = true];

  // DT_FLOAT.
  repeated float float_val = 5 [packed = true];

  // DT_DOUBLE.
  repeated double double_val = 6 [packed = true];

  // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
  repeated int32 int_val = 7 [packed = true];

  // DT_STRING
  repeated bytes string_val = 8;

  // DT_COMPLEX64. scomplex_val(2*i) and scomplex_val(2*i+1) are real
  // and imaginary parts of i-th single precision complex.
  repeated float scomplex_val = 9 [packed = true];

  // DT_INT64
  repeated int64 int64_val = 10 [packed = true];

  // DT_BOOL
  repeated bool bool_val = 11 [packed = true];

  // DT_COMPLEX128. dcomplex_val(2*i) and dcomplex_val(2*i+1) are real
  // and imaginary parts of i-th double precision complex.
  repeated double dcomplex_val = 12 [packed = true];

  // DT_RESOURCE
  repeated ResourceHandleProto resource_handle_val = 14;

  // DT_VARIANT
  repeated VariantTensorDataProto variant_val = 15;

  // DT_UINT32
  repeated uint32 uint32_val = 16 [packed = true];

  // DT_UINT64
  repeated uint64 uint64_val = 17 [packed = true];
};

// Protocol buffer representing the serialization format of DT_VARIANT tensors.
message VariantTensorDataProto {
  // Name of the type of objects being serialized.
  string type_name = 1;
  // Portions of the object that are not Tensors.
  bytes metadata = 2;
  // Tensors contained within objects being serialized.
  repeated TensorProto tensors = 3;
}
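
You rarely need to populate a TensorProto by hand; the Python client provides helpers that fill in these fields for you. A minimal sketch, assuming TF 1.x's tf.make_tensor_proto and tf.make_ndarray:

import numpy as np
import tensorflow as tf

# Serialize a numpy array into a TensorProto message.
proto = tf.make_tensor_proto(np.arange(6.0, dtype=np.float32).reshape(2, 3))
print(proto.dtype)         # the DataType enum value for DT_FLOAT
print(proto.tensor_shape)  # the TensorShapeProto "tensor_shape" field
# numpy inputs are packed into the compact "tensor_content" bytes field.
print(len(proto.tensor_content))  # 24 bytes: six float32 values
# Round-trip back to a numpy array.
print(tf.make_ndarray(proto))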

graph.proto source:

syntax = "proto3";

package tensorflow;
option cc_enable_arenas = true;
option java_outer_classname = "GraphProtos";
option java_multiple_files = true;
option java_package = "org.tensorflow.framework";
option go_package = "github.com/tensorflow/tensorflow/tensorflow/go/core/framework";
import "tensorflow/core/framework/node_def.proto";
import "tensorflow/core/framework/function.proto";
import "tensorflow/core/framework/versions.proto";

// Represents the graph of operations
message GraphDef {
  repeated NodeDef node = 1;

  // Compatibility versions of the graph.  See core/public/version.h for version
  // history.  The GraphDef version is distinct from the TensorFlow version, and
  // each release of TensorFlow will support a range of GraphDef versions.
  VersionDef versions = 4;

  // Deprecated single version field; use versions above instead.  Since all
  // GraphDef changes before "versions" was introduced were forward
  // compatible, this field is entirely ignored.
  int32 version = 3 [deprecated = true];

  // EXPERIMENTAL. DO NOT USE OR DEPEND ON THIS YET.
  //
  // "library" provides user-defined functions.
  //
  // Naming:
  //   * library.function.name are in a flat namespace.
  //     NOTE: We may need to change it to be hierarchical to support
  //     different orgs. E.g.,
  //     { "/google/nn", { ... }},
  //     { "/google/vision", { ... }}
  //     { "/org_foo/module_bar", { ... }}
  //     map<string, FunctionDefLib> named_lib;
  //   * If node[i].op is the name of one function in "library",
  //     node[i] is deemed as a function call. Otherwise, node[i].op
  //     must be a primitive operation supported by the runtime.
  //
  //
  // Function call semantics:
  //
  //   * The callee may start execution as soon as some of its inputs
  //     are ready. The caller may want to use Tuple() mechanism to
  //     ensure all inputs are ready in the same time.
  //
  //   * The consumer of return values may start executing as soon as
  //     the return values the consumer depends on are ready.  The
  //     consumer may want to use Tuple() mechanism to ensure the
  //     consumer does not start until all return values of the callee
  //     function are ready.
  FunctionDefLibrary library = 2;
};
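
To see a GraphDef in practice, you can dump the default graph from the Python client. A minimal sketch, assuming TF 1.x:

import tensorflow as tf

a = tf.constant(1.0, name="a")
b = tf.constant(2.0, name="b")
c = tf.add(a, b, name="c")

# as_graph_def() returns the GraphDef protocol buffer defined above.
graph_def = tf.get_default_graph().as_graph_def()
for node in graph_def.node:  # the repeated NodeDef field
    print(node.name, node.op)
# Prints: a Const, b Const, c Add (one line per NodeDef)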

 

op_def.proto definition:

syntax = "proto3";

package tensorflow;
option cc_enable_arenas = true;
option java_outer_classname = "OpDefProtos";
option java_multiple_files = true;
option java_package = "org.tensorflow.framework";
option go_package = "github.com/tensorflow/tensorflow/tensorflow/go/core/framework";
import "tensorflow/core/framework/attr_value.proto";
import "tensorflow/core/framework/types.proto";

// Defines an operation. A NodeDef in a GraphDef specifies an Op by
// using the "op" field which should match the name of a OpDef.
// LINT.IfChange
message OpDef {
  // Op names starting with an underscore are reserved for internal use.
  // Names should be CamelCase and match the regexp "[A-Z][a-zA-Z0-9_]*".
  string name = 1;

  // For describing inputs and outputs.
  message ArgDef {
    // Name for the input/output.  Should match the regexp "[a-z][a-z0-9_]*".
    string name = 1;

    // Human readable description.
    string description = 2;

    // Describes the type of one or more tensors that are accepted/produced
    // by this input/output arg.  The only legal combinations are:
    // * For a single tensor: either the "type" field is set or the
    //   "type_attr" field is set to the name of an attr with type "type".
    // * For a sequence of tensors with the same type: the "number_attr"
    //   field will be set to the name of an attr with type "int", and
    //   either the "type" or "type_attr" field will be set as for
    //   single tensors.
    // * For a sequence of tensors, the "type_list_attr" field will be set
    //   to the name of an attr with type "list(type)".
    DataType type = 3;
    string type_attr = 4;    // if specified, attr must have type "type"
    string number_attr = 5;  // if specified, attr must have type "int"
    // If specified, attr must have type "list(type)", and none of
    // type, type_attr, and number_attr may be specified.
    string type_list_attr = 6;

    // For inputs: if true, the inputs are required to be refs.
    //   By default, inputs can be either refs or non-refs.
    // For outputs: if true, outputs are refs, otherwise they are not.
    bool is_ref = 16;
  };

  // Description of the input(s).
  repeated ArgDef input_arg = 2;

  // Description of the output(s).
  repeated ArgDef output_arg = 3;

  // Description of the graph-construction-time configuration of this
  // Op.  That is to say, this describes the attr fields that will
  // be specified in the NodeDef.
  message AttrDef {
    // A descriptive name for the argument.  May be used, e.g. by the
    // Python client, as a keyword argument name, and so should match
    // the regexp "[a-z][a-z0-9_]+".
    string name = 1;

    // One of the type names from attr_value.proto ("string", "list(string)",
    // "int", etc.).
    string type = 2;

    // A reasonable default for this attribute if the user does not supply
    // a value.  If not specified, the user must supply a value.
    AttrValue default_value = 3;

    // Human-readable description.
    string description = 4;

    // TODO(josh11b): bool is_optional?

    // --- Constraints ---
    // These constraints are only in effect if specified.  Default is no
    // constraints.

    // For type == "int", this is a minimum value.  For "list(___)"
    // types, this is the minimum length.
    bool has_minimum = 5;
    int64 minimum = 6;

    // The set of allowed values.  Has type that is the "list" version
    // of the "type" field above (uses the "list" field of AttrValue).
    // If type == "type" or "list(type)" above, then the "type" field
    // of "allowed_values.list" has the set of allowed DataTypes.
    // If type == "string" or "list(string)", then the "s" field of
    // "allowed_values.list" has the set of allowed strings.
    AttrValue allowed_values = 7;
  }
  repeated AttrDef attr = 4;

  // Optional deprecation based on GraphDef versions.
  OpDeprecation deprecation = 8;

  // One-line human-readable description of what the Op does.
  string summary = 5;

  // Additional, longer human-readable description of what the Op does.
  string description = 6;

  // -------------------------------------------------------------------------
  // Which optimizations this operation can participate in.

  // True if the operation is commutative ("op(a,b) == op(b,a)" for all inputs)
  bool is_commutative = 18;

  // If is_aggregate is true, then this operation accepts N >= 2
  // inputs and produces 1 output all of the same type.  Should be
  // associative and commutative, and produce output with the same
  // shape as the input.  The optimizer may replace an aggregate op
  // taking input from multiple devices with a tree of aggregate ops
  // that aggregate locally within each device (and possibly within
  // groups of nearby devices) before communicating.
  // TODO(josh11b): Implement that optimization.
  bool is_aggregate = 16;  // for things like add

  // Other optimizations go here, like
  //   can_alias_input, rewrite_when_output_unused, partitioning_strategy, etc.

  // -------------------------------------------------------------------------
  // Optimization constraints.

  // Ops are marked as stateful if their behavior depends on some state beyond
  // their input tensors (e.g. variable reading op) or if they have
  // a side-effect (e.g. printing or asserting ops). Equivalently, stateless ops
  // must always produce the same output for the same input and have
  // no side-effects.
  //
  // By default Ops may be moved between devices.  Stateful ops should
  // either not be moved, or should only be moved if that state can also
  // be moved (e.g. via some sort of save / restore).
  // Stateful ops are guaranteed to never be optimized away by Common
  // Subexpression Elimination (CSE).
  bool is_stateful = 17;  // for things like variables, queue

  // -------------------------------------------------------------------------
  // Non-standard options.

  // By default, all inputs to an Op must be initialized Tensors.  Ops
  // that may initialize tensors for the first time should set this
  // field to true, to allow the Op to take an uninitialized Tensor as
  // input.
  bool allows_uninitialized_input = 19;  // for Assign, etc.
};
// LINT.ThenChange(
//     https://www.tensorflow.org/code/tensorflow/core/framework/op_def_util.cc)

// Information about version-dependent deprecation of an op
message OpDeprecation {
  // First GraphDef version at which the op is disallowed.
  int32 version = 1;

  // Explanation of why it was deprecated and what to use instead.
  string explanation = 2;
};

// A collection of OpDefs
message OpList {
  repeated OpDef op = 1;
};
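
Every operation in a graph carries a reference to its registered OpDef, which you can inspect from the Python client. A minimal sketch, assuming TF 1.x:

import tensorflow as tf

c = tf.add(tf.constant(1.0), tf.constant(2.0))

# The Operation's op_def property returns the OpDef message defined above.
op_def = c.op.op_def
print(op_def.name)                             # Add
print([arg.name for arg in op_def.input_arg])  # the input ArgDefs, e.g. ['x', 'y']
print([attr.name for attr in op_def.attr])     # the AttrDefs, e.g. ['T']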

Hello, TensorFlow

A beginner-level, getting started, basic introduction to TensorFlow

TensorFlow is a general-purpose system for graph-based computation. A typical use is machine learning. In this notebook, we'll introduce the basic concepts of TensorFlow using some simple examples.

TensorFlow gets its name from tensors, which are arrays of arbitrary dimensionality. A vector is a 1-d array and is known as a 1st-order tensor. A matrix is a 2-d array and a 2nd-order tensor. The "flow" part of the name refers to computation flowing through a graph. Training and inference in a neural network, for example, involves the propagation of matrix computations through many nodes in a computational graph.

When you think of doing things in TensorFlow, you might want to think of creating tensors (like matrices), adding operations (that output other tensors), and then executing the computation (running the computational graph). In particular, it's important to realize that when you add an operation on tensors, it doesn't execute immediately. Rather, TensorFlow waits for you to define all the operations you want to perform. Then, TensorFlow optimizes the computation graph, deciding how to execute the computation, before generating the data. Because of this, a tensor in TensorFlow isn't so much holding the data as a placeholder for holding the data, waiting for the data to arrive when a computation is executed.

Adding two vectors in TensorFlow

Let's start with something that should be simple. Let's add two length four vectors (two 1st-order tensors):

[1. 1. 1. 1.] + [2. 2. 2. 2.] = [3. 3. 3. 3.]

In [3]:

from __future__ import print_function

import tensorflow as tf
with tf.Session():
    input1 = tf.constant([1.0, 1.0, 1.0, 1.0])
    input2 = tf.constant([2.0, 2.0, 2.0, 2.0])
    output = tf.add(input1, input2)
    result = output.eval()
    print("result: ", result)

result:  [3. 3. 3. 3.]

What we're doing is creating two vectors, [1.0, 1.0, 1.0, 1.0] and [2.0, 2.0, 2.0, 2.0], and then adding them. Here's equivalent code in raw Python and using numpy:

In [0]:

 

print([x + y for x, y in zip([1.0] * 4, [2.0] * 4)])
[3.0, 3.0, 3.0, 3.0]

In [0]:

 

import numpy as np
x, y = np.full(4, 1.0), np.full(4, 2.0)
print("{} + {} = {}".format(x, y, x + y))
[ 1.  1.  1.  1.] + [ 2.  2.  2.  2.] = [ 3.  3.  3.  3.]

Details of adding two vectors in TensorFlow

The example above of adding two vectors involves a lot more than it seems, so let's look at it in more depth.

import tensorflow as tf

This import brings TensorFlow's public API into our IPython runtime environment.

with tf.Session():

When you run an operation in TensorFlow, you need to do it in the context of a Session. A session holds the computation graph, which contains the tensors and the operations. When you create tensors and operations, they are not executed immediately, but wait for other operations and tensors to be added to the graph, only executing when finally requested to produce the results of the session. Deferring the execution like this provides additional opportunities for parallelism and optimization, as TensorFlow can decide how to combine operations and where to run them after TensorFlow knows about all the operations.

input1 = tf.constant([1.0, 1.0, 1.0, 1.0])

input2 = tf.constant([2.0, 2.0, 2.0, 2.0])

The next two lines create tensors using a convenience function called constant, which is similar to numpy's array and numpy's full. If you look at the code for constant, you can see the details of what it is doing to create the tensor. In summary, it creates a tensor of the necessary shape and applies the constant operator to it to fill it with the provided values. The values to constant can be Python or numpy arrays. constant can take an optional shape parameter, which works similarly to numpy's fill if provided, and an optional name parameter, which can be used to put a more human-readable label on the operation in the TensorFlow operation graph.
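
For instance, here is a small sketch of those optional parameters (the values are made up for illustration):

import tensorflow as tf

with tf.Session():
    # shape works like numpy's fill: one value copied out across the shape.
    sevens = tf.constant(7.0, shape=[2, 2], name="sevens")
    print(sevens.eval())  # [[7. 7.] [7. 7.]]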

output = tf.add(input1, input2)

You might think add just adds the two vectors now, but it doesn't quite do that. What it does is put the add operation into the computational graph. The results of the addition aren't available yet. They've been put in the computation graph, but the computation graph hasn't been executed yet.

result = output.eval()

print("result: ", result)

eval() is also slightly more complicated than it looks. Yes, it does get the value of the vector (tensor) that results from the addition. It returns this as a numpy array, which can then be printed. But, it's important to realize it also runs the computation graph at this point, because we demanded the output from the operation node of the graph; to produce that, it had to run the computation graph. So, this is the point where the addition is actually performed, not when add was called, as add just put the addition operation into the TensorFlow computation graph.
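
As an aside, output.eval() is shorthand for running the graph through the session; a minimal sketch of the equivalence, assuming TF 1.x:

import tensorflow as tf

with tf.Session() as sess:
    output = tf.add(tf.constant([1.0]), tf.constant([2.0]))
    # Both lines execute the graph and return a numpy array.
    print(output.eval())     # eval() uses the default session
    print(sess.run(output))  # the explicit, equivalent form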

Multiple operations

To use TensorFlow, you add operations on tensors that produce tensors to the computation graph, then execute that graph to run all those operations and calculate the values of all the tensors in the graph.

Here's a simple example with two operations:

In [0]:

 

import tensorflow as tf
with tf.Session():
    input1 = tf.constant(1.0, shape=[4])
    input2 = tf.constant(2.0, shape=[4])
    input3 = tf.constant(3.0, shape=[4])
    output = tf.add(tf.add(input1, input2), input3)
    result = output.eval()
    print(result)
[ 6.  6.  6.  6.]

This version uses constant in a way similar to numpy's fill, specifying the optional shape and having the values copied out across it.

The add operator supports operator overloading, so you could try writing it inline as input1 + input2 instead, as well as experimenting with other operators.

In [0]:

 

with tf.Session():
    input1 = tf.constant(1.0, shape=[4])
    input2 = tf.constant(2.0, shape=[4])
    output = input1 + input2
    print(output.eval())
[ 3.  3.  3.  3.]

Adding two matrices

Next, let's do something very similar, adding two matrices:

[1. 1. 1.]   [1. 2. 3.]   [2. 3. 4.]
[1. 1. 1.] + [4. 5. 6.] = [5. 6. 7.]

In [0]:

 

import tensorflow as tf
import numpy as np
with tf.Session():
    input1 = tf.constant(1.0, shape=[2, 3])
    input2 = tf.constant(np.reshape(np.arange(1.0, 7.0, dtype=np.float32), (2, 3)))
    output = tf.add(input1, input2)
    print(output.eval())
[[ 2.  3.  4.]
 [ 5.  6.  7.]]

Recall that you can pass numpy or Python arrays into constant.

In this example, the matrix with values from 1 to 6 is created in numpy and passed into constant, but TensorFlow also has range, reshape, and to_float operators. Doing this entirely within TensorFlow could be more efficient if this was a very large matrix.

Try experimenting with this code a bit -- maybe modifying some of the values, using the numpy version, adding another operation, or doing this entirely with TensorFlow's range function, as sketched below.
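
Here is that all-TensorFlow variant, a sketch assuming TF 1.x's tf.range and tf.reshape:

import tensorflow as tf

with tf.Session():
    input1 = tf.constant(1.0, shape=[2, 3])
    # Build the 1..6 matrix entirely in TensorFlow; no numpy needed.
    input2 = tf.reshape(tf.range(1.0, 7.0), [2, 3])
    print(tf.add(input1, input2).eval())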

Multiplying matrices

Let's move on to matrix multiplication. This time, let's use a bit vector and some random values, which is a good step toward some of what we'll need to do for regression and neural networks.

In [0]:

 

#@test {"output": "ignore"}
import tensorflow as tf
import numpy as np
with tf.Session():
    input_features = tf.constant(np.reshape([1, 0, 0, 1], (1, 4)).astype(np.float32))
    weights = tf.constant(np.random.randn(4, 2).astype(np.float32))
    output = tf.matmul(input_features, weights)
    print("Input:")
    print(input_features.eval())
    print("Weights:")
    print(weights.eval())
    print("Output:")
    print(output.eval())
Input:
[[ 1.  0.  0.  1.]]
Weights:
[[ 0.3949919  -0.83823347]
 [ 0.25941893 -1.58861065]
 [-1.11733329 -0.60435963]
 [ 1.04782867  0.18336453]]
Output:
[[ 1.44282055 -0.65486896]]

Above, we're taking a 1 x 4 vector [1 0 0 1] and multiplying it by a 4 by 2 matrix full of random values from a normal distribution (mean 0, stdev 1). The output is a 1 x 2 matrix.

You might try modifying this example. Running the cell multiple times will generate new random weights and a new output. Or, change the input, e.g., to [0 0 0 1], and run the cell again. Or, try initializing the weights using the TensorFlow op random_normal instead of using numpy to generate the random weights.

What we have here is the basics of a simple neural network already. If we were reading in input features along with some expected output, and changing the weights based on the output error each time, that would be a neural network.
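
To make that concrete, here is a minimal sketch of a few gradient-descent steps on this same setup, assuming TF 1.x and a made-up target output:

import tensorflow as tf
import numpy as np

with tf.Session() as sess:
    input_features = tf.constant(np.reshape([1, 0, 0, 1], (1, 4)).astype(np.float32))
    target = tf.constant([[1.0, 0.0]])  # hypothetical expected output
    weights = tf.Variable(tf.random_normal([4, 2]))
    output = tf.matmul(input_features, weights)
    loss = tf.reduce_mean(tf.square(output - target))  # squared error
    # Each step nudges the weights to reduce the error.
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
    tf.global_variables_initializer().run()
    for _ in range(5):
        _, current_loss = sess.run([train_step, loss])
        print("loss:", current_loss)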

Use of variables

Let's look at adding two small matrices in a loop, not by creating new tensors every time, but by updating the existing values and then re-running the computation graph on the new data. This happens a lot with machine learning models, where we change some parameters each time, such as during gradient descent on some weights, and then perform the same computations over and over again.

In [0]:

 

#@test {"output": "ignore"}
import tensorflow as tf
import numpy as np
with tf.Session() as sess:
    # Set up two variables, total and weights, that we'll change repeatedly.
    total = tf.Variable(tf.zeros([1, 2]))
    weights = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
    # Initialize the variables we defined above.
    tf.global_variables_initializer().run()
    # This only adds the operators to the graph right now. The assignment
    # and addition operations are not performed yet.
    update_weights = tf.assign(weights, tf.random_uniform([1, 2], -1.0, 1.0))
    update_total = tf.assign(total, tf.add(total, weights))

    for _ in range(5):
        # Actually run the operation graph, so randomly generate weights and then
        # add them into the total. Order does matter here. We need to update
        # the weights before updating the total.
        sess.run(update_weights)
        sess.run(update_total)

        print(weights.eval(), total.eval())
[[ -7.29560852e-05   8.01583767e-01]] [[ -7.29560852e-05   8.01583767e-01]]
[[ 0.64477301 -0.03944111]] [[ 0.64470005  0.76214266]]
[[-0.07470274 -0.76814342]] [[ 0.56999731 -0.00600076]]
[[-0.34230471 -0.42372179]] [[ 0.2276926  -0.42972255]]
[[ 0.67873812  0.65932178]] [[ 0.90643072  0.22959924]]

This is more complicated. At a high level, we create two variables and add operations over them, then, in a loop, repeatedly execute those operations. Let's walk through it step by step.

Starting off, the code creates two variables, total and weights. total is initialized to [0, 0] and weights is initialized to random values between -1 and 1.

Next, two assignment operators are added to the graph, one that updates weights with random values from [-1, 1], the other that updates the total with the new weights. Again, the operators are not executed here. In fact, this isn't even inside the loop. We won't execute these operations until the run calls inside the loop.

Finally, in the for loop, we run each of the operators. In each iteration of the loop, this executes the operators we added earlier, first putting random values into the weights, then updating the total with the new weights. These calls use run on the session; the code could also have called eval on the operators (e.g., update_weights.eval()).

It can be a little hard to wrap your head around exactly what computation is done when. The important thing to remember is that computation is only performed on demand.

Variables can be useful in cases where you have a large amount of computation and data that you want to use over and over again with just a minor change to the input each time. That happens quite a bit with neural networks, for example, where you just want to update the weights each time you go through the batches of input data, then run the same operations over again.
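
Placeholders pair naturally with this pattern: build the graph once, then feed a slightly different input on each run. A minimal sketch, assuming TF 1.x's tf.placeholder:

import tensorflow as tf

with tf.Session() as sess:
    x = tf.placeholder(tf.float32, shape=[1, 2])
    total = tf.Variable(tf.zeros([1, 2]))
    update_total = tf.assign(total, total + x)
    tf.global_variables_initializer().run()
    # Re-run the same graph with a different input each time.
    for i in range(3):
        print(sess.run(update_total, feed_dict={x: [[i, i]]}))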

What's next?

This has been a gentle introduction to TensorFlow, focused on what TensorFlow is and the very basics of doing anything in TensorFlow. If you'd like more, the next tutorial in the series is Getting Started with TensorFlow, also available in the notebooks directory.

TensorFlow TensorBoard debugging:
