Many internet articles cover how to set up the training of a neural network in Python/Keras, but what if we want to build our own neural network, setting the weights manually? Or what if we want to inspect the output of the hidden layers of our model? This blog post goes into both topics.
Neural networks are typically represented by graphs in which the input of each neuron is multiplied by a number (a weight) shown on the edge. Every layer also has an additional input neuron whose value is always one and which is likewise multiplied by a weight (the bias). After the multiplications, all inputs are summed and passed through a function (the activation function) that gives the output of the neuron. Finding the weights and biases is the task of training a neural network; in this article, however, we already have the weights and want Keras to use them.
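To make the computation of a single neuron concrete, here is a minimal NumPy sketch (the neuron and relu names are just for this illustration, not part of Keras):

import numpy as np

def neuron(x, w, b, activation):
    # weighted sum of the inputs plus the bias, passed through the activation
    return activation(np.dot(w, x) + b)

def relu(z):
    return np.maximum(0, z)

neuron(np.array([1, 0]), np.array([1, 1]), -1, relu)  # 1*1 + 1*0 - 1 = 0; relu(0) = 0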
We will create a neural network that implements the XOR operation, such as the one shown in Section 6.1 of the deep learning book [1]; the weights and biases of that network are shown in the figure.
We will define a neural network following the standard Keras API. The code below defines a neural network and adds four layers to it (in Keras, the activation is implemented as a layer). The first layer (the orange neurons in the figure) takes two inputs and has two neurons; a rectified linear unit (ReLU) is then used as the activation function. The output layer has one neuron with two inputs (the input size does not have to be declared; Keras infers it from the output of the previous layer), followed by a linear activation function that gives the final output.
import numpy as np
import keras

model = keras.models.Sequential()
model.add(keras.layers.Dense(2, input_shape=(2,)))  # hidden layer
model.add(keras.layers.Activation('relu'))          # hidden layer ReLU
model.add(keras.layers.Dense(1))                    # output layer
model.add(keras.layers.Activation('linear'))        # output layer linear activation
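If you want to double-check the structure at this point, model.summary() prints the four layers; the parameter counts follow from the shapes discussed next (2×2 weights + 2 biases = 6 for the hidden Dense layer, 2×1 weights + 1 bias = 3 for the output one).

model.summary()  # four layers: Dense, Activation, Dense, Activation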
Keras expects the weights of a layer as a matrix in which each column corresponds to a neuron and each row to one of the neuron's inputs, plus a separate vector holding one bias per neuron. In this model's hidden layer we have to define two weights and one bias for each of the two neurons; for the output layer, two weights and one bias for the single output neuron. Therefore, the hidden layer needs a 2×2 matrix and a vector of length 2, and the output layer a 2×1 matrix and a vector of length 1.
w1 = np.zeros((2, 2))  # two inputs for each of the two hidden neurons
b1 = np.zeros((2,))    # one bias per hidden neuron
w2 = np.zeros((2, 1))  # two inputs for the single output neuron
b2 = np.zeros((1,))    # one bias for the output neuron
Afterward, we set the individual weights and biases according to the figure.
w1[:, 0] = 1   # the weights for the first hidden neuron are all 1
b1[0] = 0      # bias for the first hidden neuron
w1[:, 1] = 1   # the weights for the second hidden neuron are all 1
b1[1] = -1     # bias for the second hidden neuron
w2[0, 0] = 1   # weight for the first input of the output neuron
w2[1, 0] = -2  # weight for the second input of the output neuron
b2[0] = 0      # bias for the output neuron
Finally, Keras expects the weights and biases of each layer to be assigned sequentially, in a single list.
model.set_weights([w1, b1, w2, b2])
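As a quick sanity check, we can read the weights back from the model; the shapes should match the arrays we just built:

[w.shape for w in model.get_weights()]
# [(2, 2), (2,), (2, 1), (1,)]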
That is all! We have a neural network that implements the XOR operation. We can feed it all possible inputs to compute the truth table and validate our model.
x = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1],
])
model.predict(x)
The code should output the array below, which corresponds to the correct result of the XOR operation for each input pair.
array([[0.],
       [1.],
       [1.],
       [0.]], dtype=float32)
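Because all the weights are integers, the outputs are exact, so one way to validate the whole truth table is to compare it directly against NumPy's own XOR (the expected variable is just for this check):

expected = np.logical_xor(x[:, 0], x[:, 1]).astype('float32').reshape(-1, 1)
np.array_equal(model.predict(x), expected)  # True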
One last curiosity is looking into the intermediate layers, i.e., the values computed by each layer we added with model.add in the first step. The code below accesses the output of the first Keras layer (the hidden Dense layer), which is the input of the hidden activation function (the second Keras layer).
hidden_layers = keras.backend.function(
    [model.layers[0].input],   # we will feed the function with the input of the first layer
    [model.layers[0].output],  # we want to get the output of the first layer
)
hidden_layers([x])
Running this code outputs the data below: for the first neuron, the sum of the inputs; for the second neuron, the same sum minus 1 (its bias), as indicated in the first figure.
[array([[ 0., -1.],
        [ 1.,  0.],
        [ 1.,  0.],
        [ 2.,  1.]], dtype=float32)]
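We can reproduce these pre-activation values by hand with the arrays defined earlier, which makes explicit what the Dense layer computes:

x @ w1 + b1  # matrix product of the inputs with the hidden weights, plus the biases
# array([[ 0., -1.],
#        [ 1.,  0.],
#        [ 1.,  0.],
#        [ 2.,  1.]])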
The next layer is the activation function (ReLU), which filters out any negative value. This can be observed with the following snippet, which just changes the second parameter of keras.backend.function so that it returns the output of the second Keras layer (the hidden layer in the figure).
hidden_layers = keras.backend.function(
    [model.layers[0].input],   # we will feed the function with the input of the first layer
    [model.layers[1].output],  # we want to get the output of the second layer
)
hidden_layers([x])
The output, as expected, is the previously shown output with its negative values filtered out by the ReLU.
[array([[0., 0.],
        [1., 0.],
        [1., 0.],
        [2., 1.]], dtype=float32)]
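In fact, the whole forward pass is small enough to write out in NumPy; the sketch below mirrors the four layers and reproduces the result of model.predict(x):

hidden = np.maximum(0, x @ w1 + b1)  # Dense + ReLU
hidden @ w2 + b2                     # Dense + linear -> [[0.], [1.], [1.], [0.]]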
This closes out our inside look into Keras models.
[1] http://www.deeplearningbook.org/
[2] https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer
[3] https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model