.. vim: set fileencoding=utf-8 :
.. Tue 13 May 09:30:06 2014 CEST

.. testsetup:: *

   import numpy
   import bob.learn.mlp
   import tempfile
   import os

   current_directory = os.path.realpath(os.curdir)
   temp_dir = tempfile.mkdtemp(prefix='bob_doctest_')
   os.chdir(temp_dir)

====================================================
 Multi-Layer Perceptron (MLP) Machines and Trainers
====================================================

A multi-layer perceptron [1]_ (MLP) is a neural network architecture with some
well-defined characteristics, such as a feed-forward structure. You can create
a new MLP using one of the trainers described below. We start this tutorial by
showing how to actually use an MLP.

To instantiate a new (uninitialized) :py:class:`bob.learn.mlp.Machine`, pass a
shape descriptor as a :py:class:`tuple`. The first entry of the shape is the
input size and the last entry is the output size; the entries in between define
the number of neurons in each hidden layer of the MLP. For example, ``(3, 3, 1)``
defines an MLP with 3 inputs, a single hidden layer with 3 neurons and 1 output,
whereas a shape like ``(10, 5, 3, 2)`` defines an MLP with 10 inputs, 5 neurons
in the first hidden layer, 3 neurons in the second hidden layer and 2 outputs.
Here is an example:

.. doctest::

   >>> mlp = bob.learn.mlp.Machine((3, 3, 2, 1))

As it is, the network is uninitialized. For the sake of demonstrating how to use
MLPs, let's set the weights and biases manually (we would normally use a trainer
for this):

.. doctest::

   >>> input_to_hidden0 = numpy.ones((3,3), 'float64')
   >>> numpy.allclose(input_to_hidden0, [[ 1., 1., 1.], [ 1., 1., 1.], [ 1., 1., 1.]])
   True
   >>> hidden0_to_hidden1 = 0.5*numpy.ones((3,2), 'float64')
   >>> numpy.allclose(hidden0_to_hidden1, [[ 0.5, 0.5], [ 0.5, 0.5], [ 0.5, 0.5]])
   True
   >>> hidden1_to_output = numpy.array([0.3, 0.2], 'float64').reshape(2,1)
   >>> numpy.allclose(hidden1_to_output, [[ 0.3], [ 0.2]])
   True
   >>> bias_hidden0 = numpy.array([-0.2, -0.3, -0.1], 'float64')
   >>> bias_hidden0
   array([-0.2, -0.3, -0.1])
   >>> bias_hidden1 = numpy.array([-0.7, 0.2], 'float64')
   >>> bias_hidden1
   array([-0.7,  0.2])
   >>> bias_output = numpy.array([0.5], 'float64')
   >>> numpy.allclose(bias_output, [ 0.5])
   True
   >>> mlp.weights = (input_to_hidden0, hidden0_to_hidden1, hidden1_to_output)
   >>> mlp.biases = (bias_hidden0, bias_hidden1, bias_output)

At this point, a few things should be noted:

1. Weights should **always** be 2D arrays, even if they connect one neuron to
   many (or many to one). You can use the NumPy_ ``reshape()`` array method for
   this purpose, as shown above.
2. Biases should **always** be 1D arrays.
3. By default, MLPs use :py:class:`bob.learn.activation.HyperbolicTangent` as
   the activation function. There are currently 4 other activation functions
   available in |project|:

   * The identity function: :py:class:`bob.learn.activation.Identity`;
   * The sigmoid function (also known as the logistic function):
     :py:class:`bob.learn.activation.Logistic`;
   * A scaled version of the hyperbolic tangent function:
     :py:class:`bob.learn.activation.MultipliedHyperbolicTangent`; and
   * A scaled version of the identity activation:
     :py:class:`bob.learn.activation.Linear`.

Let's try changing all of the activation functions to a simpler one, just for
this example:

.. doctest::

   >>> mlp.hidden_activation = bob.learn.activation.Identity()
   >>> mlp.output_activation = bob.learn.activation.Identity()

Once the network weights and biases are set, we can feed forward an example
through this machine.
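Since both activations have just been set to
:py:class:`bob.learn.activation.Identity`, each layer of the forward pass
reduces to an affine transform ``numpy.dot(x, W) + b``, with the weight
matrices laid out as (inputs, outputs) like the arrays above. As a sanity
check, the following sketch reproduces the forward pass by hand, using plain
NumPy and the weights and biases defined earlier:

.. doctest::

   >>> x = numpy.array([0.1, -0.1, 0.2], 'float64')
   >>> h0 = numpy.dot(x, input_to_hidden0) + bias_hidden0     # first hidden layer (identity activation)
   >>> h1 = numpy.dot(h0, hidden0_to_hidden1) + bias_hidden1  # second hidden layer (identity activation)
   >>> y = numpy.dot(h1, hidden1_to_output) + bias_output     # output layer (identity activation)
   >>> numpy.allclose(y, [ 0.33])
   True

With the default hyperbolic tangent activation, each intermediate result above
would additionally be passed through the activation function.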
Feeding an example through the machine itself is done using the ``()``
operator, like for a :py:class:`bob.learn.linear.Machine`:

.. doctest::

   >>> numpy.allclose(mlp(numpy.array([0.1, -0.1, 0.2], 'float64')), [ 0.33])
   True

MLPs can be *trained* through backpropagation [2]_, which is a supervised
learning technique. This training procedure requires a set of features with
labels (or targets). Using |project|, this is passed to the ``train()`` method
of the available MLP trainers as two different 2D `NumPy`_ arrays, one for the
input (features) and one for the output (targets). The number of rows in these
two arrays should be equal to the batch size set when creating the trainer.

.. doctest::
   :options: +NORMALIZE_WHITESPACE

   >>> d0 = numpy.array([[.3, .7, .5]]) # input
   >>> t0 = numpy.array([[.0]]) # target

The class used to train an MLP [1]_ with backpropagation [2]_ is
:py:class:`bob.learn.mlp.BackProp`. An example is shown below.

.. doctest::
   :options: +NORMALIZE_WHITESPACE

   >>> trainer = bob.learn.mlp.BackProp(1, bob.learn.mlp.SquareError(mlp.output_activation), mlp, train_biases=False) # Creates a BackProp trainer with a batch size of 1
   >>> trainer.train(mlp, d0, t0) # Performs the Back Propagation

.. note::

   The second parameter of the trainer defines the cost function to be used for
   the training. You can use two different types of pre-programmed costs in
   |project|: :py:class:`bob.learn.mlp.SquareError`, like before, or
   :py:class:`bob.learn.mlp.CrossEntropyLoss` (normally in association with
   :py:class:`bob.learn.activation.Logistic`). You can also implement your own
   cost/loss functions, but to do so you must use our C/C++ API and then bind
   them to Python in your own package.

Backpropagation [2]_ requires a learning rate to be set. In the previous
example, the default value ``0.1`` was used. This can be changed using the
:py:attr:`bob.learn.mlp.BackProp.learning_rate` attribute.

Another training alternative exists, referred to as **resilient propagation**
(R-Prop) [3]_, which dynamically computes an optimal learning rate. The
corresponding class is :py:class:`bob.learn.mlp.RProp`, and the overall
training procedure remains identical.

.. doctest::
   :options: +NORMALIZE_WHITESPACE

   >>> trainer = bob.learn.mlp.RProp(1, bob.learn.mlp.SquareError(mlp.output_activation), mlp, train_biases=False)
   >>> trainer.train(mlp, d0, t0)

.. note::

   The trainers are **not** re-initialized when you call ``train()`` several
   times. This is done so as to allow you to implement your own stopping
   criteria. To reset an MLP trainer, use its ``reset`` method.

.. Place here your external references

.. include:: links.rst

.. [1] http://en.wikipedia.org/wiki/Multilayer_perceptron
.. [2] http://en.wikipedia.org/wiki/Backpropagation
.. [3] http://en.wikipedia.org/wiki/Rprop