Artificial neuron

An artificial neuron is the basic building block of artificial neural networks, a model from neuroinformatics motivated by biological neural networks. As a connectionist model, artificial neurons form a network, and such an artificial neural network can approximate arbitrarily complex functions and solve learning tasks and problems for which explicit modeling is difficult or impossible. Examples include facial and speech recognition.

Modeled on the biological nerve cell, an artificial neuron processes several inputs and responds according to its activation. To this end, the inputs are weighted and passed to an activation function that computes the neuron's activation. The neuron's behavior is generally determined by training with a learning method.


History

The beginnings of artificial neurons go back to Warren McCulloch and Walter Pitts in 1943. Using a simple model of a neural network, the McCulloch-Pitts cell, they showed that it can compute logical and arithmetic functions.

The Hebbian learning rule was described by Donald Hebb in 1949. Building on medical research by Santiago Ramón y Cajal, who proved the existence of synapses in 1911, the rule states that connections between neurons that are repeatedly active are strengthened. Generalizations of this rule are still used in today's learning methods.

An important contribution came in 1958 with the convergence theorem for the perceptron. Frank Rosenblatt showed that with the specified learning method, a perceptron can be taught any solution that is representable with this model.

However, in 1969 the critics Marvin Minsky and Seymour Papert showed that a single-layer perceptron cannot represent the XOR operation, because the XOR function is not linearly separable; only later models remedied this limitation. This demonstrated boundary of the model initially led to declining interest in research on artificial neural networks and to the withdrawal of research funding.

Interest in artificial neural networks revived only when John Hopfield made Hopfield networks known in 1985 and showed that they are able to solve optimization problems such as the traveling salesman problem. The work on the backpropagation method by David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams from 1986 onward likewise led to a revival of the study of these networks.

Today, these networks are used in many research areas.

Biological motivation

Artificial neurons are motivated by the nerve cells of mammals, which are specialized in receiving and processing signals. Via synapses, signals are forwarded electrically or chemically to other nerve cells or to effector cells (for example, for muscle contraction).

A nerve cell consists of a cell body, the dendrites and the axon. Dendrites are short, strongly branched cell processes that receive signals from other nerve cells or sensory cells. The axon functions as the signal output of the cell and can reach a length of up to 1 m. Signals pass between cells at the synapses, whose effect can be excitatory or inhibitory.

The dendrites of the nerve cell pass incoming electrical impulses on to the cell body. If the excitation exceeds a certain threshold value, the voltage discharges and propagates along the axon (all-or-nothing law).

The interconnection of these neurons forms the basis of the brain's mental capacity. According to estimates, the human central nervous system consists of 10¹⁰ to 10¹² nerve cells with an average of 10,000 connections each, so the human brain can hold more than 10¹⁴ connections. The action potential can propagate along the axon at speeds of up to 100 m/s.

A comparison with logic gates also reflects the efficiency of neurons. While gates switch in the nanosecond range (10⁻⁹ s) with an energy consumption of 10⁻⁶ joules (1991 data), nerve cells respond in the millisecond range (10⁻³ s) and consume only about 10⁻¹⁶ joules. Despite these apparently inferior values, computer-aided systems cannot come close to the capabilities of biological processing by nerve cells.

The performance of neural networks is also demonstrated by the 100-step rule: visual recognition in humans takes place in at most 100 parallel processing steps, whereas mostly sequentially operating computers achieve no comparable performance.

The advantages and properties of nerve cells motivate the model of the artificial neuron. Nevertheless, many models and algorithms for artificial neural networks lack a direct, plausible biological motivation; such motivation is found only in the abstract modeling of the nerve cell.

Modeling

Appropriate modeling makes biology usable as a template for an information-technology solution. Through rough generalization, the system is simplified while its essential characteristics are preserved.

The synapses of the nerve cell are mapped to the addition of weighted inputs, and the activation of the cell nucleus to an activation function with a threshold value. The use of an adder and a threshold value is already found in the McCulloch-Pitts cell of 1943.

Components

An artificial neuron can be described by four basic elements:

  • Weighting: the weights determine how strongly an input enters the computation; a negative weight gives the input an inhibitory effect.
  • Transfer function: the transfer function computes the network input of the neuron from the weighted inputs.
  • Activation function: the activation function determines the output of the neuron from the network input and the threshold value.
  • Threshold value: the threshold value shifts the network input; the neuron activates only once it is exceeded.

A connection graph additionally defines the interconnection of the neurons, that is, which outputs serve as inputs of other neurons.

Mathematical definition

In the literature, the artificial neuron model is usually introduced in the following way.

First, the network input of the artificial neuron is defined as

$$\mathrm{net} = \sum_{i=1}^{n} x_i w_i$$

and the activation $o$ is then defined by

$$o = \varphi(\mathrm{net} - \theta)$$

Here,

  • $n$ is the number of inputs,
  • $x_i$ is input $i$,
  • $w_i$ is the weight of input $i$,
  • $\varphi$ is the activation function, and
  • $\theta$ is the threshold value.
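To make the definition concrete, the following minimal Python sketch computes a neuron's output exactly as defined above; the helper name neuron_output and the example parameters are illustrative, not part of the model:

```python
def neuron_output(inputs, weights, threshold, activation):
    """Compute o = activation(net - threshold) with net = sum(x_i * w_i)."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return activation(net - threshold)

# Example: a hard-limit (threshold) activation function.
step = lambda z: 1 if z >= 0 else 0

# A neuron with weights (1, 1) and threshold 1.5 computes the conjunction.
print(neuron_output([1, 1], [1, 1], 1.5, step))  # 1
print(neuron_output([1, 0], [1, 1], 1.5, step))  # 0
```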

On-neuron

Alternatively, the threshold value can be represented by an additional input, a so-called on-neuron or bias, which has the constant value 1. The threshold value is then encoded as the weight of this input. Special treatment of the threshold can thus be omitted, which simplifies its handling in the learning rules.

In addition to the true inputs, the input of the on-neuron is now included in the calculation:

$$\mathrm{net} = \sum_{i=0}^{n} x_i w_i \quad \text{with } x_0 = 1 \text{ and } w_0 = -\theta$$

In the activation, a special treatment of the threshold value can thus be dispensed with:

$$o = \varphi(\mathrm{net})$$
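A short standalone sketch can illustrate that folding the threshold into a constant input yields the same activation as subtracting it explicitly (names and values are again illustrative):

```python
step = lambda z: 1 if z >= 0 else 0

def neuron_with_bias(inputs, weights, activation):
    """Threshold folded into the weights: o = activation(sum(x_i * w_i))."""
    net = sum(x * w for x, w in zip(inputs, weights))
    return activation(net)

theta = 1.5
# Explicit threshold ...
o1 = step(1 * 1 + 1 * 1 - theta)
# ... versus on-neuron: constant input x_0 = 1 carries the weight w_0 = -theta.
o2 = neuron_with_bias([1, 1, 1], [-theta, 1, 1], step)
assert o1 == o2 == 1
```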

Activation functions


Different types of functions can be used as the activation function, depending on the network topology. Such a function can be nonlinear, for example sigmoid, piecewise linear, or a step function. In general, activation functions are monotonically increasing.

Linear activation functions are severely limited, because a composition of linear functions can be rewritten, by arithmetic transformations, as a single linear function. They are therefore not suitable for multilayer networks and are used only in simple models.
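As a concrete check of this collapse, the following sketch uses NumPy (chosen here for convenience) to show that two stacked linear layers equal a single linear layer with the combined matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))   # first linear layer
W2 = rng.normal(size=(2, 4))   # second linear layer
x = rng.normal(size=3)

# Two linear layers applied in sequence ...
y_stacked = W2 @ (W1 @ x)
# ... equal a single linear layer with the combined matrix W2 @ W1.
y_single = (W2 @ W1) @ x
assert np.allclose(y_stacked, y_single)
```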

Examples of basic activation functions are:

  • Threshold function: The threshold function (English: hard limit), as defined below, takes only the values 0 and 1: the value 1 for inputs at or above the threshold value $\theta$, and 0 otherwise. When the threshold is used subtractively, the function activates only when the additional input exceeds the threshold. A neuron with such a function is also called a McCulloch-Pitts cell. It reflects the all-or-nothing property of the model:

$$f_{\mathrm{act}}(\mathrm{net}) = \begin{cases} 1 & \text{if } \mathrm{net} \geq \theta, \\ 0 & \text{otherwise.} \end{cases}$$

[Figure: a neuron with the threshold activation function]

  • Piecewise linear function: The piecewise linear function (English: piecewise linear) maps a limited interval linearly; the outer intervals are mapped to constant values, for example:

$$f_{\mathrm{act}}(\mathrm{net}) = \begin{cases} 0 & \text{if } \mathrm{net} \leq 0, \\ \mathrm{net} & \text{if } 0 < \mathrm{net} < 1, \\ 1 & \text{if } \mathrm{net} \geq 1. \end{cases}$$

[Figure: a neuron with the piecewise linear activation function]

  • Sigmoid function: Sigmoid functions are used very frequently as activation functions. As defined here, they have a variable slope parameter $c$ that influences the curvature of the function graph. A special property is their differentiability, which is needed for some methods such as the backpropagation algorithm:

$$f_{\mathrm{act}}(\mathrm{net}) = \frac{1}{1 + e^{-c \cdot \mathrm{net}}}$$

The values of the above functions lie in the interval $[0, 1]$. For the interval $[-1, 1]$, these functions can be defined analogously. [Figure: a neuron with the sigmoid activation function]
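For reference, the three example activation functions can be sketched in Python as follows; the slope parameter c of the sigmoid corresponds to the parameter named above:

```python
import math

def hard_limit(net, theta=0.0):
    """Threshold function: 1 from the threshold value onward, else 0."""
    return 1 if net >= theta else 0

def piecewise_linear(net):
    """Linear on (0, 1), clamped to 0 below and 1 above."""
    return min(1.0, max(0.0, net))

def sigmoid(net, c=1.0):
    """Logistic sigmoid with slope parameter c."""
    return 1.0 / (1.0 + math.exp(-c * net))

for f in (hard_limit, piecewise_linear, sigmoid):
    print(f.__name__, f(0.3))
```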

Examples

Representation of Boolean functions

Boolean functions can be represented with artificial neurons. The three functions conjunction (AND), disjunction (OR) and negation (NOT) can be represented using a threshold activation function, for example with the following weights and threshold values (inputs $x_i \in \{0, 1\}$):

  Function   Weights           Threshold
  AND        w1 = w2 = 1       1.5
  OR         w1 = w2 = 1       0.5
  NOT        w1 = -1           -0.5

For the conjunction, for example, one can see that only the Boolean input $x_1 = 1, x_2 = 1$ yields $\mathrm{net} = 2 \geq 1.5$ and thus the activation $o = 1$; all other inputs yield $o = 0$.
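A small script can verify these example parameters over all Boolean inputs; the helper threshold_neuron is illustrative:

```python
from itertools import product

def threshold_neuron(inputs, weights, theta):
    """Hard-limit neuron: fires 1 iff the weighted sum reaches theta."""
    return 1 if sum(x * w for x, w in zip(inputs, weights)) >= theta else 0

AND = lambda x: threshold_neuron(x, [1, 1], 1.5)
OR  = lambda x: threshold_neuron(x, [1, 1], 0.5)
NOT = lambda x: threshold_neuron(x, [-1], -0.5)

for x in product([0, 1], repeat=2):
    assert AND(x) == (x[0] and x[1])
    assert OR(x)  == (x[0] or x[1])
for x in ([0], [1]):
    assert NOT(x) == (1 - x[0])
```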

Teaching a neuron

Unlike in the previous example, where suitable weights were chosen by hand, neurons can learn the function to be represented. The weights and the threshold value are initially assigned random values and are then adjusted using a trial-and-error learning algorithm.

To learn the logical conjunction, the perceptron criterion function can be applied. It adds the values of misclassified inputs to the weights in order to improve recognition, until all possible inputs are classified correctly. As in the previous example, the activation function is the threshold function.

For the learning process, a learning rate, which determines the speed of learning, must also be chosen; here it is set to 1, so it can be omitted from the formulas.

Instead of specifying the threshold value as such, an on-neuron, that is, a constant input of 1, is added. The threshold value is then represented by its weight $w_0$.

In order to train the neuron on the two possible outputs $-1$ and $1$ of the conjunction, the inputs belonging to the output $-1$ are multiplied by $-1$. Through this step, the weighted sum is positive exactly when the input in question is classified correctly, which simplifies the analysis during training and the subsequent weight adjustment. After this step, the learning table looks as follows:

  x0   x1   x2
  -1    0    0
  -1    0   -1
  -1   -1    0
   1    1    1

The on-neuron input $x_0$ takes the value $-1$ in exactly those rows for which the neuron's output is $-1$ in the end.

For the initial situation, the weights are chosen randomly; as example values, take:

$$w_0 = 0.1, \quad w_1 = 0.6, \quad w_2 = -0.4$$

To test these weights, they are used in a neuron with three inputs and no separate threshold value. For the chosen weights, the weighted sums of the four table rows are:

$$-0.1, \quad 0.3, \quad -0.7, \quad 0.3$$

The first and third inputs are classified incorrectly, since their weighted sums are not positive. Now the perceptron criterion function is applied:

By adding the misclassified inputs to the corresponding weights, the weights become

$$w_0 = 0.1 - 1 - 1 = -1.9, \quad w_1 = 0.6 + 0 - 1 = -0.4, \quad w_2 = -0.4 + 0 + 0 = -0.4$$

Checking after this weight change shows that instead of the first and third inputs, the fourth input is now misclassified: the weighted sums of the rows are now $1.9$, $2.3$, $2.3$ and $-2.7$. Carrying out one further step of the learning method, adding the fourth row to the weights, improves the neuron's recognition:

$$w_0 = -0.9, \quad w_1 = 0.6, \quad w_2 = 0.6$$

Now one can see that the neuron has learned the given function and classifies all four inputs correctly: the weighted sums are $0.9$, $0.3$, $0.3$ and $0.3$, all positive.

Using the input $x_1 = 1, x_2 = 1$ and the chosen activation, it now follows that $\mathrm{net} = -0.9 + 0.6 + 0.6 = 0.3 > 0$ and thus $o = 1$.

For the other three inputs, which were multiplied by $-1$ for training, the original inputs now yield a negative weighted sum. For example, the input $x_1 = 0, x_2 = 1$ gives $\mathrm{net} = -0.9 + 0 + 0.6 = -0.3 < 0$, and thus the activation $o = -1$.

Without being given specific weights, the neuron has thus learned, purely from the training specifications, to represent the conjunction just as in the first example.
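The entire walkthrough can be reproduced with a short script. This is a minimal sketch of the perceptron criterion update with learning rate 1, using the example starting weights from above:

```python
# Learning table: rows already multiplied by -1 where the target output is -1.
rows = [(-1, 0, 0), (-1, 0, -1), (-1, -1, 0), (1, 1, 1)]
w = [0.1, 0.6, -0.4]  # example starting weights w0, w1, w2

while True:
    # A row is classified correctly iff its weighted sum is positive.
    wrong = [r for r in rows if sum(x * wi for x, wi in zip(r, w)) <= 0]
    if not wrong:
        break
    # Perceptron criterion: add all misclassified rows to the weights.
    for r in wrong:
        w = [wi + x for wi, x in zip(w, r)]
    print("updated weights:", w)

print("final weights:", w)  # approximately [-0.9, 0.6, 0.6] for these start values
```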
