A neural network is generally an electronic system (usually implemented in software, but possibly as a combination of hardware and software) for modeling or simulating the brain. All neural networks can be completely and uniquely specified by describing four attributes: architecture, propagation rules/equations, activation rules/equations, and learning rules/equations.
The architecture attribute specifies the organization of neurons or nodes (the fundamental processing elements or units of a neural network) into clusters, clusters into layers, and layers into the overall neural network. For many neural networks, clusters are not used; hence, the neurons are organized directly into layers. In addition, layers may be arranged in a hierarchy. The architecture attribute also describes the "permitted" flow of information or signals within the neural network by specifying the actual connectivity paths (or the rules for forming them), both physical (specific) and broadcast (non-specific), between layers, clusters, and/or neurons (nodes).
Propagation rules/equations provide a detailed description, usually mathematical, of information/signal flow for every permitted physical and broadcast connectivity path specified in the architecture. This includes initial conditions and evolution over time.
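By way of illustration only, one common form of propagation rule is a weighted sum over the connectivity paths that feed each neuron. The following is a minimal sketch under that assumption; the function name `propagate` and the layout `weights[j][i]` (the weight on the path from input i to neuron j) are hypothetical and are not taken from the description above.

```python
def propagate(weights, inputs):
    """Compute the net input of each neuron as a weighted sum of the inputs.

    weights[j][i] is the (assumed) weight on the path from input i to neuron j.
    A path that is not permitted by the architecture can be given weight 0.0.
    """
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Two neurons fed by two inputs.
net_inputs = propagate([[1.0, 0.0], [0.5, 0.5]], [2.0, 4.0])
```

Other propagation rules (for example, rules with time delays or with broadcast signals shared by a whole layer) would replace the weighted sum with a different expression.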
Activation rules/equations provide a detailed description, usually mathematical, of how each neuron specified in the architecture processes its information (signals) to produce output information (signals). This includes initial conditions and evolution over time. In a so-called "winner-take-all" activation, for a given set of inputs to the network neurons, one and only one neuron outputs a logical one and all other neurons output a zero. In a "many-to-many" activation, several of the network neurons generate a non-zero output.
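The two activation schemes just described can be sketched as follows. This is a hedged illustration, not a definitive implementation: the function names, the use of the largest net input to pick the winner, and the `threshold` parameter for the many-to-many case are all assumptions introduced for the example.

```python
def winner_take_all(net_inputs):
    """Return 1 for the neuron with the largest net input and 0 for all others.

    Ties are resolved in favor of the lowest-numbered neuron (an assumption).
    """
    winner = max(range(len(net_inputs)), key=lambda i: net_inputs[i])
    return [1 if i == winner else 0 for i in range(len(net_inputs))]


def many_to_many(net_inputs, threshold=0.0):
    """Pass through every net input above the threshold; output 0.0 otherwise.

    Several neurons may therefore produce a non-zero output at once.
    """
    return [x if x > threshold else 0.0 for x in net_inputs]


one_hot = winner_take_all([0.2, 0.9, 0.5])
several = many_to_many([0.2, -0.1, 0.5])
```

In the first call exactly one neuron is active; in the second, every neuron whose net input exceeds the assumed threshold contributes a non-zero output.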
Learning rules/equations provide a detailed description, usually mathematical, of how each "persistent memory" in the network persists and/or changes over time. "Persistent memories" (commonly called weights or synapses) are those variables used in the propagation rules/equations or activation rules/equations, whose values must be retained for proper functioning of the neural network. This includes specification of all initial conditions for these variables.
Through learning, a neural network is trained so that application of a vector or set of inputs produces the desired (or at least consistent) set of outputs. Both input sets and output sets are referred to as vectors. Learning is usually accomplished by sequentially applying input vectors, while adjusting network weights according to the learning rules/equations. During learning, the network weights gradually converge to values that enable each input vector to produce the desired output vector.
Learning, or training of a neural network, is said to be either "supervised" or "unsupervised". In supervised learning, an external "teacher" evaluates the behavior of the network and directs weight adjustments accordingly. This is typically implemented by pairing each input vector with a target vector representing the desired output vector. Each pair of input vector and target vector is called a training pair. An input vector is applied, an output vector of the network is calculated and compared to the corresponding target vector, and the difference (error) is fed back through the network. As a result of this feedback, weights are changed according to the learning rules/equations, which generally act to minimize the error. A sequence of such training pairs forms a training set. Each training pair (i.e., the vectors thereof) is applied to the neural network in sequence. The weights are adjusted for each applied vector until the error for the entire training set is below a threshold.
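The supervised procedure described above can be sketched for a single layer of linear neurons. The learning rule used here is the delta (least-mean-squares) rule, chosen only as one familiar example; the function name `train_supervised` and the parameters `rate`, `threshold`, and `max_epochs` are assumptions for the sketch and are not specified in the text.

```python
def train_supervised(training_set, n_inputs, n_outputs,
                     rate=0.1, threshold=1e-4, max_epochs=10000):
    """Train one linear layer on (input vector, target vector) training pairs.

    Returns the weights and the summed squared error of the last pass.
    """
    # Initial condition for the weights (assumed to be all zeros).
    weights = [[0.0] * n_inputs for _ in range(n_outputs)]
    total_error = float("inf")
    for _ in range(max_epochs):
        total_error = 0.0
        for inputs, target in training_set:
            # Apply the input vector and calculate the output vector.
            output = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
            # Compare the output vector with the target vector.
            errors = [t - o for t, o in zip(target, output)]
            # Feed the error back: delta-rule weight adjustment.
            for j, err in enumerate(errors):
                for i, x in enumerate(inputs):
                    weights[j][i] += rate * err * x
            total_error += sum(e * e for e in errors)
        # Stop once the error for the entire training set is below the threshold.
        if total_error < threshold:
            break
    return weights, total_error


# Hypothetical training set whose targets follow target = 2*x1 - x2.
pairs = [([1.0, 0.0], [2.0]), ([0.0, 1.0], [-1.0]), ([1.0, 1.0], [1.0])]
weights, error = train_supervised(pairs, n_inputs=2, n_outputs=1)
```

For this small, exactly learnable training set the weights settle near 2 and -1. Networks with more than one layer need a rule that carries the error back through the intermediate layers, which this sketch does not attempt.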
In unsupervised learning, there is no teacher, i.e., there are no target vectors and hence no comparison to predetermined outputs. Instead, the training set consists solely of input vectors. The learning rules/equations modify network weights (i.e., the network self-organizes) to produce output vectors that are consistent. Specifically, application of one of the training vectors produces the same output vector as application of an input vector sufficiently similar to the training vector. To that end, the learning process or training process extracts the statistical properties of the training set and groups similar input vectors into classes. Applying a vector from a given class to the network input will produce a specific output vector.
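As a hedged illustration of self-organization without target vectors, the sketch below uses simple competitive learning: each neuron holds a prototype (weight) vector, the neuron nearest to the applied input wins, and only the winner's weights move toward that input. The function names, the squared-distance measure, the `rate` and `epochs` parameters, and the initial prototypes are all assumptions made for the example.

```python
def nearest(prototypes, vector):
    """Index of the prototype closest to the vector (squared Euclidean distance)."""
    return min(
        range(len(prototypes)),
        key=lambda k: sum((p - x) ** 2 for p, x in zip(prototypes[k], vector)),
    )


def train_unsupervised(vectors, prototypes, rate=0.5, epochs=20):
    """Move the winning prototype toward each applied input vector."""
    prototypes = [list(p) for p in prototypes]  # do not modify the caller's lists
    for _ in range(epochs):
        for v in vectors:
            k = nearest(prototypes, v)
            prototypes[k] = [p + rate * (x - p) for p, x in zip(prototypes[k], v)]
    return prototypes


def classify(prototypes, vector):
    """Output vector: 1 for the winning neuron (class), 0 for all others."""
    k = nearest(prototypes, vector)
    return [1 if i == k else 0 for i in range(len(prototypes))]


# Hypothetical training set with two visibly separate groups of input vectors.
training_vectors = [[0.0, 0.1], [0.1, 0.0], [1.0, 0.9], [0.9, 1.0]]
trained = train_unsupervised(training_vectors, [[0.0, 0.0], [1.0, 1.0]])
```

After training, a training vector and a sufficiently similar new vector yield the same output vector from `classify`, which is the consistency property described above.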