A number of technological and economic pressures currently exist to develop a new type of electronics. The often-cited Moore's law gives us reason for optimism. Moore's second law, however, is making it clear that a transition is upon us. As our devices approach the atomic scale they become noisy and prone to faults in production and use. In contrast to the consumer trend of price reduction, the costs for producers to fulfill Moore's law are increasing dramatically. At the same time it is becoming increasingly clear that current computing approaches are not going to meet the challenges we face in adaptive autonomous controllers. The power discrepancy between biological solutions and our advanced computing systems is so large that it points to a flaw in our notions of what computing is. FIG. 1 illustrates a graphic depicting data indicating that it is not physically feasible to simulate biology at even moderate fidelity. A simple thought experiment also illustrates this point.
Suppose we were to simulate the human body at a moderate fidelity such that each cell of the body was allocated to one CPU, and that the distance between memory and processor was d. At an operating voltage V = 1 V and d = 1 cm, this simulation would consume at minimum 100 GW of power, or about the total peak power consumption of France, as indicated by the following formulation (with d in meters and V in volts):
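\[
10^{4}\,\frac{\text{variable}}{\text{cell}} \times 1\,\frac{\text{bit}}{\text{variable}} \times 10^{14}\,\frac{\text{cell}}{\text{human}} \times 10^{5}\,\frac{\text{update}}{\text{sec}} = 10^{23}\,\frac{\text{bit}\cdot\text{update}}{\text{human}\cdot\text{sec}}
\]
\[
\frac{\text{energy}}{\text{bit}\cdot\text{update}} = \frac{CV^{2}}{2} = 10^{-10}\,dV^{2}\,\frac{\text{Joules}}{\text{bit}\cdot\text{update}}
\]
\[
\frac{\text{Joules}}{\text{human}\cdot\text{sec}} = 10^{-10}\,dV^{2}\,\frac{\text{Joules}}{\text{bit}\cdot\text{update}} \times 10^{23}\,\frac{\text{bit}\cdot\text{update}}{\text{human}\cdot\text{sec}}
\]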
If we lowered the voltage to the thermodynamic limit of V = 0.025 V (kT at room temperature) and the CPU-memory distance to the diameter of an average cell, d = 10^-5 m, the simulation would still consume 62.5 kW, which is 625 times the power actually consumed by the human body (about 100 W). Turning the problem around, we can ask just how small the distance between memory and processor would need to be if we set the operating voltage to 70 mV, the resting potential of a neuron. The distance between the CPU and memory would need to be 2 nm or less for the simulation to equal the efficiency of biology. If these numbers seem unbelievable, we can forgo the thought experiment and point to actual data. Consider IBM's recent cat-scale cortical simulation of 1 billion neurons and 10 trillion synapses.
This effort required 147,456 CPUs and ran at 1/100th real time. At a power consumption of 20 W per CPU, this is 3 megawatts. If we presume perfect scaling, a real-time simulation would consume 100× more power: 300 megawatts. A human brain is ~20 times larger than a cat's, so that a real-time simulation of a network at the scale of a human would consume 6 GW if done with traditional serial processors. This is 600 million times more power than a human brain actually dissipates. The cortex represents a fraction of the total neurons in a brain, neurons represent a fraction of the total cells in a brain, and the IBM neuron model was extremely simplified. The number of adaptive variables under constant modification in the IBM simulation is orders of magnitude less than the biological counterpart and yet its power dissipation is orders of magnitude larger.
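The arithmetic behind the thought experiment and the IBM extrapolation above can be checked with a short sketch. The function and constant names are hypothetical and chosen only for illustration. The sketch assumes d is expressed in meters and V in volts, which is the reading under which the 100 GW, 62.5 kW and 2 nm figures in the text agree with one another, and it takes the human body's power as 100 W, the value implied by the factor of 625.

```python
# Back-of-envelope estimates only; all names here are illustrative assumptions.

# 10^4 variables/cell * 1 bit/variable * 10^14 cells/human * 10^5 updates/sec
BIT_UPDATES_PER_HUMAN_SEC = 1e4 * 1 * 1e14 * 1e5

# Coefficient from the text: energy per bit-update = 1e-10 * d * V^2 joules,
# with d in meters and V in volts (assumed units).
ENERGY_COEFF = 1e-10


def simulation_power_watts(voltage_v, distance_m):
    """Power needed to simulate one human at the stated update rate."""
    joules_per_bit_update = ENERGY_COEFF * distance_m * voltage_v ** 2
    return joules_per_bit_update * BIT_UPDATES_PER_HUMAN_SEC


def distance_for_power_m(target_watts, voltage_v):
    """CPU-memory distance at which the simulation draws target_watts."""
    return target_watts / (ENERGY_COEFF * voltage_v ** 2 * BIT_UPDATES_PER_HUMAN_SEC)


def scaled_cluster_power_watts(n_cpus, watts_per_cpu, slowdown, size_ratio):
    """Measured cluster power scaled to real time and to a larger network,
    presuming perfect scaling as the text does."""
    return n_cpus * watts_per_cpu * slowdown * size_ratio


if __name__ == "__main__":
    print(simulation_power_watts(1.0, 1e-2))      # V = 1 V, d = 1 cm
    print(simulation_power_watts(0.025, 1e-5))    # V = 25 mV, d = 10^-5 m
    print(distance_for_power_m(100.0, 0.07))      # 100 W budget at 70 mV
    print(scaled_cluster_power_watts(147456, 20.0, 100, 20))  # human-scale cortex
```

The last call multiplies out to roughly 5.9 GW, which the text rounds to 6 GW.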
If our aim is the creation of computing systems with power efficiencies comparable to biology, as we will need if we want autonomous controllers, then we cannot compute in the traditional sense. We cannot simulate a brain. We must build a brain. There is no distinction between memory and processing in living systems and brains, and it is exactly this distinction that is at the heart of our problems. Our solution is to define a new type of computing based on the self-organization of nature. Nature is capable of building structures of far greater complexity than any modern chip, and it is capable of doing it while embedded in the real world, not a clean room.
If the principles of autonomous self-organization were illuminated it would cascade through all parts of our world economy. Self-organizing circuits would dramatically reduce the cost of fabrication by increasing yields, as circuits could adapt around faults. The ability to heal, a natural consequence of attractor-based self-organization, leads to enhanced survival in hostile environments. However, these are just some of the peripheral benefits. Consider that every CPU currently in existence requires a program that was created by a brain: a self-organizing autonomous control system. Any application that must interact with a complex changing environment is a potential platform for self-organizing autonomous control circuitry.
The solution to our problem is all around us in nature, which displays a most remarkable property. The atoms in our bodies recycle in a matter of months. Despite the fact that life is inherently volatile, it can maintain its structure and fight decay so long as energy is dissipated. It is this property of self-repair that is at the heart of self-organization. Indeed, if a system were capable of self-repair then it should be capable of self-organization, since repair of structure is the same thing as building a structure. We can accomplish this incredible feat through the use of attractor dynamics. Just as a ball will roll into a depression, an attractor-based system will inevitably fall into its attractor. Perturbations will be quickly “fixed” as the system re-converges to its attractor. If we cut ourselves we heal. To bestow this property on our computing technology we must find a way to represent our computing structures as a fixed-point attractor. To understand how to solve the problem, we must first understand what sort of attractors we need.
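The ball-in-a-depression picture can be illustrated with a minimal sketch. It is only a toy: it assumes a one-dimensional quadratic bowl U(x) = 0.5 (x - x_star)^2 and overdamped motion, neither of which comes from the text, and the names relax, x_star and rate are hypothetical. It shows a state settling to a fixed point, being perturbed, and re-converging.

```python
def relax(x, x_star=1.0, rate=0.1, steps=200):
    """Roll downhill in the assumed bowl U(x) = 0.5 * (x - x_star)**2.

    Each step moves x a fraction `rate` of the way along -dU/dx,
    so the distance to x_star shrinks by (1 - rate) per step.
    """
    for _ in range(steps):
        x -= rate * (x - x_star)
    return x


settled = relax(5.0)           # start far from the fixed point
perturbed = settled + 2.0      # the "cut": knock the state away
healed = relax(perturbed)      # the system re-converges on its own

if __name__ == "__main__":
    print(settled, perturbed, healed)
```

No repair procedure is coded anywhere in the sketch. The return to x_star follows from the dynamics alone, which is the sense in which healing is a consequence of the attractor.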
Two types of attractors could offer a solution to the stated problem: intrinsic and extrinsic. Extrinsic attractors are most suitable for information-processing systems. An example of an intrinsic attractor is the famous Lorenz attractor. Three coupled ordinary differential equations with three constants are integrated in time, producing incremental advances in the x, y and z position of a particle. Over time, this particle traces out the familiar “butterfly wings” strange attractor. The Lorenz attractor displays its dynamics without the influence of an outside force. Energy is expended in evolving the system in time, but the nature of this evolution is governed exclusively by the intrinsic properties of the Lorenz equations. An example of an intrinsic attractor in nature is the body of an organism. The intrinsic attractor that builds the body is specified by the intrinsic information of the DNA and will evolve in time toward a fixed point. That is, an organism could be grown in two very different environments but will still evolve in time to have the same body configuration.
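The Lorenz system mentioned above can be integrated in a few lines. The sketch below is illustrative only: the constants sigma = 10, rho = 28 and beta = 8/3 are the commonly used textbook values rather than values given in this document, and the fourth-order Runge-Kutta stepper, the step size and the function names are assumptions made for the example.

```python
def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the three Lorenz equations.

    The rates depend only on the current state and the three constants;
    there is no external input term.
    """
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def trajectory(start, dt=0.01, steps=5000):
    """Return the list of (x, y, z) points visited from `start`."""
    points = [start]
    for _ in range(steps):
        points.append(rk4_step(points[-1], dt))
    return points


if __name__ == "__main__":
    path = trajectory((1.0, 1.0, 1.0))
    print(path[-1])  # plotting x against z shows the two "wings"
```

Because lorenz_rhs takes nothing but the state and the constants, the shape traced out is fixed by the equations themselves, which is what makes the attractor intrinsic in the sense used here.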
This is of course not true with a brain, which is an example of an extrinsic attractor. The structure of the brain is a reflection of the structure of the information it is processing. Another example of an extrinsic attractor is a fast-flowing river. The structure of the rapids is created from the water flowing over the streambed. Countless molecules of water come and go, but the structure of the rapids remains the same. Without the underlying streambed, however, the structure would quickly dissipate. Three ingredients are necessary for an extrinsic attractor. First, energy must be dissipated. In the river this is provided by the gravitational gradient. Second, the water must interact with itself and the environment (the streambed) according to a plasticity rule. In the river, the inter-molecular forces of water provide this. Third, there must be external structure. This is the streambed.
The present inventor has identified a non-linear plasticity rule referred to as anti-Hebbian and Hebbian (AHAH) learning and has demonstrated that its attractor states are a reflection of the underlying structure of the information. The present inventor has shown that the attractor states represent logic functions that form a universal set and that they correspond to points of maximal support vectors, which allows for optimally extracting pattern regularities or features. AHAH generally refers to “Anti-Hebbian and Hebbian”. Hence, “AHAH plasticity” refers to “Anti-Hebbian and Hebbian plasticity”.
One non-limiting example of an application of an AHAH plasticity rule is disclosed in U.S. Pat. No. 7,398,259 entitled “Training of a Physical Neural Network,” which is incorporated herein by reference. Another non-limiting example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 7,409,375 entitled “Plasticity-induced Self Organizing Nanotechnology for the Extraction of Independent Components from a Data Stream,” which is also incorporated herein by reference. A further non-limiting example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 7,412,428 entitled “Application of Hebbian and Anti-Hebbian Learning to Nanotechnology-Based Physical Neural Networks,” which is incorporated herein by reference.
An additional non-limiting example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 7,420,396 entitled “Universal Logical Gate Utilizing Nanotechnology,” which is incorporated herein by reference. Another non-limiting example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 7,502,769 entitled “Fractal Memory and Computational Methods and Systems Based on Nanotechnology,” which is incorporated herein by reference. A further non-limiting example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 7,599,895 entitled “Methodology for the Configuration and Repair of Unreliable Switching Elements,” which is incorporated herein by reference. Another non-limiting example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 7,827,130 entitled “Fractal Memory and Computational Methods and Systems Based on Nanotechnology”.
An additional non-limiting example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 7,930,257 entitled “Hierarchical Temporal Memory Utilizing Nanotechnology”. A further non-limiting example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 8,022,732 entitled “Universal Logic Gate Utilizing Nanotechnology”. Another example of an AHAH plasticity rule is disclosed in U.S. Pat. No. 8,041,653 entitled “Method and System for a Hierarchical Temporal Memory Utilizing a Router Hierarchy and Hebbian and Anti-Hebbian Learning,” which is incorporated herein by reference.
The present inventor has designed a number of artificial neural network and AI synaptic solutions, methods, systems and devices. Non-limiting examples of artificial network, synaptic and other AI solutions are disclosed in the following issued patents, which are incorporated herein by reference: U.S. Pat. Nos. 6,889,216; 6,995,649; 7,028,017; 7,039,619; 7,107,252; 7,392,230; 7,398,259; 7,409,375; 7,412,428; 7,420,396; 7,426,501; 7,502,769; 7,599,895; 7,752,151; 7,827,130; 7,827,131; 7,930,257; 8,022,732; 8,041,653; 8,156,057; 8,311,958; and 8,332,339.
One of the problems with current processing and memory based computing systems is the power consumed and the communication burden. Reducing the communication burden of the system is important as it will vastly reduce the total consumed power. Also, the ability to efficiently and quickly grow effective procedures or algorithms is a much sought after feature that has yet to be implemented based on current computing paradigms and approaches. It is therefore believed that a need exists for a new approach to computing, which reduces power consumption and the communication burden while vastly increasing speed and processing power. Such an approach is described in greater detail herein.
A number of technological and economic pressures currently exist in the development of new types of electronics. Recent advancements in quantum computing, MEMS, nanotechnology, and molecular and memristive electronics offer new and exciting avenues for extending the limitations of conventional von Neumann digital computers. As device densities increase, the cost of R&D and manufacturing has skyrocketed due to the difficulty of precisely controlling fabrication at such a small scale. New computing architectures are needed to ease the economic pressures described by what has become known as Moore's second law: the capital cost of semiconductor fabrication increases exponentially over time. We expend enormous amounts of energy constructing the most sterile and controlled environments on earth to fabricate modern electronics. Life, however, is capable of assembling and repairing structures of far greater complexity than any modern chip, and it is capable of doing so while embedded in the real world, not in a clean room.
IBM's cat-scale cortical simulation of 1 billion neurons and 10 trillion synapses, for example, required 147,456 CPUs, 144 TB of memory, and ran at 1/83rd real time. At a power consumption of 20 W per CPU, this is 2.9 MW. If we presume perfect scaling, a real-time simulation would consume 83× more power or 244 MW. At roughly thirty times the size of a cat cortex, a human-scale cortical simulation would reach over 7 GW. The cortex represents a fraction of the total neurons in a brain, neurons represent a fraction of the total cells, and the IBM neuron model was extremely simplified. The number of adaptive variables under constant modification in the IBM simulation is orders of magnitude less than the biological counterpart and yet its power dissipation is orders of magnitude larger. The power discrepancy is so large it calls attention not just to a limit of our current technology but also to a deficiency in how we think about computing.
Brains have evolved to move bodies through a complex and changing world. In other words, brains are both adaptive and mobile devices. If we wish to build practical artificial brains with power and space budgets approaching biology we must merge memory and processing into a new type of physically adaptive hardware.