Artificial Neural Networks
Artificial Neural Networks (ANN) are computational models inspired by biological neural networks and are used for approximation of functions. Artificial Neural Networks have graph-theoretic representations in which the nodes of the graph are called neurons and the edges are called synapses.
General Boltzmann Machines (GBM) are a type of Artificial Neural Network in which the neurons represent binary variables, each with a linear bias attached to it, and every synapse between two neurons represents a quadratic term involving the binary variables associated with those neurons. In particular, there is a global energy function associated with the General Boltzmann Machine consisting of contributions from all the linear and quadratic terms.
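As an illustration of the global energy function just described, the following is a minimal Python sketch rather than a definitive implementation. The names gbm_energy, biases, and weights are hypothetical, and the sign convention (linear and quadratic terms added with positive sign) is an assumption; many references negate both terms.

```python
def gbm_energy(x, biases, weights):
    """Global energy of a binary configuration x.

    x:       list of 0/1 values, one per neuron
    biases:  list of linear biases, one per neuron
    weights: dict mapping a synapse (i, j) to its quadratic coefficient
    """
    # One linear term per neuron.
    linear = sum(b * xi for b, xi in zip(biases, x))
    # One quadratic term per synapse; it is nonzero only if both neurons are 1.
    quadratic = sum(w * x[i] * x[j] for (i, j), w in weights.items())
    # Assumed sign convention: energy = linear + quadratic.
    return linear + quadratic


# Hypothetical three-neuron example with two synapses.
biases = [0.5, -1.0, 0.25]
weights = {(0, 1): -2.0, (1, 2): 1.5}
energy = gbm_energy([1, 1, 0], biases, weights)
```

Only the synapses present in the weights dictionary contribute, so the same function covers any graph, fully connected or not.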
General Boltzmann Machines are therefore graphical models used for approximating the joint distribution of dependent variables. The corresponding graph contains nodes referred to as visible nodes (or input variables), and non-visible nodes called hidden nodes (or latent variables). General Boltzmann Machines were developed to represent and solve certain combinatorial problems, and can be used as a probabilistic machine learning tool. The applications of General Boltzmann Machines include, but are not limited to, visual object and speech recognition, classification, regression tasks, dimensionality reduction, information retrieval, and image reconstruction. For an overview of General Boltzmann Machines, see D. Ackley, G. Hinton, T. Sejnowski, “A Learning Algorithm for Boltzmann Machines,” Cognitive Science 9, 147-169 (1985).
The distribution approximation in General Boltzmann Machines is performed by encoding the dependent variables of interest as nodes of a larger graph. These nodes are the visible nodes and all the other nodes are the hidden nodes. A weight is assigned to every edge and a bias to every vertex of the graph, and an energy function that depends on these weights and biases is assigned to the graph.
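The role of the hidden nodes can be illustrated by computing the distribution over the visible nodes by brute force. The sketch below is a minimal example under stated assumptions: the probability of a configuration is taken to be proportional to exp(-energy) at unit temperature, the energy uses a positive sign convention, and the function names and the tiny example graph are hypothetical. Enumeration is only feasible for very small graphs.

```python
import math
from itertools import product


def energy(x, biases, weights):
    # Assumed convention: sum of linear bias terms plus quadratic edge terms.
    return (sum(b * xi for b, xi in zip(biases, x))
            + sum(w * x[i] * x[j] for (i, j), w in weights.items()))


def visible_marginal(n_visible, n_hidden, biases, weights):
    """Return {visible configuration: probability} with hidden nodes summed out.

    Nodes 0..n_visible-1 are visible; the remaining n_hidden nodes are hidden.
    """
    unnormalized = {}
    for v in product([0, 1], repeat=n_visible):
        total = 0.0
        for h in product([0, 1], repeat=n_hidden):
            # v + h concatenates the two tuples into one full configuration.
            total += math.exp(-energy(v + h, biases, weights))
        unnormalized[v] = total
    z = sum(unnormalized.values())  # partition function
    return {v: p / z for v, p in unnormalized.items()}


# Two visible nodes (0, 1) coupled through one hidden node (2).
biases = [0.0, 0.0, -1.0]
weights = {(0, 2): -2.0, (1, 2): -2.0}
marginal = visible_marginal(2, 1, biases, weights)
```

In this example the two visible nodes share no edge, yet their marginal distribution is correlated because both couple to the same hidden node.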
General Boltzmann Machines with arbitrary connections have not proven to be especially useful in a machine learning sense. This is because the approximate learning method is slow, especially when there are hidden units far from the visible units. When certain restrictions are placed on the connections between hidden nodes, the General Boltzmann Machine becomes easier to train and more useful for machine learning tasks. When no connections are allowed between hidden nodes and no connections are allowed between visible nodes, the resulting neural network is called a Restricted Boltzmann Machine (RBM), consisting of only one visible layer and one hidden layer.
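One consequence of the restriction is that the hidden nodes become conditionally independent given the visible layer, so each can be sampled separately. The following Python sketch illustrates this under assumptions not stated in the text: the energy is E(v, h) = Σ a_i v_i + Σ b_j h_j + Σ v_i W_ij h_j with probabilities proportional to exp(-E), which gives p(h_j = 1 | v) = sigmoid(-(b_j + Σ_i v_i W_ij)). All function and parameter names are hypothetical.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def hidden_probabilities(v, hidden_biases, weights):
    """p(h_j = 1 | v) for every hidden unit j.

    weights[i][j] is the coefficient of the edge between visible i and hidden j.
    The minus sign follows from the assumed convention p ~ exp(-E).
    """
    return [
        sigmoid(-(b_j + sum(v[i] * weights[i][j] for i in range(len(v)))))
        for j, b_j in enumerate(hidden_biases)
    ]


def sample_hidden(v, hidden_biases, weights, rng):
    # Each hidden unit is drawn independently; no hidden-hidden edges exist.
    probs = hidden_probabilities(v, hidden_biases, weights)
    return [1 if rng.random() < p else 0 for p in probs]


# Two visible units, one hidden unit.
weights = [[-2.0], [3.0]]
probs = hidden_probabilities([1, 0], [0.0], weights)
sample = sample_hidden([1, 0], [0.0], weights, random.Random(0))
```

The same formula with the roles of the layers exchanged gives the visible units conditioned on the hidden layer, which is what makes alternating (block) sampling cheap in a Restricted Boltzmann Machine.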
With no intra-visible or intra-hidden node connections, efficient training algorithms have been developed that make Restricted Boltzmann Machines good performers in areas of machine learning, because they can readily learn probability distributions over a set of inputs on the visible layer. For applications, algorithms, and theory, see section 6 of Y. Bengio et al., “Representation Learning: A Review and New Perspectives,” arXiv 2014 (http://www.cl.uni-heidelberg.de/courses/ws14/deepl/BengioETAL12.pdf).
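The text does not name a specific training algorithm. One widely used choice is contrastive divergence with a single sampling step (CD-1), sketched below in Python as an assumption rather than as the method intended here. The sketch assumes the energy E(v, h) = Σ a_i v_i + Σ b_j h_j + Σ v_i W_ij h_j with probabilities proportional to exp(-E), so the update signs are flipped relative to presentations that negate the energy. All names are hypothetical.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def p_hidden(v, b, W):
    # p(h_j = 1 | v) under the assumed convention p ~ exp(-E).
    return [sigmoid(-(b[j] + sum(v[i] * W[i][j] for i in range(len(v)))))
            for j in range(len(b))]


def p_visible(h, a, W):
    # p(v_i = 1 | h) under the same convention.
    return [sigmoid(-(a[i] + sum(W[i][j] * h[j] for j in range(len(h)))))
            for i in range(len(a))]


def bernoulli(probs, rng):
    return [1 if rng.random() < p else 0 for p in probs]


def cd1_update(v0, a, b, W, lr, rng):
    """One CD-1 step on a single training vector v0; updates a, b, W in place."""
    ph0 = p_hidden(v0, b, W)                   # data-driven hidden statistics
    h0 = bernoulli(ph0, rng)
    v1 = bernoulli(p_visible(h0, a, W), rng)   # one-step reconstruction
    ph1 = p_hidden(v1, b, W)                   # reconstruction statistics
    # With p ~ exp(-E), the log-likelihood gradient is (model - data).
    for i in range(len(a)):
        a[i] += lr * (v1[i] - v0[i])
        for j in range(len(b)):
            W[i][j] += lr * (v1[i] * ph1[j] - v0[i] * ph0[j])
    for j in range(len(b)):
        b[j] += lr * (ph1[j] - ph0[j])


# Train a 2-visible, 1-hidden machine on the single pattern [1, 0].
rng = random.Random(0)
a, b = [0.0, 0.0], [0.0]
W = [[0.0], [0.0]]
for _ in range(200):
    cd1_update([1, 0], a, b, W, lr=0.1, rng=rng)
```

After training on the single pattern [1, 0], the bias of the first visible unit can only have moved down (lower energy for the value 1) and that of the second can only have moved up, which matches the pattern being learned.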
The idea of the Restricted Boltzmann Machine has been pluralized in order to create more effective neural networks called Deep Boltzmann Machines (DBM). Deep Boltzmann Machines are created by stacking Restricted Boltzmann Machines on top of each other such that the hidden layer of the first Restricted Boltzmann Machine is used as the visible layer to the second Restricted Boltzmann Machine, the hidden layer of the second acts as the visible layer to the third Restricted Boltzmann Machine, and so on. This structure is studied extensively and is the basis of deep learning. The advantage of this structure is that the network weights and biases can be trained Restricted Boltzmann Machine by Restricted Boltzmann Machine, from the top down, using the same training algorithms developed for stand-alone Restricted Boltzmann Machines. For applications, algorithms, and theory behind Deep Boltzmann Machines, see: http://neuralnetworksanddeeplearning.com/chap6.html
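The stacking arrangement can be sketched in Python as follows. This is a minimal illustration, not a training procedure: it only shows how the hidden layer of one Restricted Boltzmann Machine serves as the visible layer of the next. Passing activation probabilities upward in place of binary samples is an assumption, as are the sign convention (probabilities proportional to exp(-energy)) and all names used.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def propagate_up(v, layers):
    """Feed a visible vector through a stack of Restricted Boltzmann Machines.

    layers: list of (hidden_biases, weights) pairs, first machine first, where
            weights[i][j] links unit i of the layer below to hidden unit j.
    Returns the activation probabilities of every hidden layer in order.
    """
    activations = []
    current = v
    for hidden_biases, weights in layers:
        # The layer below plays the role of the visible layer for this machine.
        current = [
            sigmoid(-(b_j + sum(current[i] * weights[i][j]
                                for i in range(len(current)))))
            for j, b_j in enumerate(hidden_biases)
        ]
        activations.append(current)
    return activations


# Hypothetical stack: 3 visible units -> 2 hidden units -> 1 hidden unit.
layers = [
    ([0.1, -0.2], [[0.5, -1.0], [0.0, 0.3], [-0.7, 0.2]]),
    ([0.0], [[1.0], [-1.0]]),
]
activations = propagate_up([1, 0, 1], layers)
```

Each machine only needs the output of the one before it, which is why the weights and biases can be trained one machine at a time.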
The Restricted Boltzmann Machine by Restricted Boltzmann Machine approach to training a Deep Boltzmann Machine comes at the cost of accumulating errors, which arise from the approximation of the distribution for each Restricted Boltzmann Machine.
Quantum Processors
A quantum processor is a quantum mechanical system of a plurality of qubits, measurements over which will result in samples from the Boltzmann distribution defined by the global energy of the system.
A qubit is a physical implementation of a quantum mechanical system, represented on a Hilbert space, that realizes at least two distinct and distinguishable eigenstates representing the two states of a quantum bit. A quantum bit is the analogue of the digital bit: the ambient storing device may store the two states |0⟩ and |1⟩ of two-state quantum information, and may also store superpositions α|0⟩+β|1⟩ of the two states. In various embodiments, such systems may have more than two eigenstates, in which case the additional eigenstates are used to represent the two logical states by degenerate measurements. Various embodiments of implementations of qubits have been proposed: e.g., solid-state nuclear spins, measured and controlled electronically or with nuclear magnetic resonance, trapped ions, atoms in optical cavities (cavity quantum-electrodynamics), liquid state nuclear spins, electronic charge or spin degrees of freedom in quantum dots, superconducting quantum circuits based on Josephson junctions [Barone and Paterno, 1982, Physics and Applications of the Josephson Effect, John Wiley and Sons, New York; Martinis et al., 2002, Physical Review Letters 89, 117901] and electrons on helium.
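To make the superposition notation concrete, the following Python sketch computes the measurement probabilities of a state α|0⟩+β|1⟩. It relies on the standard Born rule of quantum mechanics (outcome probabilities |α|² and |β|² for a normalized state), which the text above does not state explicitly; the function names are hypothetical.

```python
import math


def normalize(alpha, beta):
    """Scale the amplitudes so that |alpha|^2 + |beta|^2 = 1."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    if norm == 0:
        raise ValueError("at least one amplitude must be nonzero")
    return alpha / norm, beta / norm


def measurement_probabilities(alpha, beta):
    """Probabilities of reading 0 and 1 when the qubit is measured."""
    alpha, beta = normalize(alpha, beta)
    return abs(alpha) ** 2, abs(beta) ** 2


# Equal-weight superposition with a relative phase; amplitudes are complex.
p0, p1 = measurement_probabilities(1 + 0j, 1j)
```

The relative phase between the amplitudes does not change these two probabilities, although it does matter for how the state evolves before measurement.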
To each qubit is inductively coupled a source of bias called a local field bias. In one embodiment a bias source is an electromagnetic device used to thread a magnetic flux through the qubit to provide control of the state of the qubit [US 2006/0225165].
The local field biases on the qubits are programmable and controllable. In one embodiment, a qubit control system comprising a digital processing unit is connected to the system of qubits and is capable of programming and tuning the local field biases on the qubits.
A quantum processor may furthermore comprise a plurality of couplings between a plurality of pairs of the plurality of qubits. A coupling between two qubits is a device in proximity of both qubits threading a magnetic flux to both qubits. In one embodiment, a coupling may consist of a superconducting circuit interrupted by a compound Josephson junction. A magnetic flux may thread the compound Josephson junction and consequently thread a magnetic flux on both qubits [US 2006/0225165]. The strength of this magnetic flux contributes quadratically to the energies of the quantum processor. In one embodiment, the coupling strength is enforced by tuning the coupling device in proximity of both qubits.
The coupling strengths are controllable and programmable. In one embodiment, a quantum device control system comprising a digital processing unit is connected to the plurality of couplings and is capable of programming the coupling strengths of the quantum processor.
A quantum annealer is a quantum processor that carries out quantum annealing as described, for example, in Farhi, E. et al., “Quantum Adiabatic Evolution Algorithms versus Simulated Annealing,” arXiv.org: quant-ph/0201031 (2002), pp. 1-16.
Quantum annealers perform a transformation of the quantum processor from an initial setup to a final one. The initial and final setups of the quantum processor provide quantum systems described by their corresponding initial and final Hamiltonians. For a quantum annealer with local field biases and couplings as described above, a final Hamiltonian can be expressed as a quadratic function $f(x)=\sum_{i} h_i x_i+\sum_{(i,j)} J_{(i,j)} x_i x_j$, where the first summation runs over an index i representing the qubits of the quantum annealer and the second summation is over pairs (i,j) for which there is a coupling between qubits i and j.
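The quadratic function can be evaluated classically for small instances, which gives a reference point for what an annealer is asked to minimize. The Python sketch below is illustrative only. The text does not specify the domain of the variables, so binary values 0 and 1 are assumed by default (spin values -1 and +1 can be passed instead), and the names and coefficients are hypothetical. Exhaustive search is only practical for a handful of qubits.

```python
from itertools import product


def f(x, h, J):
    """Quadratic function: sum_i h[i]*x[i] + sum_(i,j) J[(i,j)]*x[i]*x[j].

    h: list of local field biases, one per qubit
    J: dict mapping a coupled pair (i, j) to its coupling strength
    """
    return (sum(h_i * x_i for h_i, x_i in zip(h, x))
            + sum(J_ij * x[i] * x[j] for (i, j), J_ij in J.items()))


def brute_force_minimum(h, J, values=(0, 1)):
    """Exhaustively search every assignment; values is the assumed domain."""
    best = min(product(values, repeat=len(h)), key=lambda x: f(x, h, J))
    return best, f(best, h, J)


# Hypothetical three-qubit instance with couplings (0, 1) and (1, 2).
h = [1.0, -1.0, 0.5]
J = {(0, 1): -2.0, (1, 2): 1.0}
best_x, best_value = brute_force_minimum(h, J)
```

For this instance the minimum over binary assignments is reached at x = (1, 1, 0) with f(x) = -2.0; passing values=(-1, 1) searches over spin assignments instead.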
Quantum annealers can be used as heuristic optimizers of their energy function. An embodiment of such an analog processor is disclosed by McGeoch, Catherine C. and Cong Wang (2013), “Experimental Evaluation of an Adiabatic Quantum System for Combinatorial Optimization,” Computing Frontiers, May 14-16, 2013 (http://www.cs.amherst.edu/ccm/cf14-mcgeoch.pdf) and also disclosed in the Patent Application US 2006/0225165.
With minor modifications to the quantum annealing process, quantum processors can instead be used to provide samples from the Boltzmann distribution of their energy function at a finite temperature. The reader is referred to the technical report: Bian, Z., Chudak, F., Macready, W. G. and Rose, G. (2010), “The Ising model: teaching an old problem new tricks”, and also Amin, M. H., Andriyash, E., Rolfe, J., Kulchytskyy, B., and Melko, R. (2016), “Quantum Boltzmann Machine” arXiv:1601.02036.
This method of sampling is called quantum sampling.
For a quantum processor with local field biases and couplings, quantum sampling provides samples from a distribution that is slightly different from the Boltzmann distribution of the quadratic function ƒ(x) introduced above.
The reference Amin, M. H., Andriyash, E., Rolfe, J., Kulchytskyy, B., and Melko, R. (2016), “Quantum Boltzmann Machine” arXiv:1601.02036 studies how far quantum sampling is from Boltzmann sampling.
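For small instances, the exact Boltzmann distribution of the quadratic function can be computed classically and compared against a set of samples. The Python sketch below is one way to do this and is not taken from the cited references. It assumes binary variables, a temperature expressed in energy units (Boltzmann constant set to 1), and total variation distance as the measure of discrepancy; the samples here are drawn classically from the exact distribution only to exercise the code, and all names are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import product


def f(x, h, J):
    return (sum(h_i * x_i for h_i, x_i in zip(h, x))
            + sum(J_ij * x[i] * x[j] for (i, j), J_ij in J.items()))


def boltzmann_distribution(h, J, temperature):
    """Exact p(x) proportional to exp(-f(x) / temperature) over binary x."""
    states = list(product([0, 1], repeat=len(h)))
    weights = [math.exp(-f(x, h, J) / temperature) for x in states]
    z = sum(weights)  # partition function
    return {x: w / z for x, w in zip(states, weights)}


def empirical_distribution(samples):
    counts = Counter(samples)
    return {x: c / len(samples) for x, c in counts.items()}


def total_variation(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


h = [1.0, -1.0, 0.5]
J = {(0, 1): -2.0, (1, 2): 1.0}
exact = boltzmann_distribution(h, J, temperature=1.0)

# Stand-in for hardware output: classical draws from the exact distribution.
rng = random.Random(0)
states = list(exact)
samples = rng.choices(states, weights=[exact[s] for s in states], k=5000)
distance = total_variation(exact, empirical_distribution(samples))
```

Replacing the stand-in samples with the output of a quantum processor would give an empirical estimate of how far its samples are from the Boltzmann distribution for that instance.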
Features of the invention will be apparent from review of the disclosure, drawings, and description of the invention below.