This disclosure relates to factor graphs, and in particular, to the design and implementation of factor graphs in software and/or hardware.
At a high level, circuitry that realizes a belief propagation network can be represented as a factor graph having nodes connected by edges. Factor graphs are a powerful tool for describing statistical models such as hidden Markov models, Markov random fields, Kalman filters, and Bayes nets.
Given a set of n discrete random variables a1, a2, . . . , an, the joint probability distribution is expressed as p(a1, a2, . . . , an). Suppose that the joint probability distribution factors in the following sense: there exist subsets $S_1, \ldots, S_k \subset \{1, 2, \ldots, n\}$, where $S_j = \{s_1^j, \ldots, s_{t(j)}^j\}$, such that

$$p(a_1, \ldots, a_n) = \prod_{j=1}^{k} f_j\left(a_{s_1^j}, \ldots, a_{s_{t(j)}^j}\right).$$

For example, if the ai form a Markov chain, then the joint probability can be factored as

$$p(a_1, \ldots, a_n) = p(a_1) \prod_{j=1}^{n-1} p(a_{j+1} \mid a_j) = f_0(a_1) \prod_{j=1}^{n-1} f_j(a_j, a_{j+1}). \tag{1}$$
The factors above are normalized, in the sense that as the ai vary, the probabilities sum to one. The factors can be defined more generally such that they are only required to be proportional to the joint probability. So, we call the ƒi a collection of factors of p( ) if
$$p(a_1, a_2, \ldots, a_n) \propto \prod_{j=1}^{k} f_j\left(a_{s_1^j}, a_{s_2^j}, \ldots, a_{s_{t(j)}^j}\right)$$
The product of the factors then differs from the joint probability only by multiplication by a normalizing constant.
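As a small numerical illustration of factors that are only proportional to the joint probability, the following sketch uses a three-variable binary Markov chain. The factor tables, the names `f0`, `f1`, `f2`, and `factor_product`, and the use of NumPy are assumptions chosen for illustration; none of them come from this disclosure.

```python
import itertools
import numpy as np

# Hypothetical unnormalized factors for a binary chain a1 - a2 - a3.
f0 = np.array([2.0, 1.0])                  # f0(a1)
f1 = np.array([[3.0, 1.0], [1.0, 3.0]])    # f1(a1, a2)
f2 = np.array([[1.0, 4.0], [2.0, 2.0]])    # f2(a2, a3)

def factor_product(a1, a2, a3):
    """Product of the factors at one assignment (not yet a probability)."""
    return f0[a1] * f1[a1, a2] * f2[a2, a3]

assignments = list(itertools.product([0, 1], repeat=3))
unnormalized = {a: factor_product(*a) for a in assignments}

# The normalizing constant Z is the sum of the product over all assignments.
Z = sum(unnormalized.values())

# Dividing by Z turns the product of factors into a joint distribution.
joint = {a: v / Z for a, v in unnormalized.items()}
```

Every entry of `joint` differs from the corresponding entry of `unnormalized` by the same constant 1/Z, which is the sense in which the factors are proportional to the joint probability.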
When a probability distribution can be expressed as a product of small factors (i.e., |Sj| is small for all j), it is possible to invoke a host of powerful tools for modeling and inference.
Given a factored representation of a joint probability distribution, it is possible to describe the structure of the factors as a graph. Each variable ai and each function ƒj can be represented by a node in the graph. An undirected edge can be placed between node ai and node ƒj if and only if the variable ai is an argument in the function ƒj. These two types of nodes are referred to as function nodes (which in some contexts may also be referred to as factors or factor nodes) and variable nodes (which may also be referred to as equals nodes or equals processors in some contexts). Because all edges lie between the two disjoint classes of nodes, the resulting graph is bipartite. This graph is called a factor graph.
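The edge rule described above can be sketched in a few lines. The function `build_factor_graph` and the dictionary-of-scopes representation are hypothetical choices made for this illustration, and the example uses a four-variable Markov chain.

```python
def build_factor_graph(scopes):
    """Return (variable_nodes, function_nodes, edges) for the given factor scopes.

    scopes maps a function-node name to the tuple of variable names it takes.
    """
    function_nodes = sorted(scopes)
    variable_nodes = sorted({v for args in scopes.values() for v in args})
    # One undirected edge per (variable, function) pair in which the
    # variable is an argument of the function.
    edges = {(v, f) for f, args in scopes.items() for v in args}
    return variable_nodes, function_nodes, edges

# Markov chain on four variables: f0(a1), f1(a1, a2), f2(a2, a3), f3(a3, a4).
chain_scopes = {
    "f0": ("a1",),
    "f1": ("a1", "a2"),
    "f2": ("a2", "a3"),
    "f3": ("a3", "a4"),
}
variables, functions, edges = build_factor_graph(chain_scopes)
```

Because every edge is stored as a (variable, function) pair, no edge can join two nodes of the same class, so the bipartite property holds by construction.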
FIG. 1 is a graphical representation of a factor graph of the Markov chain expressed above in Equation 1, where ƒj(aj, aj+1)=p(aj+1|aj). It is clear from the figure that the bipartite factor graph includes two groups of nodes (i.e., a1 . . . a4 and ƒ0 . . . ƒ3).
Referring to FIG. 2, a factor graph of the more complex hidden Markov model (HMM) is illustrated. The factor graph of the HMM can be constructed by extending the Markov chain example of FIG. 1. In particular, an HMM contains a Markov chain transiting from state ai to ai+1. Each state has an observation bi associated with it. Given ai, the observation bi is conditionally independent of all other variables. This probability can be incorporated by using a factor gi(ai)=Pr(bi|ai).
The product of the factors is:
$$f_0(a_1) \left( \prod_{j=1}^{n-1} f_j(a_j, a_{j+1}) \right) \prod_{j=1}^{n} g_j(a_j) = \Pr(a_1) \left( \prod_{j=1}^{n-1} \Pr(a_{j+1} \mid a_j) \right) \prod_{j=1}^{n} \Pr(b_j \mid a_j) = \Pr(a_1, \ldots, a_n, b_1, \ldots, b_n)$$
Since the bi are observed, Pr(b1, . . . , bn) is a constant. Therefore,
$$\Pr(a_1, \ldots, a_n, b_1, \ldots, b_n) \propto \frac{\Pr(a_1, \ldots, a_n, b_1, \ldots, b_n)}{\Pr(b_1, \ldots, b_n)} = \Pr(a_1, \ldots, a_n \mid b_1, \ldots, b_n),$$

as desired.
Note that the variables bi need not appear explicitly in the factor graph. Their effects can be incorporated into the gi factors.
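The following sketch illustrates, by brute-force enumeration on a tiny binary HMM, how fixed observations can be folded into single-variable factors gj and how normalizing the product of factors yields the posterior over the hidden states. The probability tables, the observation sequence, and all names (`prior`, `trans`, `emit`, `factor_product`) are assumptions made for illustration. Enumeration is used only because the example is small; it is not presented as the inference method of this disclosure.

```python
import itertools
import numpy as np

# Hypothetical model parameters for a two-state HMM.
prior = np.array([0.6, 0.4])                  # Pr(a1)
trans = np.array([[0.7, 0.3], [0.2, 0.8]])    # trans[x, y] = Pr(a_{j+1}=y | a_j=x)
emit = np.array([[0.9, 0.1], [0.3, 0.7]])     # emit[x, o] = Pr(b_j=o | a_j=x)
observed = [0, 1, 1]                          # fixed observations b1, b2, b3

# Fold each observation into a single-variable factor g_j(a_j) = Pr(b_j | a_j).
g = [emit[:, b] for b in observed]

def factor_product(states):
    """f0(a1) * prod f_j(a_j, a_{j+1}) * prod g_j(a_j) for one state sequence."""
    value = prior[states[0]] * g[0][states[0]]
    for j in range(1, len(states)):
        value *= trans[states[j - 1], states[j]] * g[j][states[j]]
    return value

n = len(observed)
all_states = list(itertools.product([0, 1], repeat=n))
weights = np.array([factor_product(s) for s in all_states])

evidence = weights.sum()         # equals Pr(b1, ..., bn), a constant
posterior = weights / evidence   # Pr(a1, ..., an | b1, ..., bn)
```

The observed values never appear as separate variables in the computation; they enter only through the entries selected for each `g[j]`.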
Generalizing from a Markov chain to an HMM illustrates a very powerful feature of factor graphs. In particular, complicated mathematical models are often composed of simpler parts. When these models are expressed as factor graphs, simpler factor graphs can be used to construct more complicated factor graphs.
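As a minimal sketch of this kind of composition, the code below builds the factor scopes of a Markov chain and then extends them with one observation factor per state to obtain the scopes of an HMM. The helper names `chain_scopes` and `add_observation_factors` and the dictionary representation are hypothetical and are used only for illustration.

```python
def chain_scopes(n):
    """Factor scopes of an n-variable Markov chain: f0(a1), fj(aj, aj+1)."""
    scopes = {"f0": ("a1",)}
    for j in range(1, n):
        scopes[f"f{j}"] = (f"a{j}", f"a{j + 1}")
    return scopes

def add_observation_factors(scopes, n):
    """Extend a chain with single-variable factors gj(aj), one per state."""
    extended = dict(scopes)  # copy so the simpler graph is left untouched
    for j in range(1, n + 1):
        extended[f"g{j}"] = (f"a{j}",)
    return extended

markov = chain_scopes(4)
hmm = add_observation_factors(markov, 4)
```

The chain's factors appear unchanged inside the larger model, so the simpler graph is reused rather than rebuilt.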
In some hardware implementations, the variable (equals) nodes of the factor graphs are implemented by equals processors, which output the sum of their inputs, and the factor nodes of the factor graphs are implemented by XOR processors, which output the XOR of their inputs. Both kinds of processors can include both analog and digital circuitry.
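One way to read these two node types is sketched below, under the assumption that each message is a log-likelihood ratio (LLR) for a single bit, defined as log(Pr(bit=0)/Pr(bit=1)). With that convention, which is an assumption and is not stated in this disclosure, an equals node combines independent beliefs about the same bit by adding their LLRs, and a soft XOR node can be computed with the standard tanh rule. The function names are hypothetical.

```python
import math

def llr(p0):
    """Log-likelihood ratio log(P(bit=0) / P(bit=1)) for a given P(bit=0)."""
    return math.log(p0 / (1.0 - p0))

def equals_node(llrs):
    """Combine independent beliefs about the same bit: the LLRs add."""
    return sum(llrs)

def soft_xor_node(llrs):
    """LLR of the XOR of independent bits, using the tanh rule."""
    prod = 1.0
    for value in llrs:
        prod *= math.tanh(value / 2.0)
    return 2.0 * math.atanh(prod)

# Two incoming beliefs: P(bit=0) = 0.9 and P(bit=0) = 0.8.
inputs = [llr(0.9), llr(0.8)]
equal_out = equals_node(inputs)
xor_out = soft_xor_node(inputs)
```

For these inputs, the XOR of the two bits is 0 with probability 0.9·0.8 + 0.1·0.2 = 0.74, and `xor_out` is the LLR corresponding to that probability.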
When designing circuitry for realizing a belief propagation network, it is common to first work with a logical representation of the network (e.g., a factor graph), rather than the actual circuit elements themselves.
Once the logical design of the network has been settled on, there remains the task of translating that logical design into an actual physical circuit. This typically involves having a hardware architect create a representation of the physical structure in a hardware description language. A floor planner uses this representation to determine where to place the circuit elements. The output of the floor planner and the representation of the physical design are then provided to a fabrication machine, which creates a physical template to be used in manufacturing the physical circuit corresponding to the logical design.