The segmentation of motion flows within dense crowds of pedestrians in videos is an essential tool in crowd safety and crowd control applications. Videos of crowded scenes can exhibit complex crowd behaviors even under normal situations. For example, crowd flows in large congested areas, such as train stations, can initially appear chaotic. However, it is often the case that the flow contains low dimensional dynamical structures, which it is desirable to identify and segment from unstructured flows. Moreover, the automatic segmentation of independent crowd flows aids in the monitoring and prediction of hazardous situations in crowded environments.
Of particular interest is the detection and estimation of crowd flows using motion information extracted from the videos. Motion vectors can be determined using optical flow estimation applied to texture, that is, pixel intensities, in videos, or the motion vectors can be extracted directly from a bitstream. The bitstream can be encoded using any of the well known coding standards, e.g., MPEG, H.264, HEVC, etc.
Considering pedestrians in a crowded scene as particles in a flow, the motion vectors of a video frame correspond to observations of the velocities of particles at a time instance in the flow. Processing motion vectors, instead of the video texture, protects the privacy of individuals observed in surveillance videos.
U.S. Pat. No. 8,773,536 discloses a method for detecting independent motion in a surveillance video. The method constructs a linear system from macroblocks of the video by comparing texture gradients with motion vectors. Independent flows are detected when the motion is flagged as a statistical outlier relative to the linear system.
U.S. Pat. No. 8,358,806 discloses a method for crowd segmentation in a video using shape indexing. Background differencing is performed on the video to identify a foreground silhouette shape. Approximate numbers and positions of people are determined by matching the foreground silhouette shape against a set of predetermined foreground silhouette shapes.
Dynamical System Modeling
When the density of a crowd of pedestrians is high, the motion of individuals in the crowd can be modeled as a fluid flow. One such model that is commonly used for crowd analysis is the Hughes model, see: Hughes, “A continuum theory for the flow of pedestrians,” Transportation Research Part B: Methodological, Volume 36, Issue 6, Pages 507-535, July 2002.
Hughes models a crowd flow as a function of density ρ(x, y, t), and velocities (u(x, y, t) and v(x, y, t)) as
$$\frac{\partial}{\partial t}\rho(x,y,t) + \frac{\partial}{\partial x}\bigl(\rho(x,y,t)\,u(x,y,t)\bigr) + \frac{\partial}{\partial y}\bigl(\rho(x,y,t)\,v(x,y,t)\bigr) = 0, \tag{1}$$
where u(x, y, t) and v(x, y, t) are the respective velocities in the horizontal and vertical directions of every spatial point (x, y) and time t. A Greenshields model can also be used to relate the density and velocity fields in crowd modeling. The Greenshields model is
$$u(x,y,t) = \bar{u}\left(1 - \frac{\rho(x,y,t)}{\bar{\rho}}\right), \quad v(x,y,t) = \bar{v}\left(1 - \frac{\rho(x,y,t)}{\bar{\rho}}\right), \tag{2}$$
where ū and v̄ are system parameters determining maximal velocities in the horizontal and vertical directions, and ρ̄ is a maximal density in the scene.
The solution to (1) results in a crowd density map ρ, and the velocity fields (u, v) for all (x, y, t) that satisfy given initial and boundary conditions. Although the dimensionality of the differential equations governing the evolution of ρ and (u, v) can be infinite, it is often the case that the flows exhibit low dimensional behavior.
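As a rough illustration of how (1) and (2) interact numerically, the following sketch advances a density map by one time step. The Lax-Friedrichs discretization, the periodic boundary, the grid size, and all names and parameter values are assumptions made for this example and are not taken from the Hughes model or from the cited work.

```python
import numpy as np

def greenshields_velocity(rho, u_max, v_max, rho_max):
    """Equation (2): velocity falls linearly from its maximum to zero at rho_max."""
    factor = 1.0 - rho / rho_max
    return u_max * factor, v_max * factor

def continuity_step(rho, u_max, v_max, rho_max, dx, dy, dt):
    """One Lax-Friedrichs step of equation (1) on a periodic grid (assumed scheme).

    Rows of `rho` index y (axis 0) and columns index x (axis 1).
    """
    u, v = greenshields_velocity(rho, u_max, v_max, rho_max)
    flux_x, flux_y = rho * u, rho * v  # mass fluxes rho*u and rho*v
    # Average of the four neighbors (the stabilizing part of Lax-Friedrichs).
    rho_avg = 0.25 * (np.roll(rho, 1, axis=0) + np.roll(rho, -1, axis=0)
                      + np.roll(rho, 1, axis=1) + np.roll(rho, -1, axis=1))
    # Centered differences of the fluxes.
    dflux_x = (np.roll(flux_x, -1, axis=1) - np.roll(flux_x, 1, axis=1)) / (2.0 * dx)
    dflux_y = (np.roll(flux_y, -1, axis=0) - np.roll(flux_y, 1, axis=0)) / (2.0 * dy)
    return rho_avg - dt * (dflux_x + dflux_y)

# Hypothetical initial density on a 32 x 32 grid, with values between 0.2 and 0.8.
yy, xx = np.mgrid[0:32, 0:32]
rho0 = 0.5 + 0.3 * np.sin(2 * np.pi * xx / 32) * np.cos(2 * np.pi * yy / 32)
rho1 = continuity_step(rho0, u_max=1.0, v_max=0.5, rho_max=1.0,
                       dx=1.0, dy=1.0, dt=0.2)
```

Because (1) is a conservation law and the assumed boundary is periodic, the total mass `rho.sum()` is unchanged by the step, up to rounding. The time step is chosen small relative to the grid spacing so that the scheme remains stable for these parameter values.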
A low dimensional state variable at time t is x(t), for which an observable vector y(t)=G(x(t)) corresponds to stacking of the density and velocity fields for all positions x and y at time t. The function G is a mapping from the low dimensional manifold on which x evolves to a space of observables. Then, the solution to (1) determines the transient response and stability of the corresponding dynamical system generally characterized by
$$\dot{x}(t) = F(x(t)), \tag{3}$$
where F(·) is some mapping in the low dimensional manifold on which the dynamical system evolves. For discrete time systems, the dynamical system evolution is characterized by
$$x_{k+1} = F(x_k), \tag{4}$$
where k is a time index.
The Koopman Operator and Dynamic Mode Decomposition
The Koopman operator is a linear operator K that satisfies
$$G(F(x_k)) = K\,G(x_k) \;\Rightarrow\; y_{k+1} = K y_k. \tag{5}$$
Although this dynamical system is nonlinear and evolves on a finite dimensional manifold, the Koopman operator is linear and infinite dimensional. Spectral analysis of the Koopman operator can be used to decompose the flow in terms of Koopman modes and associated Koopman eigenvalues that determine the temporal behavior of the corresponding Koopman mode.
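The relation in (5) can be checked on a small textbook-style example. The sketch below uses a two-state nonlinear map F and three hand-picked observables G, chosen so that the observables span a finite dimensional subspace on which the Koopman operator acts as an exact 3 x 3 matrix. This is a special case; in general no such finite matrix exists. The map, the observables, and the constants `LAM` and `MU` are assumptions made purely for illustration.

```python
import numpy as np

LAM, MU = 0.9, 0.5  # hypothetical system constants

def F(x):
    """Nonlinear discrete-time dynamics x_{k+1} = F(x_k)."""
    x1, x2 = x
    return np.array([LAM * x1, MU * x2 + (LAM**2 - MU) * x1**2])

def G(x):
    """Observables: the state itself plus the nonlinear term x1^2."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])

# Matrix representation of the Koopman operator on span{x1, x2, x1^2}.
K = np.array([[LAM, 0.0, 0.0],
              [0.0, MU, LAM**2 - MU],
              [0.0, 0.0, LAM**2]])

x_k = np.array([1.5, -0.7])
lhs = G(F(x_k))   # observables of the advanced state
rhs = K @ G(x_k)  # linearly advanced observables
koopman_eigenvalues = np.linalg.eigvals(K)
```

Here `lhs` and `rhs` agree for any choice of `x_k`, and the eigenvalues of `K` are `LAM`, `MU`, and `LAM**2`, which set the decay rate of each mode.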
A dynamic mode decomposition (DMD) can be used to estimate the Koopman modes. The DMD has been used in fluid dynamics as a data-driven and equation-free method for identifying system dynamics. Consider data matrices
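$$Y_1 = \begin{bmatrix} | & | & & | \\ y_0 & y_1 & \cdots & y_{m-1} \\ | & | & & | \end{bmatrix}; \quad Y_2 = \begin{bmatrix} | & | & & | \\ y_1 & y_2 & \cdots & y_m \\ | & | & & | \end{bmatrix}. \tag{6}$$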
The DMD determines the best fit matrix K that satisfies the relation
$$Y_2 \approx K Y_1. \tag{7}$$
The eigenvectors and eigenvalues of K approximate the Koopman modes and Koopman eigenvalues. Herein, the terms Koopman modes and DMD modes are used interchangeably.
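A minimal sketch of this estimation is given below, assuming the common SVD-based formulation of DMD, in which K is projected onto the leading left singular vectors of Y1 rather than formed explicitly. The function name `dmd`, the optional `rank` truncation, and the synthetic test system `A` are assumptions for illustration only.

```python
import numpy as np

def dmd(snapshots, rank=None):
    """Estimate DMD eigenvalues and modes from columns y_0, ..., y_m.

    snapshots: array of shape (n, m + 1), one observable vector per column.
    rank: optional number of singular values to keep (assumed truncation).
    """
    Y1, Y2 = snapshots[:, :-1], snapshots[:, 1:]  # data matrices of (6)
    U, s, Vh = np.linalg.svd(Y1, full_matrices=False)
    if rank is not None:
        U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]
    # Least-squares fit of (7), projected onto the columns of U:
    # K_tilde = U^H Y2 V S^{-1}
    K_tilde = (U.conj().T @ Y2 @ Vh.conj().T) / s
    eigvals, W = np.linalg.eig(K_tilde)
    modes = U @ W  # lift the eigenvectors back to the observable space
    return eigvals, modes

# Synthetic data from a known linear system y_{k+1} = A y_k (hypothetical).
A = np.array([[0.9, 0.2, 0.0],
              [0.0, 0.5, 0.1],
              [0.0, 0.0, -0.3]])
cols = [np.ones(3)]
for _ in range(6):
    cols.append(A @ cols[-1])
snapshots = np.stack(cols, axis=1)  # shape (3, 7)

eigvals, modes = dmd(snapshots, rank=3)
```

For this noise-free data the recovered eigenvalues match those of `A`, and each column of `modes` is an eigenvector of `A`. With real motion-vector data, each column of `snapshots` would hold the stacked observables of one frame, and a `rank` well below the number of frames would typically be chosen to suppress noise.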