Known Wiberg minimization minimizes a function of two sets of variables that is linear in at least one of the sets. Examples of a function to be minimized include (1) the L1 norm of a vector function; (2) the L2 norm of a vector function; and (3) the negative likelihood of a set of observations. The disclosure generalizes Wiberg minimization to functions that are nonlinear in both sets of variables.
Wiberg demonstrated an L2 factorization method for matrices with missing data that solves for one set of variables V in terms of the other set of variables U. The method linearizes V about U and then minimizes with respect to U only. Subsequent work showed that Wiberg's method had better convergence than using Levenberg-Marquardt to minimize with respect to U and V simultaneously. It has also been shown that Wiberg's approach may be adapted for L1 matrix factorization using linear programming. The adapted method outperformed an alternating convex programming method for L1 factorization, establishing a new state-of-the-art technique for solving that type of problem.
Wiberg's L2 method is an application of more general work on separable nonlinear minimization that uses the idea of solving for V in terms of U and then minimizing with respect to U only. A high-level approach to minimizing a general function in this manner has been described for the case where V breaks down into small independent problems given U. That approach, however, focused on the specific case of nonlinear least squares, where the objective is linear in V. A separable method for maximum likelihood estimation for a nonlinear least squares problem linear in V has also been previously described. The Wiberg approach contrasts with methods that simply alternate between determining one set of unknowns while holding the other fixed. While these alternating methods can sometimes converge well, they can also fail catastrophically to converge, and they lack the quadratic convergence of second-order methods that minimize with respect to all of the variables simultaneously.
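For contrast, the alternating scheme described above can be sketched for L2 factorization with missing data. This is a minimal illustration under stated assumptions, not a method taken from the prior work: the function names (`masked_lstsq_columns`, `observed_cost`, `alternating_l2`), the boolean mask `W` marking observed entries, and the fixed sweep count are all hypothetical.

```python
import numpy as np

def masked_lstsq_columns(A, Y, W):
    """Solve min_x ||A[rows] @ x - Y[rows, j]||_2 for each column j,
    using only the rows of column j that the boolean mask W marks as observed."""
    X = np.zeros((A.shape[1], Y.shape[1]))
    for j in range(Y.shape[1]):
        rows = W[:, j]
        X[:, j] = np.linalg.lstsq(A[rows], Y[rows, j], rcond=None)[0]
    return X

def observed_cost(Y, W, U, V):
    """Sum of squared residuals over the observed entries only."""
    return float(np.sum((Y - U @ V)[W] ** 2))

def alternating_l2(Y, W, U0, sweeps=100):
    """Alternate between the two sets of unknowns, holding the other fixed."""
    U = U0.copy()
    for _ in range(sweeps):
        V = masked_lstsq_columns(U, Y, W)           # hold U fixed, solve for V
        U = masked_lstsq_columns(V.T, Y.T, W.T).T   # hold V fixed, solve for U
    return U, masked_lstsq_columns(U, Y, W)
```

Each half-step is an exact least-squares solve, so the observed cost does not increase from sweep to sweep, but nothing in the scheme couples the two updates, which is consistent with the slow or failed convergence noted above.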
The Wiberg approach to matrix factorization breaks a matrix Y into low-rank factors U and V by solving for V given U in closed form, linearizing V(U) about U, and iteratively minimizing ∥Y − UV(U)∥₂ with respect to U only. The Wiberg approach optimizes the same objective as a simultaneous minimization over U and V while effectively removing V from the minimization; that is, it minimizes a function with respect to one set of variables only. Recently, this approach has been extended to L1, minimizing ∥Y − UV(U)∥₁. This L1-Wiberg approach outperformed the previous state-of-the-art for L1 factorization.
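The L2 case described above can be sketched as follows. This is a minimal illustration under several assumptions rather than Wiberg's published algorithm: the linearization of V(U) is obtained by finite differences of the reduced residual instead of an analytic derivative, the update on U is a damped Gauss-Newton step (Levenberg-Marquardt style) with a simple accept/reject rule, and all names (`solve_V`, `reduced_residual`, `numeric_jacobian`, `wiberg_l2`) and the boolean mask `W` of observed entries are hypothetical.

```python
import numpy as np

def solve_V(U, Y, W):
    """Closed-form V(U): one small least-squares problem per column of Y,
    restricted to the rows that the boolean mask W marks as observed."""
    V = np.zeros((U.shape[1], Y.shape[1]))
    for j in range(Y.shape[1]):
        rows = W[:, j]
        V[:, j] = np.linalg.lstsq(U[rows], Y[rows, j], rcond=None)[0]
    return V

def reduced_residual(u_flat, Y, W, r):
    """Observed entries of Y - U V(U), a function of U only."""
    U = u_flat.reshape(Y.shape[0], r)
    return (Y - U @ solve_V(U, Y, W))[W]

def numeric_jacobian(f, x, eps=1e-6):
    """Forward-difference Jacobian (assumption: stands in for the
    analytic linearization of V(U) about U)."""
    f0 = f(x)
    J = np.empty((f0.size, x.size))
    for k in range(x.size):
        xp = x.copy()
        xp[k] += eps
        J[:, k] = (f(xp) - f0) / eps
    return J

def wiberg_l2(Y, W, r, U0, iters=50):
    """Minimize the observed squared error with respect to U only."""
    u = U0.astype(float).ravel()
    f = lambda x: reduced_residual(x, Y, W, r)
    lam = 1e-3                                   # damping weight
    for _ in range(iters):
        res = f(u)
        J = numeric_jacobian(f, u)
        step = np.linalg.solve(J.T @ J + lam * np.eye(u.size), -J.T @ res)
        if np.sum(f(u + step) ** 2) < np.sum(res ** 2):
            u, lam = u + step, max(lam * 0.1, 1e-8)   # accept, relax damping
        else:
            lam *= 10.0                               # reject, damp harder
    U = u.reshape(Y.shape[0], r)
    return U, solve_V(U, Y, W)
```

The damping term is included because the reduced problem is unchanged when U is multiplied on the right by any invertible r × r matrix, so the undamped normal equations are rank-deficient. A step is kept only if it lowers the observed cost, so the final cost cannot exceed the cost at `U0`.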