Class BFGS
The BFGS method belongs to the family of quasi-Newton methods, a class of hill-climbing optimization techniques that seek a stationary point of a (preferably twice continuously differentiable) function. For such problems, a necessary condition for optimality is that the gradient be zero. Newton's method and the BFGS method are not guaranteed to converge unless the function has a quadratic Taylor expansion near an optimum. However, BFGS has proven to perform well even on non-smooth optimization problems.
In quasi-Newton methods, the Hessian matrix of second derivatives is not computed. Instead, the Hessian matrix is approximated using updates specified by gradient evaluations (or approximate gradient evaluations). Quasi-Newton methods are generalizations of the secant method to find the root of the first derivative for multidimensional problems. In multidimensional problems, the secant equation does not specify a unique solution, and quasi-Newton methods differ in how they constrain the solution. The BFGS method is one of the most popular members of this class.
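As a point of reference (standard textbook material, not drawn from this class's source), the secant condition and the rank-two BFGS update that resolves its non-uniqueness can be written as:

```latex
% Secant condition: the updated Hessian approximation B_{k+1} must map
% the step s_k to the observed gradient change y_k.
B_{k+1} s_k = y_k, \qquad
s_k = x_{k+1} - x_k, \qquad
y_k = \nabla f(x_{k+1}) - \nabla f(x_k)

% BFGS picks a particular solution via a rank-two update; in terms of
% the inverse approximation H_k = B_k^{-1}:
H_{k+1} = \bigl(I - \rho_k s_k y_k^{\top}\bigr) H_k \bigl(I - \rho_k y_k s_k^{\top}\bigr)
        + \rho_k s_k s_k^{\top},
\qquad \rho_k = \frac{1}{y_k^{\top} s_k}
```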
Like the original BFGS, the limited-memory BFGS (L-BFGS) method uses an approximation of the inverse Hessian matrix to steer its search through variable space. But where BFGS stores a dense n × n approximation to the inverse Hessian (n being the number of variables in the problem), L-BFGS stores only a few vectors that represent the approximation implicitly. Due to its resulting linear memory requirement, the L-BFGS method is particularly well suited for optimization problems with a large number of variables (e.g., > 1000). Instead of the inverse Hessian H_k, L-BFGS maintains a history of the past m updates of the position x and gradient ∇f(x), where generally the history size m can be small (often m < 10). These updates are used to implicitly perform operations requiring the H_k-vector product.
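These implicit H_k-vector products are typically carried out with the two-loop recursion from Liu and Nocedal's paper. The following is a minimal illustrative sketch of that recursion, assuming the last m correction pairs s_i = x_{i+1} - x_i and y_i = ∇f(x_{i+1}) - ∇f(x_i), together with the scalars rho_i = 1 / (y_i' s_i), are stored; it is not the exact implementation of this class.

```java
/**
 * Sketch of the L-BFGS two-loop recursion (Liu & Nocedal, 1989).
 * Computes the search direction d = -H_k * g implicitly from the
 * last m stored correction pairs (assumes m >= 1).
 */
static double[] twoLoop(double[] g, double[][] s, double[][] y, double[] rho) {
    int m = s.length;              // number of stored correction pairs
    int n = g.length;
    double[] q = g.clone();
    double[] alpha = new double[m];

    // First loop: newest to oldest pair.
    for (int i = m - 1; i >= 0; i--) {
        alpha[i] = rho[i] * dot(s[i], q);
        for (int j = 0; j < n; j++) q[j] -= alpha[i] * y[i][j];
    }

    // Initial approximation H_k0 = gamma * I, with the usual scaling
    // gamma = (s'y) / (y'y) computed from the newest pair.
    double gamma = dot(s[m - 1], y[m - 1]) / dot(y[m - 1], y[m - 1]);
    for (int j = 0; j < n; j++) q[j] *= gamma;

    // Second loop: oldest to newest pair.
    for (int i = 0; i < m; i++) {
        double beta = rho[i] * dot(y[i], q);
        for (int j = 0; j < n; j++) q[j] += (alpha[i] - beta) * s[i][j];
    }

    // q now approximates H_k * g; the search direction is its negation.
    for (int j = 0; j < n; j++) q[j] = -q[j];
    return q;
}

static double dot(double[] a, double[] b) {
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) sum += a[i] * b[i];
    return sum;
}
```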
References
- Roger Fletcher. Practical Methods of Optimization. 2nd edition, Wiley, 1987.
- D. C. Liu and J. Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming B 45 (1989) 503-528.
- R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing 16 (1995) 1190-1208.
Constructor Summary
- BFGS()

Method Summary
- static double minimize(DifferentiableMultivariateFunction func, double[] x, double gtol, int maxIter)
  Solves the unconstrained minimization problem using the BFGS method.
- static double minimize(DifferentiableMultivariateFunction func, int m, double[] x, double[] l, double[] u, double gtol, int maxIter)
  Solves the bound constrained minimization problem using the L-BFGS-B method.
- static double minimize(DifferentiableMultivariateFunction func, int m, double[] x, double gtol, int maxIter)
  Solves the unconstrained minimization problem using the L-BFGS method.
Constructor Details

BFGS
public BFGS()
Method Details
minimize
public static double minimize(DifferentiableMultivariateFunction func, double[] x, double gtol, int maxIter)

This method solves the unconstrained minimization problem

    min f(x),    x = (x_1, x_2, ..., x_n),

using the BFGS method.

Parameters:
- func - the function to be minimized.
- x - on initial entry, this must be set by the user to the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution).
- gtol - the convergence tolerance on zeroing the gradient.
- maxIter - the maximum number of iterations.

Returns:
- the minimum value of the function.
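Below is a hypothetical usage sketch minimizing the two-dimensional Rosenbrock function. It assumes DifferentiableMultivariateFunction exposes methods of the form f(x) and g(x, gradient), where g fills the gradient array and returns the function value; verify against the actual interface declaration before use.

```java
// Hypothetical sketch: minimize the 2-D Rosenbrock function
// f(x) = 100 (x_2 - x_1^2)^2 + (1 - x_1)^2, whose minimum is at (1, 1).
// The f/g method shapes below are assumptions about the interface.
DifferentiableMultivariateFunction rosenbrock = new DifferentiableMultivariateFunction() {
    @Override
    public double f(double[] x) {
        double a = x[1] - x[0] * x[0];
        double b = 1.0 - x[0];
        return 100.0 * a * a + b * b;
    }

    @Override
    public double g(double[] x, double[] gradient) {
        double a = x[1] - x[0] * x[0];
        gradient[0] = -400.0 * x[0] * a - 2.0 * (1.0 - x[0]);  // df/dx_1
        gradient[1] = 200.0 * a;                               // df/dx_2
        return f(x);
    }
};

double[] x = {-1.2, 1.0};                       // initial estimate
double fmin = BFGS.minimize(rosenbrock, x, 1e-5, 500);
// On return, x holds the best point found (expected near {1.0, 1.0}).
```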
minimize
public static double minimize(DifferentiableMultivariateFunction func, int m, double[] x, double gtol, int maxIter)

This method solves the unconstrained minimization problem

    min f(x),    x = (x_1, x_2, ..., x_n),

using the limited-memory BFGS method. The method is especially effective on problems involving a large number of variables. In a typical iteration of this method, an approximation Hk to the inverse of the Hessian is obtained by applying m BFGS updates to a diagonal matrix Hk0, using information from the previous m steps. The user specifies the number m, which determines the amount of storage required by the routine.

Parameters:
- func - the function to be minimized.
- m - the number of corrections used in the L-BFGS update. Values of m less than 3 are not recommended; large values of m will result in excessive computing time. 3 <= m <= 7 is recommended. A common choice is m = 5.
- x - on initial entry, this must be set by the user to the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution).
- gtol - the convergence tolerance on zeroing the gradient.
- maxIter - the maximum number of iterations.

Returns:
- the minimum value of the function.
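Reusing the hypothetical rosenbrock object from the previous sketch, the limited-memory variant only adds the history-size argument m:

```java
// L-BFGS call with m = 5 correction pairs, the common default choice.
double[] x = {-1.2, 1.0};
double fmin = BFGS.minimize(rosenbrock, 5, x, 1e-5, 500);
```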
minimize
public static double minimize(DifferentiableMultivariateFunction func, int m, double[] x, double[] l, double[] u, double gtol, int maxIter)

This method solves the bound constrained minimization problem using the L-BFGS-B method. The L-BFGS-B algorithm extends L-BFGS to handle simple box constraints on variables, that is, constraints of the form l_i <= x_i <= u_i, where l_i and u_i are per-variable constant lower and upper bounds, respectively (for each x_i, either or both bounds may be omitted). The method works by identifying fixed and free variables at every step (using a simple gradient method), applying the L-BFGS method to the free variables only to obtain higher accuracy, and then repeating the process.

Parameters:
- func - the function to be minimized.
- m - the number of corrections used in the L-BFGS update. Values of m less than 3 are not recommended; large values of m will result in excessive computing time. 3 <= m <= 7 is recommended. A common choice is m = 5.
- x - on initial entry, this must be set by the user to the initial estimate of the solution vector. On exit, it contains the values of the variables at the best point found (usually a solution).
- l - the lower bounds.
- u - the upper bounds.
- gtol - the convergence tolerance on zeroing the gradient.
- maxIter - the maximum number of iterations.

Returns:
- the minimum value of the function.
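And a hypothetical bound-constrained sketch, again reusing the rosenbrock object from the earlier example and constraining both variables to the box [0, 2] × [0, 2]:

```java
// L-BFGS-B call: l and u hold one lower/upper bound per variable.
double[] x = {0.5, 0.5};     // initial estimate inside the box
double[] l = {0.0, 0.0};     // lower bounds l_i <= x_i
double[] u = {2.0, 2.0};     // upper bounds x_i <= u_i
double fmin = BFGS.minimize(rosenbrock, 5, x, l, u, 1e-5, 500);
// The returned point respects l[i] <= x[i] <= u[i] for every i.
```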