Sequential quadratic programming (SQP) is a gradient-based optimization routine based upon the method of Lagrange Multiplers. Lagrange multipliers are used to incorporate (nonlinear and linear) equality and inequality constraints. However, the goal of this writeup is not to discuss what SQP is, but rather how to use it in Octave.
Nonlinear Minimization in Octave
SQP is implemented as sqp in Octave. Entering help sqp at the terminal yields:
— Function File: [X, OBJ, INFO, ITER, NF, LAMBDA] = sqp (X, PHI, G, H)
Solve the nonlinear program
min phi (x)
g(x) = 0
h(x) >= 0
using a successive quadratic programming method.
The first argument is the initial guess for the vector X.
The second argument is a function handle pointing to the objective
function. The objective function must be of the form
y = phi (x)
in which X is a vector and Y is a scalar.
The second argument may also be a 2- or 3-element cell array of
function handles. The first element should point to the objective
function, the second should point to a function that computes the
gradient of the objective function, and the third should point to a
function to compute the hessian of the objective function. If the
gradient function is not supplied, the gradient is computed by
finite differences. If the hessian function is not supplied, a
BFGS update formula is used to approximate the hessian.
If supplied, the gradient function must be of the form
g = gradient (x)
in which X is a vector and G is a vector.
If supplied, the hessian function must be of the form
h = hessian (x)
in which X is a vector and H is a matrix.
The third and fourth arguments are function handles pointing to
functions that compute the equality constraints and the inequality
If your problem does not have equality (or inequality) constraints,
you may pass an empty matrix for CEF (or CIF).
If supplied, the equality and inequality constraint functions must
be of the form
r = f (x)
in which X is a vector and R is a vector.
The third and fourth arguments may also be 2-element cell arrays of
function handles. The first element should point to the constraint
function and the second should point to a function that computes
the gradient of the constraint function:
[ d f(x) d f(x) d f(x) ]
transpose ( [ —— —– … —— ] )
[ dx_1 dx_2 dx_N ]
Here is an example of calling `sqp’:
function r = g (x)
r = [ sumsq(x)-10;
function obj = phi (x)
obj = exp(prod(x)) – 0.5*(x(1)^3+x(2)^3+1)^2;
x0 = [-1.8; 1.7; 1.9; -0.8; -0.8];
[x, obj, info, iter, nf, lambda] = sqp (x0, @phi, @g, )
obj = 0.053950
info = 101
iter = 8
nf = 10
The value returned in INFO may be one of the following:
The algorithm terminated because the norm of the last step
was less than `tol * norm (x))’ (the value of tol is
currently fixed at `sqrt (eps)’–edit `sqp.m’ to modify this
The BFGS update failed.
The maximum number of iterations was reached (the maximum
number of allowed iterations is currently fixed at 100–edit
`sqp.m’ to increase this value).
See also: qp.
The implementation is fairly detailed. Let’s use sqp in an example. Consider
We know that the minimum of the paraboloid is at (x,y)=(0,0). In fact, f(x,y) is plotted as
sqp requires an initial estimate to the minimizer x_0, a function somewhere (obj_fcn1.m, we call it) that takes as input value x in R^2 and maps to the non-negative reals. Further, SQP requires some sort of constraint. We only need know (which I’m using loosely) that x_f, the final solution, need be positive. The fourth argument sqp takes is a function (we call it bndfcn.m) that describes the bounds in terms of an output vector with each component positive. For us, literally, the bound function is
since we want each value of the 2×1 vector x to be positive. We leave the third entry for sqp blank since there are no equality constraints. Just to clarify, out objective function, obj_fcn1.m is given as
Our call to sqp is
where our initial iterate to the solution is x_0=[10;10]. Running this program, yields the final results x_f=[0;0]; which is exactly the minimizer to f(x,y)=x^2+y^2.