## Introduction

Sequential quadratic programming (SQP) is a gradient-based optimization routine based upon the method of Lagrange Multiplers. Lagrange multipliers are used to incorporate (nonlinear and linear) equality and inequality constraints. However, the goal of this writeup is not to discuss what SQP is, but rather how to use it in Octave.

## Nonlinear Minimization in Octave

SQP is implemented as *sqp* in Octave. Entering *help sqp* at the terminal yields:

— Function File: [X, OBJ, INFO, ITER, NF, LAMBDA] = sqp (X, PHI, G, H)

Solve the nonlinear programmin phi (x)

xsubject to

g(x) = 0

h(x) >= 0using a successive quadratic programming method.

The first argument is the initial guess for the vector X.

The second argument is a function handle pointing to the objective

function. The objective function must be of the formy = phi (x)

in which X is a vector and Y is a scalar.

The second argument may also be a 2- or 3-element cell array of

function handles. The first element should point to the objective

function, the second should point to a function that computes the

gradient of the objective function, and the third should point to a

function to compute the hessian of the objective function. If the

gradient function is not supplied, the gradient is computed by

finite differences. If the hessian function is not supplied, a

BFGS update formula is used to approximate the hessian.If supplied, the gradient function must be of the form

g = gradient (x)

in which X is a vector and G is a vector.

If supplied, the hessian function must be of the form

h = hessian (x)

in which X is a vector and H is a matrix.

The third and fourth arguments are function handles pointing to

functions that compute the equality constraints and the inequality

constraints, respectively.If your problem does not have equality (or inequality) constraints,

you may pass an empty matrix for CEF (or CIF).If supplied, the equality and inequality constraint functions must

be of the formr = f (x)

in which X is a vector and R is a vector.

The third and fourth arguments may also be 2-element cell arrays of

function handles. The first element should point to the constraint

function and the second should point to a function that computes

the gradient of the constraint function:[ d f(x) d f(x) d f(x) ]

transpose ( [ —— —– … —— ] )

[ dx_1 dx_2 dx_N ]Here is an example of calling `sqp’:

function r = g (x)

r = [ sumsq(x)-10;

x(2)*x(3)-5*x(4)*x(5);

x(1)^3+x(2)^3+1 ];

endfunctionfunction obj = phi (x)

obj = exp(prod(x)) – 0.5*(x(1)^3+x(2)^3+1)^2;

endfunctionx0 = [-1.8; 1.7; 1.9; -0.8; -0.8];

[x, obj, info, iter, nf, lambda] = sqp (x0, @phi, @g, [])

x =

-1.71714

1.59571

1.82725

-0.76364

-0.76364obj = 0.053950

info = 101

iter = 8

nf = 10

lambda =-0.0401627

0.0379578

-0.0052227The value returned in INFO may be one of the following:

101

The algorithm terminated because the norm of the last step

was less than `tol * norm (x))’ (the value of tol is

currently fixed at `sqrt (eps)’–edit `sqp.m’ to modify this

value.102

The BFGS update failed.103

The maximum number of iterations was reached (the maximum

number of allowed iterations is currently fixed at 100–edit

`sqp.m’ to increase this value).See also: qp.

The implementation is fairly detailed. Let’s use *sqp* in an example. Consider

f(x,y)=x^2+y^2.

We know that the minimum of the paraboloid is at (x,y)=(0,0). In fact, f(x,y) is plotted as

*sqp* requires an initial estimate to the minimizer x_0, a function somewhere (obj_fcn1.m, we call it) that takes as input value x in R^2 and maps to the non-negative reals. Further, SQP requires some sort of constraint. We only need know (which I’m using loosely) that x_f, the final solution, need be positive. The fourth argument *sqp* takes is a function (we call it bndfcn.m) that describes the bounds in terms of an output vector with each component positive. For us, literally, the bound function is

`function out=bndfcn(x)`

out=x(:);

since we want each value of the 2×1 vector x to be positive. We leave the third entry for *sqp* blank since there are no equality constraints. Just to clarify, out objective function, obj_fcn1.m is given as

`function out=obj_fcn1(x)`

x=x(:);

out=x'*x;

Our call to *sqp* is

`[x_f,j_f]=sqp(x_0,@(x)obj_fcn1(x),[],@(x)bndfcn(x));`

where our initial iterate to the solution is x_0=[10;10]. Running this program, yields the final results x_f=[0;0]; which is exactly the minimizer to f(x,y)=x^2+y^2.

Oh I think this might be just what I am looking for — thanks for the post! Long live GNU!

Great! Glad to be useful!

Can you please explain the meaning of

” If supplied, the equality and inequality constraint functions must be of the form

r = f (x)

in which X is a vector and R is a vector.”

if vector what is the size of the vector?

Does it mean that CIF & CEF represent several simultaneous constrains?

Thanks in advance,

Sorry for the long delay in response.

First, this is the help in Octave when you type “help sqp”. Consequently, I may not have really great responses. Second, you don’t have to supply those. But, if you do, they must follow that certain form.

The vector X is the current iterate of the vector of parameters you are estimating/minimizing. R is related to that vector by some f (f:x->R). Therefore, R may not necessarily be the same size as your X. However, f represents a mapping of your parameters to your equality constraint space. So the dimension of the range of f is where I’d look.

CEF (‘g’ in the call) means “constraint equality function” and CIF (‘h’ in the call) means “constraint inequality functions”. While I don’t know your exact problem, those two relations could easily describe multiple simultaneous constraints. I know that I’ve used both inequality and equality constraints to describe hundreds of relationships.

Hope this helps!