Dogbox Algorithm

The idea for this simple algorithm is taken from this paper. We use a rectangular trust region, so the intersection of the trust region and the rectangular feasible region is again a rectangle. Thus at each iteration we need to solve the following constrained quadratic problem:

\displaystyle \min_p m(p) = \frac{1}{2} p^T B p + g^T p \text{, s. t. } \tilde{l} \leq p \leq \tilde{u}

This problem is interesting in its own right, and I’m sure there is a lot of theory and many methods for solving it. But we are going to use perhaps the simplest approach, called “dogleg”. It can be applied when B is positive definite. In this case we compute the Newton step (the Gauss-Newton step for least squares) p^N = -B^{-1} g and the so-called Cauchy step, which is the unconstrained minimizer of our quadratic model along the anti-gradient:

\displaystyle p^C = -\frac{g^T g}{g^T B g} g
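As a sketch in Python/NumPy, these two steps can be computed as follows (assuming a least-squares setting with B = J^T J and g = J^T f; the function and variable names here are my own, not scipy's):

```python
import numpy as np

def newton_and_cauchy_steps(J, f):
    # Quadratic model: m(p) = 0.5 * p^T B p + g^T p,
    # with B = J^T J and g = J^T f for least squares.
    g = J.T.dot(f)
    # Gauss-Newton step: solve J p = -f in the least-squares sense,
    # equivalent to p^N = -B^{-1} g when J has full column rank.
    p_newton, *_ = np.linalg.lstsq(J, -f, rcond=None)
    # Cauchy step: unconstrained minimizer of m along the anti-gradient.
    Bg = J.T.dot(J.dot(g))  # compute B g without forming B explicitly
    p_cauchy = -g.dot(g) / g.dot(Bg) * g
    return p_newton, p_cauchy
```

Solving J p = -f via `lstsq` avoids forming B = J^T J, which squares the condition number.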

We then define the dogleg path p(\tau) as follows:

p(\tau) = \begin{cases}    \tau p^C & 0 \leq \tau \leq 1 \\    p^C + (\tau - 1) (p^N - p^C) & 1 < \tau \leq 2    \end{cases}

First we move towards the Cauchy point and then from the Cauchy point to the Newton point. It can be proven that m(p(\tau)) is a decreasing function of \tau, which means we should follow the dogleg path as long as we stay within the trust region. For a spherical trust region there is at most one intersection of the dogleg path and the boundary, which makes the method especially simple and clear. This is not true for a rectangular trust region, but it is still not hard to find the optimum along the dogleg path within the trust region, or we can simply stop at the first intersection; a slightly improved strategy is suggested in the paper I linked above.
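Here is a minimal sketch of following the dogleg path inside a rectangular trust region, using the simple "stop at the first intersection" strategy (the improved strategy from the paper is not implemented, and all names are my own):

```python
import numpy as np

def step_to_bound(p0, d, l, u):
    # Largest t in [0, 1] such that p0 + t * d stays in the box [l, u].
    # Assumes p0 lies inside the box.
    with np.errstate(divide='ignore', invalid='ignore'):
        t = np.where(d > 0, (u - p0) / d,
            np.where(d < 0, (l - p0) / d, np.inf))
    return min(1.0, t.min())

def dogleg_step(p_cauchy, p_newton, l, u):
    # Follow the dogleg path from p = 0 and stop at the first
    # intersection with the rectangular trust region [l, u].
    t1 = step_to_bound(np.zeros_like(p_cauchy), p_cauchy, l, u)
    if t1 < 1.0:
        return t1 * p_cauchy  # the path leaves the box on the first leg
    d = p_newton - p_cauchy
    t2 = step_to_bound(p_cauchy, d, l, u)
    return p_cauchy + t2 * d
```

If the Newton point lies inside the box, both legs are traversed fully and the full Gauss-Newton step is returned.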

If during iterations some variable hits the initial boundary (from l and u) and the corresponding component of the anti-gradient points outside the feasible region, then a dogleg step will make no progress along that variable. Such variables satisfy the first-order optimality conditions, and we should exclude them before taking the next dogleg step.
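In code this check might look like the following (a sketch with exact comparisons; a real implementation would use a tolerance, and the names are mine):

```python
import numpy as np

def first_order_active(x, g, l, u):
    # A variable is "active" when it sits on a bound and the
    # anti-gradient -g points outside the feasible region.
    on_lower = (x == l) & (g > 0)  # -g < 0 would push x below l
    on_upper = (x == u) & (g < 0)  # -g > 0 would push x above u
    return on_lower | on_upper
```

The dogleg step is then taken only over the variables where this mask is False.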

When B is only positive semidefinite we don’t have a proper Newton step; in this case we should compute a regularized Newton step by increasing the diagonal elements of B by a suitable amount. As far as I know there is no universal and fully satisfactory recipe for doing this (but I mentioned a paper in the previous post with one solution).
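One simple (and admittedly crude) way to sketch the idea: shift the whole diagonal so that the smallest eigenvalue of B becomes positive. This is only an illustration, not the recipe from that paper:

```python
import numpy as np

def regularized_newton_step(B, g, delta=1e-8):
    # Shift the diagonal of B so its smallest eigenvalue is at least
    # delta, then solve (B + lam * I) p = -g.
    eigmin = np.linalg.eigvalsh(B)[0]  # eigenvalues in ascending order
    lam = max(0.0, delta - eigmin)
    return np.linalg.solve(B + lam * np.eye(B.shape[0]), -g)
```

Computing the full spectrum is too expensive for large problems; practical schemes instead probe positive definiteness with a Cholesky factorization and increase the shift until it succeeds.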

And that’s basically the whole algorithm. It may seem a little hackish, but in my experience it works adequately on unconstrained least-squares problems when J is not rank deficient. (I haven’t implemented any tweaks for the rank-deficient case so far.) In bounded problems its performance varies, but the same is true for Trust Region Reflective.

There is a notion of “combinatorial difficulty of inequality-constrained problems”, which burdens methods that try to determine which constraints are active at the optimal solution (active set methods). The dogbox algorithm does something like that, but to my shame I have no idea how it will behave in this regard as the number of variables becomes large. On the other hand, Trust Region Reflective is expected to work very well in a high-dimensional setting; the authors mentioned this as its strongest point.

So far I have been focusing on the general logic and workflow of each algorithm and have tested them on small problems; I will publish the results in the next post.
