Main Content

Sometimes your objective function or nonlinear constraint function values are available only by simulation or by numerical solution of an ordinary differential equation (ODE). Such optimization problems have several common characteristics and challenges, discussed in Potential Problems and Solutions.

For a problem-based example of optimizing an ODE, see Fit ODE, Problem-Based. For a solver-based example, see Fit an Ordinary Differential Equation (ODE).

To optimize a Simulink^{®} model easily, try using Simulink
Design Optimization™.

Optimization Toolbox™ solvers use derivatives of objective and constraint functions internally. By default, they estimate these derivatives using finite difference approximations of the form

$$\frac{F\left(x+\delta \right)-F\left(x\right)}{\delta}$$

or

$$\frac{F\left(x+\delta \right)-F\left(x-\delta \right)}{2\delta}.$$

These finite difference approximations can be inaccurate because:

A large value of

*δ*allows more nonlinearity to affect the finite difference.A small value of

*δ*leads to inaccuracy due to limited precision in numerics.

Specifically, for simulations and numerical solutions of ODEs:

Simulations are often insensitive to small changes in parameters. This means that if you use too small a perturbation

*δ*, the simulation can return a spurious estimated derivative of 0.Both simulations and numerical solutions of ODEs can have inaccuracies in their function evaluations. These inaccuracies can be amplified in finite difference approximations.

Numerical solution of ODEs introduces noise at values much larger than machine precision. This noise can be amplified in finite difference approximations.

If an ODE solver uses variable step sizes, then sometimes the number of ODE steps in the evaluation of

*F*(*x*+*δ*) can differ from the number of steps in the evaluation of*F*(*x*). This difference can lead to a spurious difference in the returned values, giving a misleading estimate of the derivative.

**Avoid Finite Differences by Using Direct Search. **If you have a Global Optimization Toolbox license, you can
try using `patternsearch`

(Global Optimization Toolbox) as your solver. `patternsearch`

does
not attempt to estimate gradients, so does not suffer from the limitations
in Problems in Finite Differences.

If you use `patternsearch`

for expensive
(time-consuming) function evaluations, use the `Cache`

option:

options = optimoptions('patternsearch','Cache','on');

If you cannot use `patternsearch`

, and have
a relatively low-dimensional unconstrained minimization problem, try `fminsearch`

instead. `fminsearch`

does
not use finite differences. However, `fminsearch`

is
not a fast or tunable solver.

**Set Larger Finite Differences. **You can sometimes avoid the problems in Problems in Finite Differences by taking larger finite difference steps
than the default.

If you have MATLAB

^{®}R2011b or later, set a finite difference step size option to a value larger than the default`sqrt(eps)`

or`eps^(1/3)`

, such as:For R2011b–R2012b:

`options = optimset('FinDiffRelStep',1e-3);`

For R2013a–R2015b and a solver named

`'solvername'`

:options = optimoptions('solvername','FinDiffRelStep',1e-3);

For R2016a onwards and a solver named

`'solvername'`

:options = optimoptions('solvername','FiniteDifferenceStepSize',1e-3);

If you have different scales in different components, set the finite difference step size to a vector proportional to the component scales.

If you have MATLAB R2011a or earlier, set the

`DiffMinChange`

option to a larger value than the default`1e-8`

, and possibly set the`DiffMaxChange`

option also, such as:options = optimset('DiffMinChange',1e-3,'DiffMaxChange',1);

**Note**

It is difficult to know how to set these finite difference sizes.

You can also try setting central finite differences:

options = optimoptions('solvername','FiniteDifferenceType','central');

**Use a Gradient Evaluation Function. **To avoid the problems of finite difference estimation, you can
give an approximate gradient function in your objective and nonlinear
constraints. Remember to set the `SpecifyObjectiveGradient`

option
to `true`

using `optimoptions`

,
and, if relevant, also set the `SpecifyConstraintGradient`

option
to `true`

.

For some ODEs, you can evaluate the gradient numerically at the same time as you solve the ODE. For example, suppose the differential equation for your objective function

*z*(*t*,*x*) is$$\frac{d}{dt}z(t,x)=G(z,t,x),$$

where

*x*is the vector of parameters over which you minimize. Suppose*x*is a scalar. Then the differential equation for its derivative*y*,$$y(t,x)=\frac{d}{dx}z(t,x)$$

is

$$\frac{d}{dt}y(t,x)=\frac{\partial G(z,t,x)}{\partial z}y(t,x)+\frac{\partial G(z,t,x)}{\partial x},$$

where

*z*(*t*,*x*) is the solution of the objective function ODE. You can solve for*y*(*t*,*x*) in the same system of differential equations as*z*(*t*,*x*). This solution gives you an approximated derivative without ever taking finite differences. For nonscalar*x*, solve one ODE per component.For theoretical and computational aspects of this method, see Leis and Kramer [2]. For computational experience with this and finite-difference methods, see Figure 7 of Raue et al. [3].

For some simulations, you can estimate a derivative within the simulation. For example, the likelihood ratio technique described in Reiman and Weiss [4] or the infinitesimal perturbation analysis technique analyzed in Heidelberger, Cao, Zazanis, and Suri [1] estimate derivatives in the same simulation that estimates the objective or constraint functions.

**Use Tighter ODE Tolerances. **You can use `odeset`

to
set the `AbsTol`

or `RelTol`

ODE
solver tolerances to values below their defaults. However, choosing
too small a tolerance can lead to slow solutions, convergence failure,
or other problems. Never choose a tolerance less than `1e-9`

for `RelTol`

.
The lower limit on each component of `AbsTol`

depends
on the scale of your problem, so there is no advice.

If a simulation uses random numbers, then evaluating an objective
or constraint function twice can return different results. This affects
both function estimation and finite difference estimation. The value
of a finite difference might be dominated by the variation due to
randomness, rather than the variation due to different evaluation
points *x* and *x* + *δ*.

If your simulation uses random numbers from a stream you control, reset the random stream before each evaluation of your objective or constraint functions. This practice can reduce the variability in results. For example, in an objective function:

function f = mysimulation(x) rng default % or any other resetting method ... end

For details, see Generate Random Numbers That Are Repeatable.

Frequently, a simulation evaluates both the objective function and constraints
during the same simulation run. Or, both objective and nonlinear constraint
functions use the same expensive computation. Solvers such as
`fmincon`

separately evaluate the objective function and
nonlinear constraint functions. This can lead to a great loss of efficiency,
because the solver calls the expensive computation twice. To circumvent this
problem, use the technique in Objective and Nonlinear Constraints in the Same Function, or, when
using the problem-based approach, Objective and Constraints Having a Common Function in Serial or Parallel, Problem-Based.

Your simulation or ODE can fail for some parameter values.

**Set Appropriate Bounds. **While you might not know all limitations on the parameter space,
try to set appropriate bounds on all parameters, both upper and lower.
This can speed up your optimization, and can help the solver avoid
problematic parameter values.

**Use a Solver That Respects Bounds. **As described in Iterations Can Violate Constraints, some algorithms can violate
bound constraints at intermediate iterations. For optimizing simulations
and ODEs, use algorithms that always obey bound constraints. See Algorithms That Satisfy Bound Constraints.

**Return NaN. **If your simulation or ODE solver does not successfully evaluate
an objective or nonlinear constraint function at a point *x*,
have your function return `NaN`

. Most Optimization Toolbox and Global Optimization Toolbox solvers
have the robustness to attempt a different iterative step if they
encounter a `NaN`

value. These robust solvers include:

`fmincon`

`interior-point`

,`sqp`

, and`trust-region-reflective`

algorithms`fminunc`

`lsqcurvefit`

`lsqnonlin`

`patternsearch`

Some people are tempted to return an arbitrary large objective
function value at an unsuccessful, infeasible, or other poor point.
However, this practice can confuse a solver, because the solver does
not realize that the returned value is arbitrary. When you return `NaN`

,
the solver can attempt to evaluate at a different point.

[1] Heidelberger, P., X.-R. Cao, M. A. Zazanis, and R. Suri.
*Convergence properties of infinitesimal perturbation analysis
estimates*. Management Science 34, No.
11, pp. 1281–1302, 1988.

[2] Leis, J. R. and Kramer, M.A. *The
Simultaneous Solution and Sensitivity Analysis of Systems Described
by Ordinary Differential Equations.* ACM Trans. Mathematical
Software, Vol. 14, No. 1, pp. 45–60, 1988.

[3] Raue, A. et al. *Lessons Learned from Quantitative Dynamical
Modeling in Systems Biology.* Available at `https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0074335`

,
2013.

[4] Reiman, M. I. and A. Weiss. *Sensitivity analysis via
likelihood ratios*. Proc. 18th Winter
Simulation Conference, ACM, New York, pp. 285–289,
1986.