Intermediate II: stopping rules
The theory of SDDP tells us that the algorithm converges to an optimal policy almost surely in a finite number of iterations. In practice, this number is very large. Therefore, we need some way of pre-emptively terminating SDDP when the solution is “good enough.” We call heuristics for pre-emptively terminating SDDP stopping rules.
Basic limits
The training of an SDDP policy can be terminated after a fixed number of iterations using the iteration_limit keyword.
Kokako.train(model, iteration_limit = 10)
The training of an SDDP policy can be terminated after a fixed number of seconds using the time_limit keyword.
Kokako.train(model, time_limit = 2.0)
Stopping rules
In addition to the limits provided as keyword arguments, a variety of other stopping rules are available. These can be passed to Kokako.train as a vector to the stopping_rules keyword. For example:
Kokako.train(model, stopping_rules = [Kokako.BoundStalling(10, 1e-4)])
Here are the stopping rules implemented in SDDP.jl:
Kokako.IterationLimit — Type.
IterationLimit(limit::Int)
Terminate the algorithm after limit iterations.
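For example, the following call should behave the same as passing iteration_limit = 10 directly (a sketch; model is assumed to be an already-constructed policy graph):
Kokako.train(model, stopping_rules = [Kokako.IterationLimit(10)])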
Kokako.TimeLimit — Type.
TimeLimit(limit::Float64)
Terminate the algorithm after limit seconds of computation.
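Similarly, the time_limit keyword shown earlier should be equivalent to this sketch:
Kokako.train(model, stopping_rules = [Kokako.TimeLimit(2.0)])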
Kokako.Statistical — Type.
Statistical(; num_replications, iteration_period = 1, z_score = 1.96, verbose = true)
Perform an in-sample Monte Carlo simulation of the policy with num_replications replications every iteration_period iterations. Terminate if the deterministic bound (lower if minimizing) falls into the confidence interval for the mean of the simulated cost. If verbose = true, print the confidence interval.
Note that this test assumes that the simulated values are normally distributed. In infinite-horizon models, this is almost never the case; the distribution is usually closer to exponential or log-normal.
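For example, to simulate 500 in-sample replications every 100 iterations and stop once the bound falls inside the resulting 95% confidence interval, a call along these lines should work (the values 500 and 100 are illustrative, not defaults):
Kokako.train(
    model,
    stopping_rules = [Kokako.Statistical(num_replications = 500, iteration_period = 100)],
)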
Kokako.BoundStalling — Type.
BoundStalling(num_previous_iterations::Int, tolerance::Float64)
Terminate the algorithm once the deterministic bound (lower if minimizing, upper if maximizing) fails to improve by more than tolerance in absolute terms for more than num_previous_iterations consecutive iterations.
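Multiple stopping rules can be passed in the same vector; training should terminate as soon as any one of them is satisfied. For example, the following sketch stops after 60 seconds or once the bound has stalled for 10 iterations, whichever comes first:
Kokako.train(
    model,
    stopping_rules = [Kokako.TimeLimit(60.0), Kokako.BoundStalling(10, 1e-4)],
)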
In the next tutorial, Intermediate III: policy graphs, we discuss generic extensions to Kokako.LinearPolicyGraph and Kokako.MarkovianPolicyGraph.