Maximum Likelihood Estimation
Multistart Optimization
When a reasonable estimate for the initial parameter configuration is not available and optimizations starting from the approximate center of the parameter domain do not converge to a suitable optimum, a systematic search of the parameter space is needed. This can be achieved with the MultistartFit
method, which will sample the parameter space $\mathcal{M}$ either according to a given probability distribution (e.g. uniform / normal) on $\mathcal{M}$ or by drawing the parameter values from a low-discrepancy sequence such as the Sobol sequence. Compared with uniformly drawn values, values drawn from low-discrepancy sequences achieve a more even and "equidistant" coverage, thereby slightly increasing the chances of discovering a larger number of distinct local optima overall.
Just like InformationGeometry.minimize
, the MultistartFit
method is also compatible with the Optimization.jl ecosystem of optimizer methods via the meth
keyword.
R = MultistartFit(DM)
The most relevant keywords are:
MultistartDomain::HyperCube
for defining the domain from which the initial guesses are drawnN::Int
for choosing the number of startsmeth
for choosing the optimizerresampling=false
disables the drawing of new guesses when the objective function cannot be evaluated in some regions of the parameter spacemaxval::Real=1e5
if noMultistartDomain
is specified, a cube of size(-maxval, maxval)^(×n)
is constructed
Returns a MultistartResults
object, which saves some additional information such as the number of iterations taken per run, the initial guesses, etc. The parameter configuration is obtained with MLE(R)
.
The results of a multistart optimization can be visualized with a plot of the sorted final objective values, ideally showing prominent step-like features where the optimization reliably converged to the same local optima multiple times.
WaterfallPlot(R)
It can also be useful to investigate whether the parameter configurations within one "step" of the plot, where the final objective function value was the same, are actually "close" to each other, or whether there are distinct (or spread out) local optima, which nevertheless produce a fit of the same quality. This can be plotted via the ParameterPlot
method, which can display the data either in a :dotplot
, :boxplot
or :violin
plot.
using StatsPlots
ParameterPlot(R)
Seeing a spread in the ParameterPlot
for the parameter configurations of the lowest "step" is already an indication that some parameter may be non-identifiable.
InformationGeometry.MultistartFit
— FunctionMultistartFit(DM::AbstractDataModel; maxval::Real=1e5, MultistartDomain::HyperCube=FullDomain(pdim(DM), maxval), kwargs...)
MultistartFit(DS::AbstractDataSet, model::ModelOrFunction, LogPriorFn::Union{Nothing,Function}, MultistartDomain::HyperCube; N::Int=100, resampling::Bool=true, timeout::Real=120, Full=true, parallel::Bool=true, Robust::Bool=true, p::Real=2, kwargs...)
Performs Multistart optimization with N
starts and timeout of fits after timeout
seconds. If resampling=true
, if likelihood non-finite new initial starts are redrawn until N
suitable initials are found. If Robust=true
, performs optimization wrt. p-norm according to given kwarg p
. For Full=false
, only the final MLE is returned, otherwise a MultistartResults
object is returned, which can be further analyzed and plotted.
Any further keyword arguments are passed through to the optimization procedure InformationGeometry.minimize such as tolerances, optimization methods, domain constraints, etc.
InformationGeometry.LocalMultistartFit
— FunctionLocalMultistartFit(DM::AbstractDataModel, scale::Real=sqrt(InvChisqCDF(DOF(DM), ConfVol(2.0))); kwargs...)
LocalMultistartFit(DM::AbstractDataModel, mle::AbstractVector{<:Number}, scale::Real=sqrt(InvChisqCDF(DOF(DM), ConfVol(2.0))); kwargs...)
Performs a multistart search locally around a given MLE in a cube constructed from the Fisher information and multiplied with scale
.
InformationGeometry.WaterfallPlot
— FunctionWaterfallPlot(R::MultistartResults; BiLog::Bool=true, MaxValue::Real=3000, StepTol::Real=0.01, kwargs...)
Shows Waterfall plot for the given results of MultistartFit. StepTol
is used to decide which difference of two neighbouring values in the Waterfall plot constitutes a step. StepTol=0
deactivates step marks. MaxValue
is used to set threshold for ignoring points whose cost function after optimization is too large compared with best optimum. DoBiLog=false
disables logarithmic scale for cost function.
InformationGeometry.ParameterPlot
— MethodParameterPlot(R::MultistartResults; st=:dotplot, BiLog::Bool=true, Nsteps::Int=5, StepTol::Real=0.01, MaxValue=3000)
Plots the parameter values of the MultistartResults
separated by step to show whether the different optima are localized or not. st
can be either :dotplot
, :boxplot
or :violin
.
StatsPlots.jl needs to be loaded to use this plotting function.
InformationGeometry.IncrementalTimeSeriesFit
— MethodIncrementalTimeSeriesFit(DM::AbstractDataModel, initial::AbstractVector{<:Number}=MLE(DM); steps::Int=length(Data(DM))÷5, Method::Function=InformationGeometry.minimize, kwargs...) -> Vector
Fits DataModel incrementally, which can yield better results, particularly for autocorrelated time series data.
InformationGeometry.RobustFit
— FunctionRobustFit(DM::AbstractDataModel, start::AbstractVector{<:Number}; tol::Real=1e-10, p::Real=1, kwargs...)
Uses p
-Norm to judge distance on Dataspace as specified by the keyword.