Maximum Likelihood Estimation

Multistart Optimization

When a reasonable estimate for the initial parameter configuration is not available and optimizations starting from the approximate center of the parameter domain do not converge to a suitable optimum, a systematic search of the parameter space is needed. This can be achieved with the MultistartFit method, which will sample the parameter space $\mathcal{M}$ either according to a given probability distribution (e.g. uniform / normal) on $\mathcal{M}$ or by drawing the parameter values from a low-discrepancy sequence such as the Sobol sequence. Compared with uniformly drawn values, values drawn from low-discrepancy sequences achieve a more even and "equidistant" coverage, thereby slightly increasing the chances of discovering a larger number of distinct local optima overall.

Just like InformationGeometry.minimize, the MultistartFit method is also compatible with the Optimization.jl ecosystem of optimizer methods via the meth keyword.

R = MultistartFit(DM)

The most relevant keywords are:

  • MultistartDomain::HyperCube for defining the domain from which the initial guesses are drawn
  • N::Int for choosing the number of starts
  • meth for choosing the optimizer
  • resampling=false disables the drawing of new guesses when the objective function cannot be evaluated in some regions of the parameter space
  • maxval::Real=1e5 if no MultistartDomain is specified, a cube of size (-maxval, maxval)^(×n) is constructed

Returns a MultistartResults object, which saves some additional information such as the number of iterations taken per run, the initial guesses, etc. The parameter configuration is obtained with MLE(R).

The results of a multistart optimization can be visualized with a plot of the sorted final objective values, ideally showing prominent step-like features where the optimization reliably converged to the same local optima multiple times.

WaterfallPlot(R)

It can also be useful to investigate whether the parameter configurations within one "step" of the plot, where the final objective function value was the same, are actually "close" to each other, or whether there are distinct (or spread out) local optima, which nevertheless produce a fit of the same quality. This can be plotted via the ParameterPlot method, which can display the data either in a :dotplot, :boxplot or :violin plot.

using StatsPlots
ParameterPlot(R)

Seeing a spread in the ParameterPlot for the parameter configurations of the lowest "step" is already an indication that some parameter may be non-identifiable.

InformationGeometry.MultistartFitFunction
MultistartFit(DM::AbstractDataModel; maxval::Real=1e5, MultistartDomain::HyperCube=FullDomain(pdim(DM), maxval), kwargs...)
MultistartFit(DS::AbstractDataSet, model::ModelOrFunction, LogPriorFn::Union{Nothing,Function}, MultistartDomain::HyperCube; N::Int=100, resampling::Bool=true, timeout::Real=120, Full=true, parallel::Bool=true, Robust::Bool=true, p::Real=2, kwargs...)

Performs Multistart optimization with N starts and timeout of fits after timeout seconds. If resampling=true, if likelihood non-finite new initial starts are redrawn until N suitable initials are found. If Robust=true, performs optimization wrt. p-norm according to given kwarg p. For Full=false, only the final MLE is returned, otherwise a MultistartResults object is returned, which can be further analyzed and plotted.

Note

Any further keyword arguments are passed through to the optimization procedure InformationGeometry.minimize such as tolerances, optimization methods, domain constraints, etc.

source
InformationGeometry.LocalMultistartFitFunction
LocalMultistartFit(DM::AbstractDataModel, scale::Real=sqrt(InvChisqCDF(DOF(DM), ConfVol(2.0))); kwargs...)
LocalMultistartFit(DM::AbstractDataModel, mle::AbstractVector{<:Number}, scale::Real=sqrt(InvChisqCDF(DOF(DM), ConfVol(2.0))); kwargs...)

Performs a multistart search locally around a given MLE in a cube constructed from the Fisher information and multiplied with scale.

source
InformationGeometry.WaterfallPlotFunction
WaterfallPlot(R::MultistartResults; BiLog::Bool=true, MaxValue::Real=3000, StepTol::Real=0.01, kwargs...)

Shows Waterfall plot for the given results of MultistartFit. StepTol is used to decide which difference of two neighbouring values in the Waterfall plot constitutes a step. StepTol=0 deactivates step marks. MaxValue is used to set threshold for ignoring points whose cost function after optimization is too large compared with best optimum. DoBiLog=false disables logarithmic scale for cost function.

source
InformationGeometry.ParameterPlotMethod
ParameterPlot(R::MultistartResults; st=:dotplot, BiLog::Bool=true, Nsteps::Int=5, StepTol::Real=0.01, MaxValue=3000)

Plots the parameter values of the MultistartResults separated by step to show whether the different optima are localized or not. st can be either :dotplot, :boxplot or :violin.

Note

StatsPlots.jl needs to be loaded to use this plotting function.

source
InformationGeometry.IncrementalTimeSeriesFitMethod
IncrementalTimeSeriesFit(DM::AbstractDataModel, initial::AbstractVector{<:Number}=MLE(DM); steps::Int=length(Data(DM))÷5, Method::Function=InformationGeometry.minimize, kwargs...) -> Vector

Fits DataModel incrementally, which can yield better results, particularly for autocorrelated time series data.

source
InformationGeometry.RobustFitFunction
RobustFit(DM::AbstractDataModel, start::AbstractVector{<:Number}; tol::Real=1e-10, p::Real=1, kwargs...)

Uses p-Norm to judge distance on Dataspace as specified by the keyword.

source