OPTIMIZATION
Click the Run button below when the applet has completed loading.
A project explaining the optimization techniques
is also available.
This applet illustrates the calculation of maximum likelihood estimates
for the parameters of a Normal(mu,sigma2) distribution. The applet
generates a sample of 100 random observations from a Normal
distribution. It then draws a contour plot of the log-likelihood
surface. In addition, profile likelihoods are drawn for each of the
two parameters. The variance is parameterized as log(sigma2), which
is unconstrained, to avoid boundary problems at sigma2 = 0.
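As a rough illustration of the setup described above, the following Python
sketch generates 100 Normal observations and evaluates the log-likelihood
in the (mu, log(sigma2)) parameterization. It is not the applet's code. The
function name log_likelihood, the seed, and the true values mu = 0 and
sigma = 1 are assumptions made only for this example.

```python
import math
import random


def log_likelihood(mu, log_sigma2, data):
    """Normal log-likelihood with the variance parameterized as log(sigma2)."""
    n = len(data)
    sigma2 = math.exp(log_sigma2)  # always positive, so no boundary at zero
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * n * log_sigma2 - ss / (2 * sigma2)


# Assumed for illustration: seed, true mean 0, true standard deviation 1.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100)]

# Closed-form maximum likelihood estimates, used as a reference point.
mu_hat = sum(data) / len(data)
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / len(data)
log_sigma2_hat = math.log(sigma2_hat)

print("mu_hat =", mu_hat)
print("log(sigma2)_hat =", log_sigma2_hat)
print("max log-likelihood =", log_likelihood(mu_hat, log_sigma2_hat, data))
```

Evaluating log_likelihood over a grid of (mu, log(sigma2)) pairs would give
the values needed for a contour plot like the one the applet draws.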
The profile likelihood for mu is calculated by holding the
log(sigma2) parameter fixed at its maximum likelihood estimate while
varying mu. Likewise, the profile likelihood for log(sigma2) is
calculated by holding the mu parameter fixed at its maximum likelihood
estimate while varying the log(sigma2) parameter.
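The two curves described above can be sketched in Python as follows. This
is only an illustration under assumptions: the grid spacing of 0.05, the
grid width, the seed, and all function names are hypothetical and are not
taken from the applet.

```python
import math
import random


def log_likelihood(mu, log_sigma2, data):
    """Normal log-likelihood with the variance parameterized as log(sigma2)."""
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return (-0.5 * n * math.log(2 * math.pi) - 0.5 * n * log_sigma2
            - 0.5 * ss * math.exp(-log_sigma2))


def profile_mu(mu_grid, log_sigma2_hat, data):
    """Vary mu while log(sigma2) stays fixed at its estimate."""
    return [log_likelihood(m, log_sigma2_hat, data) for m in mu_grid]


def profile_log_sigma2(theta_grid, mu_hat, data):
    """Vary log(sigma2) while mu stays fixed at its estimate."""
    return [log_likelihood(mu_hat, t, data) for t in theta_grid]


# Assumed for illustration: seed and true parameter values.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100)]
mu_hat = sum(data) / len(data)
log_sigma2_hat = math.log(sum((x - mu_hat) ** 2 for x in data) / len(data))

# Grids centred on the estimates (spacing 0.05 is an arbitrary choice).
mu_grid = [mu_hat + 0.05 * k for k in range(-10, 11)]
theta_grid = [log_sigma2_hat + 0.05 * k for k in range(-10, 11)]

prof_mu = profile_mu(mu_grid, log_sigma2_hat, data)
prof_theta = profile_log_sigma2(theta_grid, mu_hat, data)
```

Because each grid is centred on the corresponding estimate, both curves
peak at the middle grid point.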
Click on the log-likelihood surface to indicate a starting position
for the estimation. Several techniques can then be used to update
the estimated parameter vector. Some techniques will almost always
jump to the final solution in a single step, while others will take
several steps to converge to the solution (marked by blue horizontal
and vertical lines).
Estimation can be restarted with different starting values at any
time by clicking on the contour plot.
When using Newton-Raphson or Fisher scoring, you can also try to
optimize the stepsize. With these techniques, optimizing the stepsize
corrects bad steps in which the algorithm steps too far or not far
enough toward the solution. Whether or not the stepsize is optimized,
both methods automatically apply Marquardt's modification: the
diagonal elements of the negative Hessian are increased if the matrix
is singular.
The stepsize option has no effect on the other techniques, because
those techniques always optimize the stepsize.
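The update described above might be sketched in Python as follows. This is
not the applet's implementation. The function names, the singularity
tolerance of 1e-12, the starting diagonal increment, and the crude grid of
candidate stepsize factors are all assumptions chosen for illustration.

```python
import math
import random


def log_likelihood(beta, data):
    mu, theta = beta  # theta = log(sigma2)
    n = len(data)
    ss = sum((x - mu) ** 2 for x in data)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * n * theta - 0.5 * ss * math.exp(-theta)


def gradient(beta, data):
    mu, theta = beta
    w = math.exp(-theta)  # 1 / sigma2
    s1 = sum(x - mu for x in data)
    ss = sum((x - mu) ** 2 for x in data)
    return [w * s1, -0.5 * len(data) + 0.5 * w * ss]


def negative_hessian(beta, data, fisher=False):
    mu, theta = beta
    n = len(data)
    w = math.exp(-theta)
    if fisher:  # expected information (Fisher scoring)
        return [[n * w, 0.0], [0.0, 0.5 * n]]
    s1 = sum(x - mu for x in data)
    ss = sum((x - mu) ** 2 for x in data)
    return [[n * w, w * s1], [w * s1, 0.5 * w * ss]]  # observed (Newton-Raphson)


def update(beta, data, fisher=False, optimize_step=False):
    """One parameter update; returns the new vector and the stepsize factor."""
    g = gradient(beta, data)
    a = negative_hessian(beta, data, fisher)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    lam = 1e-3  # assumed starting increment for Marquardt's modification
    while abs(det) < 1e-12:  # assumed tolerance for "singular"
        a[0][0] += lam
        a[1][1] += lam
        lam *= 10.0
        det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    # Solve a * d = g for the 2x2 system.
    d = [(a[1][1] * g[0] - a[0][1] * g[1]) / det,
         (a[0][0] * g[1] - a[1][0] * g[0]) / det]
    # Assumed stepsize search: pick the best factor from a fixed grid.
    factors = [2.0 ** k for k in range(-8, 3)] if optimize_step else [1.0]
    best = max(factors, key=lambda s: log_likelihood(
        [beta[0] + s * d[0], beta[1] + s * d[1]], data))
    return [beta[0] + best * d[0], beta[1] + best * d[1]], best


# Assumed for illustration: seed, true values, and starting points.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100)]
mu_hat = sum(data) / len(data)
log_sigma2_hat = math.log(sum((x - mu_hat) ** 2 for x in data) / len(data))

beta_nr = [mu_hat + 0.1, log_sigma2_hat + 0.1]  # Newton-Raphson, fixed stepsize
beta_fs = [2.0, 2.0]  # Fisher scoring with optimized stepsize
for _ in range(100):
    beta_nr, _factor = update(beta_nr, data)
    beta_fs, _factor = update(beta_fs, data, fisher=True, optimize_step=True)
print("Newton-Raphson:", beta_nr)
print("Fisher scoring:", beta_fs)
```

In this sketch both runs should end near the closed-form estimates mu_hat
and log_sigma2_hat. The second value returned by update is the
multiplicative stepsize factor.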
You will likely see no difference among the other 4 methods. For this
particular problem, they should all jump to the final estimate in
one step.
Several statistics are shown for the optimization process. The main
difference among the techniques is how the negative H matrix is
calculated. Note that this matrix is included in the output. If the
stepsize is being optimized, the multiplicative stepsize factor is
also included in the output.
The mu parameter is labeled beta_0 and the log(sigma2) parameter is
labeled beta_1 in the output.