OPTIMIZATION


Click the Run button below when the applet has completed loading.

A project explaining the optimization techniques is also available.



This applet illustrates the calculation of maximum likelihood estimates for the parameters of a Normal(mu, sigma2) distribution. The applet generates 100 random observations from a Normal distribution and draws a contour plot of the log-likelihood surface. In addition, profile likelihoods are drawn for each of the two parameters. The variance is parameterized as log(sigma2), which can take any real value, to avoid possible boundary problems at sigma2 = 0.
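As a rough illustration of the setup described above, the following Python sketch simulates 100 observations, evaluates the log-likelihood in the (mu, log(sigma2)) parameterization, and builds the grid of values a contour plot would use. It is not the applet's code: the seed, the true values mu = 5 and sigma = 2, the grid ranges, and all function and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)                  # seed chosen only for reproducibility
x = rng.normal(loc=5.0, scale=2.0, size=100)    # hypothetical true mu = 5, sigma = 2

def loglik(mu, log_s2, x):
    """Normal log-likelihood with the variance parameterized as log(sigma2)."""
    n = x.size
    ss = np.sum((x - mu) ** 2)
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * n * log_s2
            - 0.5 * np.exp(-log_s2) * ss)

# Closed-form maximum likelihood estimates for this model
mu_hat = x.mean()
log_s2_hat = np.log(np.mean((x - mu_hat) ** 2))

# Grid of log-likelihood values, the raw material for a contour plot
mu_grid = np.linspace(mu_hat - 1.5, mu_hat + 1.5, 61)
ls_grid = np.linspace(log_s2_hat - 1.0, log_s2_hat + 1.0, 61)
surface = np.array([[loglik(m, s, x) for m in mu_grid] for s in ls_grid])
```

Because log(sigma2) can be any real number, every point on the grid is a valid parameter value, which is the practical benefit of the log transformation.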

The profile likelihood for mu is calculated by holding the log(sigma2) parameter fixed at its maximum likelihood value while varying mu. Likewise, the profile likelihood for log(sigma2) is calculated by holding the mu parameter fixed at its maximum likelihood value while varying log(sigma2).
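The two curves described above can be sketched as one-dimensional slices through the log-likelihood surface. This example follows the description literally, fixing one parameter at its maximum likelihood value and varying the other. The data, seed, grid widths, and names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(loc=0.0, scale=1.0, size=100)    # hypothetical data

def loglik(mu, log_s2, x):
    n = x.size
    return (-0.5 * n * np.log(2 * np.pi) - 0.5 * n * log_s2
            - 0.5 * np.exp(-log_s2) * np.sum((x - mu) ** 2))

mu_hat = x.mean()
log_s2_hat = np.log(np.mean((x - mu_hat) ** 2))

mu_grid = np.linspace(mu_hat - 1.0, mu_hat + 1.0, 201)
ls_grid = np.linspace(log_s2_hat - 1.0, log_s2_hat + 1.0, 201)

# Curve for mu: log(sigma2) held at its maximum likelihood value
curve_mu = np.array([loglik(m, log_s2_hat, x) for m in mu_grid])
# Curve for log(sigma2): mu held at its maximum likelihood value
curve_ls = np.array([loglik(mu_hat, s, x) for s in ls_grid])

# Each curve should peak at (or within one grid step of) the estimate
best_mu = mu_grid[np.argmax(curve_mu)]
best_ls = ls_grid[np.argmax(curve_ls)]
```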

Click on the log-likelihood surface to choose a starting position for the estimation. Several techniques can then be used to update the estimated parameter vector. Some techniques will almost always jump to the final solution in a single step, while others will take several steps to converge. The solution is marked by blue horizontal and vertical lines. Estimation can be restarted with different starting values at any time by clicking on the contour plot.

When using Newton-Raphson or Fisher scoring, you can also choose to optimize the stepsize. With these two techniques, optimizing the stepsize overcomes bad steps in which the algorithm attempts to step too far or not far enough toward the solution. Whether or not the stepsize is optimized, both methods automatically employ Marquardt's modification, in which the diagonal elements of the negative Hessian are increased if the matrix is singular. The stepsize option has no effect on the other techniques, because they always optimize the stepsize.
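The following is a minimal sketch of a Newton-Raphson update with both features mentioned above. It makes several assumptions that may differ from the applet: the diagonal is inflated whenever the negative Hessian fails a positive-definiteness check (a slightly broader condition than exact singularity), the inflation starts at 1e-4 and grows tenfold, and the stepsize is "optimized" by trying a fixed set of multiplicative factors and keeping the best one. All names are hypothetical.

```python
import numpy as np

def loglik(beta, x):
    mu, t = beta                                  # t = log(sigma2)
    n = x.size
    return (-0.5 * n * np.log(2 * np.pi) - 0.5 * n * t
            - 0.5 * np.exp(-t) * np.sum((x - mu) ** 2))

def gradient(beta, x):
    mu, t = beta
    r, w = x - mu, np.exp(-t)
    return np.array([w * r.sum(), -0.5 * x.size + 0.5 * w * np.sum(r ** 2)])

def neg_hessian(beta, x):
    mu, t = beta
    r, w = x - mu, np.exp(-t)
    off = w * r.sum()
    return np.array([[x.size * w, off], [off, 0.5 * w * np.sum(r ** 2)]])

def is_pos_def(a):
    try:
        np.linalg.cholesky(a)
        return True
    except np.linalg.LinAlgError:
        return False

def newton_step(beta, x, optimize_step=True):
    g, h = gradient(beta, x), neg_hessian(beta, x)
    lam = 0.0
    # Marquardt-style fix (assumed rule): inflate the diagonal until usable
    while not is_pos_def(h + lam * np.eye(2)):
        lam = 1e-4 if lam == 0.0 else 10.0 * lam
    direction = np.linalg.solve(h + lam * np.eye(2), g)
    factor = 1.0
    if optimize_step:
        # Assumed search: try shorter and longer steps, keep the best
        candidates = 2.0 ** np.arange(-10, 4)
        values = [loglik(beta + c * direction, x) for c in candidates]
        factor = candidates[int(np.argmax(values))]
    return beta + factor * direction, factor

def fit(x, start, optimize_step=True, tol=1e-8, max_iter=100):
    beta = np.asarray(start, dtype=float)
    for it in range(1, max_iter + 1):
        new_beta, _ = newton_step(beta, x, optimize_step)
        if np.max(np.abs(new_beta - beta)) < tol:
            return new_beta, it
        beta = new_beta
    return beta, max_iter

rng = np.random.default_rng(3)
x = rng.normal(loc=2.0, scale=1.5, size=100)      # hypothetical data
mu_hat = x.mean()
t_hat = np.log(np.mean((x - mu_hat) ** 2))
# Start far enough from the solution that the negative Hessian is indefinite
beta, iters = fit(x, start=[mu_hat + 3.0, t_hat + 1.0])
```

For this model the negative Hessian stops being positive definite once mu is more than about one sample standard deviation from the sample mean, so a distant starting point exercises the diagonal inflation.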

You will likely see no difference among the other four methods. For this particular problem, they should all jump to the final estimate in one step.

Several statistics are shown for the optimization process. The main difference among the techniques is the calculation of the negative H matrix, so this matrix is included in the output. If the stepsize is being optimized, the multiplicative stepsize factor is also included in the output.
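To make the difference in the matrix concrete for two of the techniques, the sketch below compares the standard textbook choices: Newton-Raphson uses the observed negative Hessian, while Fisher scoring uses its expectation (the expected information). Whether the applet computes them exactly this way is an assumption, and the function names and data are hypothetical.

```python
import numpy as np

def observed_neg_hessian(mu, log_s2, x):
    """Negative second derivatives of the log-likelihood (Newton-Raphson)."""
    r, w = x - mu, np.exp(-log_s2)
    off = w * r.sum()
    return np.array([[x.size * w, off], [off, 0.5 * w * np.sum(r ** 2)]])

def expected_information(mu, log_s2, n):
    """Expectation of the matrix above under the model (Fisher scoring)."""
    return np.array([[n * np.exp(-log_s2), 0.0], [0.0, 0.5 * n]])

rng = np.random.default_rng(2)
x = rng.normal(size=100)                          # hypothetical data
mu_hat = x.mean()
log_s2_hat = np.log(np.mean((x - mu_hat) ** 2))

# At the maximum likelihood estimate the two matrices coincide for this model
at_mle_obs = observed_neg_hessian(mu_hat, log_s2_hat, x)
at_mle_exp = expected_information(mu_hat, log_s2_hat, x.size)

# Away from the estimate they differ, so the two methods take different steps
away_obs = observed_neg_hessian(mu_hat + 1.0, log_s2_hat, x)
away_exp = expected_information(mu_hat + 1.0, log_s2_hat, x.size)
```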

The mu parameter is labeled beta_0 and the log(sigma2) parameter is labeled beta_1 in the output.