October 26, 2009
			Profiling the deviance

In looking at profiles of the deviance with respect to the parameters
of a linear mixed model I think that the techniques is more effective
than MCMC and that it has better theoretical justification than
approximating degrees of freedom for t-tests and t-based intervals on
the fixed effects.

For the fixed-effects parameters the profiling is straightforward
because they are coefficients in the linear predictor.  The
interesting thing about the examples is that the change in the
deviance (the likelihood ratio test statistic) does not follow exactly
a chisquare-1 pattern.  The profile z (signed square root of the
likelihood ratio test statistic) indicates wider intervals than would
be determined from the standard error.  In other words, the
distribution appears to be overdispersed relative to the normal
distribution, in the manner in which a T statistic is overdispersed.
Another characteristics of the profile z for the fixed-effects is that
the slope of the profile z at the parameter estimate is shallower than
would be indicated by the standard error, especially the standard
error calculated from the ML results.  I attribute this phenomenon to
an anti-conservative calculation of the standard error.  It is being
calculated conditional on the estimated value of the theta parameters.
If instead you re-fit the model you end up with intervals on the
fixed-effects parameters that more properly reflects the variability
in all the parameters in the model.

I think this will be very effective.  The current profile method for
class lmerenv profiles the fixed-effects parameters and I have a
function for plotting these profiles.  They can also be used to create
confidence intervals and to create the p-values that everyone finds so
important (although only for hypotheses related to a single
coefficient).

I have been working on profiling the other parameters in the model as
well and this is where things get tricky.  The parameters in the
optimization are the theta parameters, which I don't think are
directly of interest to most users.  I can switch to theta and sigma
at the risk of some redundancy (once you have done the work needed to
calculate the profiled deviance for theta and sigma you could just as
easily profile out sigma).  The (unexported) R function devfun in
lme4 takes a fitted model and returns a function with attributes that
evaluates the profiled deviance for the composite parameter (theta,
log(sigma)).  However, I don't think even that is enough because it
does not give profiles of the variances and covariances of the random
effects directly.  I think I will be better off writing a function
that takes the variances, covariances (or, better, correlations) and
log(sigma) to evaluate the profiled deviance.

The reason that I use log(sigma) but not log of the variances of the
random effects is because sigma cannot be zero except for constructed
data whereas the estimates of the variance of the random effects can
reasonably be zero.

I need to get out to the bus stop soon so this next part might be a
little too terse.  It is straightforward to take the theta values and
the sigma value and produce variances and covariances.  That is done
in VarCorr.  The part that still has me pondering is how to go the
other way reasonably.  That is, how do I take the variances and
covariances of the random effects plus sigma and get the theta
parameters.

Anyway, that is what I am currently contemplating.

October 27, 2009

I did some more work on profiling the variance-covariance parameters
in a linear mixed-effects model and now have a version I can check
in.  The current code is still pretty rough but it should be enough to
start.  There is a lot more that can be done more artistically about
choosing step sizes and adapting the step sizes according to the
current results but this should be enough to get going.

I have been thinking about the kind of structure that should be
returned by the profile method.  For profile.nls the structure is a
list with elements corresponding to the parameters that were
profiled.  Each element is a data frame containing a vector called tau
and a matrix of parameter values.  I'm going to stay with that for the
time being although I will change the name from tau to zeta (it is the
nonlinear analogue of a standard normal, not a t statistic).
Eventually this object should be reconciled with the profile objects
in the stats4 package.  It makes sense for these to be an S4 class.
