## Methods to Assess an Exercise Intervention Trial based on Three-Level Functional Data

### Ordered Data Analysis, Models and Health Research Methods Conference

#### Research Assistant Professor

Joint work with Haocheng Li et al.

## What is longitudinal functional/multi-level functional data ?

The basic observational unit is a function.

## Mercer theorem and K-L expansion

Let $Y(t)\in L^{2}[0,1],$ with $\mu(t)=E\left\{ Y(t)\right\}$ and $K(s,t)=\text{cov}\left\{ Y(s),Y(t)\right\} .$
Spectral decomposition of $K(s,t)$ (Mercer's Theorem): $$K(s,t)=\sum_{l=1}^{\infty}\lambda_{l}f_{l}(s)f_{l}(t)$$ where $\lambda_{1}\geq\lambda_{2}\geq\cdots\geq0$ are eigenvalues and $f_{l}$'s are the corresponding orthonormal eigenfunctions.
Karhunen-Loève (K-L) expansion: $$Y(t)=\mu(t)+\sum_{l=1}^{\infty}f_{l}(t)\alpha_{l}$$ where $\alpha_{l}=\int_{0}^{1}\left\{ Y(t)-\mu(t)\right\} f_{l}(t)dt$ are $\overset{\text{unc}}{\sim}\left(0,\lambda_{l}\right)$ called PC scores or loadings. $\left\{ \lambda_{l},\, f_{l}\right\} _{l}$ are as defined in Mercer's Thm.

# Data and Model Setup

## Exercise Intervention Trial Data

• Data are from (Kozey-Keadle et al., 2013).
• Consist of estimates of relative energy expenditure (metabolic units, or METs) on $n=63$ individuals every 5 minutes, 5 days a week for 5 separate weeks (study weeks 0, 3, 6, 9, and 12).
• The treatment assignment was done after the baseline week and consisted of assignment to either a control arm or an exercise intervention where subjects completed a standardized aerobic exercise program.
• Interested in the patterns of physical activity for individuals and how these patterns vary across the treatment groups.
• Methodology for the analysis of three-level functional data is very limited and non-existent for the analysis of physical activity data.

## Model Setup

Let $Y_{ijk}(t)$ be the response at time $t$ for subject $i=1,...,n$, in week $j$ $\left(j=1,...,J\right)$ on day $k$ $\left(k=1,...,K\right)$ \begin{eqnarray*} {Y_{ijk}(t)} & = & \overset{\text{fixed-effect curves}}{\overbrace{\mu_{\cdot\cdot}(t)+\mu_{j\cdot}(t)+\mu_{\cdot k}(t)+\mu_{jk}(t)}}\\ & & +\underset{\text{random-effect curves}}{\underbrace{\boldsymbol{\xi}_{i}(t)+\boldsymbol{\eta}_{ij}(t)+\boldsymbol{\zeta}_{ik}(t)+\boldsymbol{\gamma}_{ijk}(t)}}+\epsilon_{ijk}(t) \end{eqnarray*} Where

• $\mu_{\cdot\cdot}(t)$ are the population mean curves, $\mu_{j\cdot}(t)$, $\mu_{\cdot k}(t)$ and $\mu_{jk}(t)$ are week-specific, day-specific and week$\times$day interaction mean curves.

\begin{eqnarray*} {Y_{ijk}(t)} & = & \overset{\text{fixed-effect curves}}{\overbrace{\mu_{\cdot\cdot}(t)+\mu_{j\cdot}(t)+\mu_{\cdot k}(t)+\mu_{jk}(t)}}\\ & & +\underset{\text{random-effect curves}}{\underbrace{\boldsymbol{\xi}_{i}(t)+\boldsymbol{\eta}_{ij}(t)+\boldsymbol{\zeta}_{ik}(t)+\boldsymbol{\gamma}_{ijk}(t)}}+\epsilon_{ijk}(t) \end{eqnarray*}

• $\boldsymbol{\xi}_{i}(t),\,\boldsymbol{\eta}_{ij}(t),\,\boldsymbol{\zeta}_{ik}(t),\,\text{ and }\boldsymbol{\gamma}_{ijk}(t)$ are mutually independent.
• They represent the subject-specific , week-within-subject, day-within-subject and week $\times$ day interaction-within-subject random effects curves, respectively.
• $\epsilon_{ijk}(t)$ $\overset{\text{unc}}{\sim}WN\left(0,\,\sigma^{2}\right)$
• For identifiability, we constrain $\mu_{1\cdot}(t)=\mu_{\cdot1}(t)=\mu_{1k}(t)=\mu_{j1}(t)=0$
• for all $k,j$.

## Covariance structures

Same subject, week, day \begin{eqnarray*} \text{cov}\left\{ Y_{ijk}(t),Y_{ijk}(s)\right\} & = & \text{cov}\left\{ \boldsymbol{\xi}_{i}(t),\boldsymbol{\xi}_{i}(s)\right\} +\text{cov}\left\{ \boldsymbol{\eta}_{ij}(t),\boldsymbol{\eta}_{ij}(s)\right\} \\ & & +\text{cov}\left\{ \boldsymbol{\zeta}_{ik}(t),\boldsymbol{\zeta}_{ik}(s)\right\} +\text{cov}\left\{ \boldsymbol{\gamma}_{ijk}(t),\boldsymbol{\gamma}_{ijk}(s)\right\} \end{eqnarray*} Same subject, week, different day $$k\ne k^{\prime},\,\text{cov}\left\{ Y_{ijk}(t),Y_{ijk^{\prime}}(s)\right\} =\text{cov}\left\{ \boldsymbol{\xi}_{i}(t),\boldsymbol{\xi}_{i}(s)\right\} +\text{cov}\left\{ \boldsymbol{\eta}_{ij}(t),\boldsymbol{\eta}_{ij}(s)\right\}$$ Same subject, day, different week $$j\ne j^{\prime},\,\text{cov}\left\{ Y_{ijk}(t),Y_{ij^{\prime}k}(s)\right\} =\text{cov}\left\{ \boldsymbol{\xi}_{i}(t),\boldsymbol{\xi}_{i}(s)\right\} +\text{cov}\left\{ \boldsymbol{\zeta}_{ik}(t),\boldsymbol{\zeta}_{ik}(s)\right\}$$ And, same subject only $$j\ne j^{\prime},\, k\ne k^{\prime},\,\text{cov}\left\{ Y_{ijk}(t),Y_{ij^{\prime}k^{\prime}}(s)\right\} =\text{cov}\left\{ \boldsymbol{\xi}_{i}(t),\boldsymbol{\xi}_{i}(s)\right\}$$

## The Karhunen-Loève (K-L) expansion

Applying (K-L) expansion on random effect curves, \begin{eqnarray*} \xi_{i}(t) & = & \sum_{l=1}^{P_{\xi}}f_{\xi,l}(t)\alpha_{\xi,i,l}=\boldsymbol{f}_{\xi}^{T}\boldsymbol{\alpha}_{\xi,i}\text{ ; }\eta_{ij}(t)=\sum_{l=1}^{P_{\eta}}f_{\eta,l}(t)\alpha_{\eta,ij,l}=\boldsymbol{f}_{\eta}^{T}\boldsymbol{\alpha}_{\eta,ij}\\ \zeta_{ik}(t) & = & \sum_{l=1}^{P_{\zeta}}f_{\zeta,l}(t)\alpha_{\zeta,ik,l}=\boldsymbol{f}_{\zeta}^{T}\boldsymbol{\alpha}_{\zeta,ik}\text{ ; }\gamma_{ijk}(t)=\sum_{l=1}^{P_{\gamma}}f_{\gamma,l}(t)\alpha_{\gamma,ijk,l}=\boldsymbol{f}_{\gamma}^{T}\boldsymbol{\alpha}_{\gamma,ijk} \end{eqnarray*} where $\int f_{\dagger,l}(t)f_{\dagger,l^{\prime}}(t)dt=\delta_{ll^{\prime}}\text{ where }\dagger\text{ is one of }\xi,\eta,\zeta,\gamma$

## Modeling with B-splines

Let $b(t)=\left\{ b_{1}(t),...,b_{q}(t)\right\} ^{T}$ be the $q\times1$ vector of B-splines basis fcns. evaluated at $t$.
We model the fixed effects as $\mu_{\cdot\cdot}(t)=b^{T}(t)\beta_{\cdot\cdot};\,\,\mu_{j\cdot}(t)=b^{T}(t)\beta_{j\cdot};\ \mu_{\cdot k}(t)=b^{T}(t)\beta_{\cdot k};\ \mu_{jk}(t)=b^{T}(t)\beta_{jk}$
and the FPC eigenfunctions as $f_{\xi,l}(t)=b^{T}(t)\mathbf{g}_{\xi,l};\,\, f_{\eta,l}(t)=b^{T}(t)\mathbf{g}_{\eta,l};\,\, f_{\zeta,l}(t)=b^{T}(t)\mathbf{g}_{\zeta,l};\,\, f_{\gamma,l}(t)=b^{T}(t)\mathbf{g}_{\gamma,l}$
To maintain the orthogonality restrictions on the eigenfunctions, $b(t),\,\mathbf{g}_{\xi,l},\,\mathbf{g}_{\eta,l},\,\mathbf{g}_{\zeta,l}$ and $\mathbf{g}_{\gamma,l}$ are restricted to be orthogonal.

For identifiability, we constrain $\beta_{1\cdot}(t)=\beta_{\cdot1}(t)=\beta_{1k}(t)=\beta_{j1}(t)=0$ for all $k,j$.

## Summary 1

We started with \begin{eqnarray*} Y_{ijk}(t) & = & \mu_{\cdot\cdot}(t)+\mu_{j\cdot}(t)+\mu_{\cdot k}(t)+\mu_{jk}(t)\\ & & +\boldsymbol{\xi}_{i}(t)+\boldsymbol{\eta}_{ij}(t)+\boldsymbol{\zeta}_{ik}(t)+\boldsymbol{\gamma}_{ijk}(t)+\epsilon_{ijk}(t) \end{eqnarray*}

• Apply K-L expansion on $\boldsymbol{\xi}_{i}(t),\,\boldsymbol{\eta}_{ij}(t),\,\boldsymbol{\zeta}_{ik}(t),\,\boldsymbol{\gamma}_{ijk}(t)$
• Represent the fixed effects $\mu_{\cdot\cdot}(t),\,\mu_{j\cdot}(t),\,\mu_{\cdot k}(t),\,\mu_{jk}(t)$ and the eigenfunctions ($f_{\dagger,l}(t)=b^{T}(t)\mathbf{g}_{\dagger,l}$) using B-splines.

The resulting model is \begin{eqnarray*} Y_{ijk}(t) & = & b^{T}(t)\beta_{\cdot\cdot}+b^{T}(t)\beta_{j\cdot}+b^{T}(t)\beta_{\cdot k}+b^{T}(t)\beta_{jk}\\ & + & b^{T}(t)\boldsymbol{G}_{\xi}\boldsymbol{\alpha}_{\xi,i}+b^{T}(t)\boldsymbol{G}_{\eta}\boldsymbol{\alpha}_{\eta,ij}+b^{T}(t)\boldsymbol{G}_{\zeta}\boldsymbol{\alpha}_{\zeta,ik}+b^{T}(t)\boldsymbol{G}_{\gamma}\boldsymbol{\alpha}_{\gamma,ijk}\\ & + & \epsilon_{ijk}(t) \end{eqnarray*} where $\underset{q\times P_{\dagger}}{\boldsymbol{G}_{\dagger}}=\left(\mathbf{g}_{\dagger,1},...,\mathbf{g}_{\dagger,P_{\dagger}}\right)$ and $\dagger$ is one of $\xi,\eta,\zeta,\gamma$.

## Link to the Linear Mixed Model (LMM)

• Let $\Delta_{\xi}=\text{cov}\left(\boldsymbol{\alpha}_{\xi,i}\right)$, $\Delta_{\eta}=\text{cov}\left(\boldsymbol{\alpha}_{\eta,ij}\right)$, $\Delta_{\zeta}=\text{cov}\left(\boldsymbol{\alpha}_{\zeta,ik}\right)$, and $\Delta_{\gamma}=\text{cov}\left(\boldsymbol{\alpha}_{\gamma,ijk}\right)$. $\Delta_{\dagger}$ is a $P_{\dagger}\times P_{\dagger}$ ($\dagger$ is one of $\xi,\eta,\zeta,\gamma$) diagonal matrix with decreasing positive diagonal elements.
• Define new $q-$dim r.v. $\underset{q\times1}{\boldsymbol{u}_{\xi,i}}=\underset{q\times P_{\xi}}{\boldsymbol{G}_{\xi}}\underset{P_{\xi}\times1}{\boldsymbol{\alpha}_{\xi,i}};\,\,\boldsymbol{u}_{\eta,ij}=\boldsymbol{G}_{\eta}\boldsymbol{\alpha}_{\eta,ij};\,\,\boldsymbol{u}_{\zeta,ik}=\boldsymbol{G}_{\zeta}\boldsymbol{\alpha}_{\zeta,ik};\,\,\boldsymbol{u}_{\gamma,ijk}=\boldsymbol{G}_{\gamma}\boldsymbol{\alpha}_{\gamma,ijk}$
• The resulting model is \begin{eqnarray} Y_{ijk}(t) & = & b^{T}(t)\beta+b^{T}(t)\beta_{j\cdot}+b^{T}(t)\beta_{\cdot k}+b^{T}(t)\beta_{jk} \qquad (1)\\ & + & b^{T}(t)\boldsymbol{u}_{\xi,i}+b^{T}(t)\boldsymbol{u}_{\eta,ij}+b^{T}(t)\boldsymbol{u}_{\zeta,ik}+b^{T}(t)\boldsymbol{u}_{\gamma,ijk}+\epsilon_{ijk}(t)\nonumber \end{eqnarray} with
$\text{cov}\left(\boldsymbol{u}_{\xi,i}\right)=\psi_{\xi}=\boldsymbol{G}_{\xi}\Delta_{\xi}\boldsymbol{G}_{\xi}^{T}$; $\text{cov}\left(\boldsymbol{u}_{\eta,ij}\right)=\psi_{\eta}=\boldsymbol{G}_{\eta}\Delta_{\eta}\boldsymbol{G}_{\eta}^{T}$; $\text{cov}\left(\boldsymbol{u}_{\zeta,ik}\right)=\psi_{\zeta}=\boldsymbol{G}_{\zeta}\Delta_{\zeta}\boldsymbol{G}_{\zeta}^{T};$ and $\text{cov}\left(\boldsymbol{u}_{\gamma,ijk}\right)=\psi_{\gamma}=\boldsymbol{G}_{\gamma}\Delta_{\gamma}\boldsymbol{G}_{\gamma}^{T}$
• If $P_{\xi}=P_{\zeta}=P_{\eta}=P_{\gamma}=q$, then $\psi_{\xi},\,\psi_{\zeta},\,\psi_{\eta}$ and $\psi_{\gamma}$ would be full rank and model (1) is equivalent to a 3-level LMM with unstructured random effect covariance matrices.

## Estimation Procedure

\begin{eqnarray*} Y_{ijk}(t) & = & b^{T}(t)\beta_{\cdot\cdot}+b^{T}(t)\beta_{j\cdot}+b^{T}(t)\beta_{\cdot k}+b^{T}(t)\beta_{jk}\\ & + & b^{T}(t)\boldsymbol{u}_{\xi,i}+b^{T}(t)\boldsymbol{u}_{\eta,ij}+b^{T}(t)\boldsymbol{u}_{\zeta,ik}+b^{T}(t)\boldsymbol{u}_{\gamma,ijk}+\epsilon_{ijk}(t) \end{eqnarray*} Let $\boldsymbol{\beta}$ be a $JKq-$ vector containing all the $\beta_{\ddagger}$ and $\boldsymbol{U}_{i}$ be a $\left\{ (J+1)(K+1)q\right\}-$ r.v. containing all the $\boldsymbol{u}_{\dagger}$. Then, model (1) can be expressed as $\underset{N_{i}\times1}{Y_{i}}=B_{i}^{\mu}\boldsymbol{\beta}+B_{i}^{U}\boldsymbol{U}_{i}+\epsilon_{i}$ with $$\Psi\overset{\text{def}}{=}\text{cov}\left(\boldsymbol{U}_{i}\right)=\text{diag}\left(\psi_{\xi},I_{J}\otimes\psi_{\eta},I_{K}\otimes\psi_{\zeta},I_{JK}\otimes\psi_{\gamma}\right)\qquad (2)$$ $$V_{i}\overset{\text{def}}{=}\text{cov}\left(Y_{i}\right)/\sigma^{2}=I_{N_{i}}+B_{i}^{U}\Psi\left(B_{i}^{U}\right)^{T}/\sigma^{2}\phantom{I_{JK}\otimes\psi_{\gamma}12}\qquad(3)$$ $$S_{i}\overset{\text{def}}{=}\text{cov}\left(\boldsymbol{U}_{i}|Y_{i}\right)/\sigma^{2}=\Psi/\sigma^{2}-\Psi\left(B_{i}^{U}\right)^{T}V_{i}^{-1}B_{i}^{U}\Psi/\sigma^{4}\qquad(4)$$ $\left(\boldsymbol{\beta}_{\text{curr}},\,\sigma_{\text{curr}}^{2},\,\Psi_{\text{curr}}^{\text{RR}}\right)\overset{(3),(4)}{\longrightarrow}\left(S_{i,\text{curr}},\, V_{i,\text{curr}}\right)\overset{ECME}{\longrightarrow}\left(\boldsymbol{\beta}_{\text{new}},\,\sigma_{\text{new}}^{2},\,\Psi_{\text{new}}^{\text{full}}\right)\overset{\text{RR model}}{\longrightarrow}\Psi_{\text{new}}^{\text{RR}}$ The ECME step and the Reduced-Rank (RR) model will be described in the next slide.

## ECME (Schafer, 1998) and the Reduced-Rank (RR) Model

\begin{eqnarray*} \text{ECME:}\begin{cases} \beta_{\text{new}} & =\left(\sum_{i=1}^{n}\left(B_{i}^{\mu}\right)^{T}V_{i,\text{curr}}^{-1}B_{i}^{\mu}\right)^{-1}\left(\sum_{i=1}^{n}\left(B_{i}^{\mu}\right)^{T}V_{i,\text{curr}}^{-1}Y_{i}\right)\\ \sigma_{\text{new}}^{2} & =N^{-1}\sum_{i=1}^{n}\left(Y_{i}-B_{i}^{\mu}\boldsymbol{\beta}_{\text{new}}\right)^{T}V_{i,\text{curr}}^{-1}\left(Y_{i}-B_{i}^{\mu}\boldsymbol{\beta}_{\text{new}}\right)\\ \Psi_{\text{new}}^{\text{full}} & =n^{-1}\sum_{i=1}^{n}\left(\hat{\boldsymbol{U}}_{i}\hat{\boldsymbol{U}}_{i}^{T}+\sigma_{\text{new}}^{2}S_{i,\text{curr}}\right)\text{ and }\hat{\boldsymbol{U}}_{i}=S_{i,\text{curr}}\left(B_{i}^{U}\right)^{T}\left(Y_{i}-B_{i}^{\mu}\boldsymbol{\beta}_{\text{new}}\right) \end{cases} \end{eqnarray*}
How to obtain $\Psi_{\text{curr}}^{\text{RR}}$ ? (Recall: $\dagger$ is one of $\xi$, $\eta$, $\zeta$ or $\gamma$)

• Extract all $\psi_{\dagger,\text{curr}}$ from the diagonal of $\Psi_{\text{new}}^{\text{full}}$ and perform eigenvalue decomp $\underset{q\times q}{\psi_{\dagger,\text{curr}}}=\tilde{\boldsymbol{G}}_{\dagger,\text{new}}\tilde{\Delta}_{\dagger,\text{new}}\tilde{\boldsymbol{G}}_{\dagger,\text{new}}^{T}$
• Select the number of principal components $(P_{\xi},\, P_{\eta},\, P_{\zeta},\, P_{\gamma})$ to keep, by requiring the ratio of variance explained to be $\geq P=0.85$.

• Retain $1^{\text{st}}$ $P_{\dagger}$ eigenvectors and eigenvalues from $\tilde{G}_{\dagger,\text{new}}$ and $\tilde{\Delta}_{\dagger,\text{new}}$ to form $\underset{q\times P_{\dagger}}{\boldsymbol{G}_{\dagger,\text{new}}}$ and $\underset{P_{\dagger}\times P_{\dagger}}{\Delta_{\dagger,\text{new}}}$.

• Obtain the updated RR matrices $\psi_{\dagger,\text{new}}^{\text{RR}}=\boldsymbol{G}_{\dagger,\text{new}}\Delta_{\dagger,\text{new}}\boldsymbol{G}_{\dagger,\text{new}}^{T}$ and combine these to get $\Psi_{\text{new}}^{\text{RR}}$ according to (2).

## Simulation: Model Fit

A measurement at time $t$ on day $k$ in week $j$ for subject $i$ is generated according to \begin{eqnarray*} Y_{ijk}(t) & = & \mu_{\cdot\cdot}(t)+\mu_{j\cdot}(t)+\mu_{\cdot k}(t)+\sum_{l=1}^{2}f_{\xi,l}(t)\alpha_{\xi,i,l}+\sum_{l=1}^{2}f_{\eta,l}(t)\alpha_{\eta,ij,l}\\ & & +\sum_{l=1}^{2}f_{\zeta,l}(t)\alpha_{\zeta,ik,l}+\sum_{l=1}^{2}f_{\gamma,l}(t)\alpha_{\gamma,ijk,l}+\epsilon_{ijk}(t) \end{eqnarray*}

• $n=60$ subjects, $J=5$ weeks and $K=5$ days and each day has $36$
• measurement times.
• B-spline basis functions with 10 equispaced knots to fit the data.
The average estimates and mean squared errors (MSE) of the parameters in the joint model. The number marked with an asterisk is the actual number multiplied by 1000.
Parameter $\sigma^2$ $\Delta_{\xi,1}$ $\Delta_{\xi,2}$ $\Delta_{\eta,1}$ $\Delta_{\eta,2}$ $\Delta_{\zeta,1}$ $\Delta_{\zeta,2}$ $\Delta_{\gamma,1}$ $\Delta_{\gamma,2}$
True 1.00 8.00 4.00 6.00 3.00 4.00 2.00 2.00 1.00
Mean 1.00 8.33 3.86 6.00 2.86 3.89 2.00 1.95 0.93
MSE $0.08^*$ 2.11 0.55 0.26 0.11 0.18 0.04 0.03 0.02

## Wald Test Performance

Performance of the Wald statistics to test $H_{0}:\mu_{jk}(t)=0$ for all $j,k$.

• Complete and incomplete data scenarios are considered.
• $500$ replicates for $n=60$ and $200$ replicates for $n=250$.
• The probability that a days record is observed is $50\%$.

The average estimates and mean squared errors (MSE) of the parameters in the joint model. The number marked with an asterisk is the actual number multiplied by 1000.
Parameter $\sigma^2$ $\Delta_{\xi,1}$ $\Delta_{\xi,2}$ $\Delta_{\eta,1}$ $\Delta_{\eta,2}$ $\Delta_{\zeta,1}$ $\Delta_{\zeta,2}$ $\Delta_{\gamma,1}$ $\Delta_{\gamma,2}$
True 1.00 8.00 4.00 6.00 3.00 4.00 2.00 2.00 1.00
Mean 1.00 8.33 3.86 6.00 2.86 3.89 2.00 1.95 0.93
MSE $0.08^*$ 2.11 0.55 0.26 0.11 0.18 0.04 0.03 0.02

## Data Description

• The activPALTM is an electronic device used to estimate energy expenditure, which is expressed in units of metabolic equivalents (METs).

• A persons MET value for an activity is defined as the ratio of her energy expenditure during the activity to her resting energy expenditure.

• A value of METs $\geq3$ defines moderate to vigorous physical activity (MVPA).

• In the study, $63$ individuals wore the activPALTM for five weeks (denoted weeks $0$, $3$, $6$, $9$, $12$), five days in a week (Monday to Friday) and measurements were recorded every $5$ minutes during each day.

• We limited the data for each day to one hour before and two hours after in the days first instance of MVPA, if it occurs.

## Summary

• Proposed a general modeling and estimation strategy in the presence of three-level correlated functional data.
• Our algorithm can handle incomplete data and employs a data-based method to select the number of principal components for the random curves
• Simulation results for the model fitting strategy are encouraging and indicate little bias.
• We also discussed the performance of the Wald test to assess the evidence that linear combinations of mean structure parameters are zero
• We applied our model to analyze data from a physical activity intervention trial. The detailed functional data analysis revealed how the patterns of activity differed in terms of timing and intensity.