Statistical Analysis Plan

CO-LUMINATE study SAP integrating depression harmonization, social determinant trajectories, and behavioral mediators workstreams.

Overview

This document is the integrated CO-LUMINATE Statistical Analysis Plan (SAP). It is organized by analytical objective and structured for staged publication. The current release includes:

Section A: Objective 2 (Depression Harmonization)
Section B: Objective 3 (Social Determinant Trajectories)
Section C: Objective 4 (Behavioral Mediators)

Section D (Objective 5: Causal Mediation) remains an intentional placeholder pending the finalization of prespecified mediation model families and implementation diagnostics.

Section A: Objective 2 - Depression Harmonization SAP

A1 Background and Rationale

This SAP component defines the psychometric protocol for harmonizing depression measurement between the Shona Symptom Questionnaire (SSQ) and the Patient Health Questionnaire (PHQ-9). The methodological challenge is multi-instrument comparability in a longitudinal setting where cohorts differ in instrument availability and local validity context.

The PHQ-9 is treated as the external benchmark for probable depression (Kroenke, Spitzer, and Williams 2001), while the SSQ captures culturally grounded distress phenotypes relevant to Southern African settings (Patel et al. 1997; Haney et al. 2014; Chibanda et al. 2019). Harmonization therefore requires defensible latent-construct alignment, subgroup stability checks, and threshold calibration rather than simple raw-score translation (Putnick and Bornstein 2016; Woods 2009).

A3 Objectives and Estimands

Let \(Y_i^{PHQ} = \mathbb{1}\{PHQ9_i \ge 10\}\) denote PHQ-referenced probable depression for participant \(i\).

The primary estimand is the harmonized binary endpoint:

\[ Y_i^{H} = \mathbb{1}\{\hat{\theta}_i \ge \tau^\ast\}. \]

Parameter definitions (Eq. A3.1):

\(Y_i^{H}\): harmonized depression endpoint for participant \(i\).
\(\mathbb{1}\{\cdot\}\): indicator function (equals 1 when the condition is true, 0 otherwise).
\(\hat{\theta}_i\): calibrated latent depression score for participant \(i\), estimated from joint PHQ-SSQ modeling.
\(\tau^\ast\): pre-specified latent-score threshold linked to the PHQ-9 reference scale.

For the PHQ-referenced comparator used above, \(PHQ9_i\) denotes the total PHQ-9 score for participant \(i\), with the threshold of 10 defining probable depression on the benchmark instrument.

Secondary estimands include calibration transportability and prevalence alignment in SSQ-only cohorts:

\[ \Delta_p = \Pr(Y_i^{H}=1) - \Pr(Y_i^{PHQ}=1). \]

Parameter definitions (Eq. A3.2):

\(\Delta_p\): prevalence drift between the harmonized and PHQ-referenced endpoints, evaluated under each candidate thresholding strategy.
\(\Pr(Y_i^{H}=1)\): marginal prevalence under the harmonized endpoint.
\(\Pr(Y_i^{PHQ}=1)\): marginal prevalence under the PHQ-referenced endpoint.
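
As a minimal sketch of Eqs. A3.1 and A3.2, the fragment below assigns both binary endpoints and computes the prevalence drift in a sample where both scores are available. The function names, the toy scores, and the threshold value 0.5 are illustrative assumptions and are not taken from the study data or repository.

```python
from typing import List, Sequence


def phq_endpoint(phq9_total: Sequence[int], cutoff: int = 10) -> List[int]:
    """PHQ-referenced probable depression: 1 when the PHQ-9 total is >= cutoff."""
    return [1 if score >= cutoff else 0 for score in phq9_total]


def harmonized_endpoint(theta_hat: Sequence[float], tau_star: float) -> List[int]:
    """Harmonized endpoint (Eq. A3.1): 1 when the latent score is >= tau_star."""
    return [1 if theta >= tau_star else 0 for theta in theta_hat]


def prevalence_drift(y_h: Sequence[int], y_phq: Sequence[int]) -> float:
    """Prevalence drift (Eq. A3.2): harmonized minus PHQ-referenced prevalence."""
    return sum(y_h) / len(y_h) - sum(y_phq) / len(y_phq)


# Toy values for six participants (hypothetical, for illustration only).
phq9 = [3, 12, 10, 7, 15, 0]
theta = [-0.8, 0.9, 0.4, 0.6, 1.7, -1.5]

y_phq = phq_endpoint(phq9)
y_h = harmonized_endpoint(theta, tau_star=0.5)  # 0.5 is an assumed threshold
delta_p = prevalence_drift(y_h, y_phq)
```

In this toy sample the two endpoints disagree for two participants yet yield the same prevalence, which illustrates why \(\Delta_p\) is reported alongside, not instead of, the agreement metrics in A7.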

A4 Study Design and Analytic Samples

The development sample comprises participants aged 16-24 years with concurrent SSQ and PHQ observations. The transport sample consists of SSQ-only records in linked cohorts.

Eligibility is staged by model:

factor-analytic stages require complete item vectors for included indicators;
IRT calibration requires complete item responses under the final item set;
endpoint assignment is restricted to records with valid latent score estimates.

A5 Structural Modeling Strategy

For ordinal symptom items, exploratory and confirmatory analyses are estimated using polychoric correlation structures and robust estimators for ordered-categorical data (Holgado-Tello et al. 2010; Wu and Estabrook 2016).

Exploratory factor analysis (EFA) is run in Monte Carlo split samples to stabilize dimensionality decisions. Confirmatory factor analysis (CFA) is then evaluated on hold-out subsets with fit diagnostics interpreted jointly (CFI, RMSEA, SRMR) rather than by single-cutoff rules (Hu and Bentler 1999; MacCallum et al. 2002).
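
The Monte Carlo split-sample idea can be sketched as repeated random partitions of participant identifiers into an EFA subset and a CFA hold-out subset. The split fraction, the number of replicates, and the seed below are assumptions made for illustration; this SAP does not fix them, and the sketch is not the repository's implementation.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def monte_carlo_splits(
    ids: Sequence[int],
    n_reps: int,
    efa_fraction: float = 0.5,  # assumed split fraction
    seed: int = 2024,           # assumed seed, for reproducibility only
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (efa_ids, cfa_ids) pairs from independent random partitions."""
    rng = random.Random(seed)
    pool = list(ids)
    n_efa = int(round(efa_fraction * len(pool)))
    for _ in range(n_reps):
        shuffled = pool[:]
        rng.shuffle(shuffled)
        # First part feeds the exploratory stage, the rest is held out for CFA.
        yield shuffled[:n_efa], shuffled[n_efa:]


splits = list(monte_carlo_splits(range(10), n_reps=3))
```

Each replicate keeps the two subsets disjoint, so CFA fit is always evaluated on records that did not inform the exploratory dimensionality decision in that replicate.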

Let \(\mathbf{x}_i\) denote the ordinal item vector and \(\theta_i\) the latent depression factor. The CFA measurement model is:

\[ \mathbf{x}_i \sim f(\theta_i; \boldsymbol{\lambda}, \boldsymbol{\tau}). \]

Parameter definitions (Eq. A5.1):

\(\mathbf{x}_i\): vector of observed ordinal symptom responses for participant \(i\).
\(\theta_i\): latent depression factor score.
\(f(\cdot)\): ordinal measurement response function implied by the CFA specification.
\(\boldsymbol{\lambda}\): factor-loading parameters.
\(\boldsymbol{\tau}\): item-threshold parameters.

Measurement invariance and differential item functioning are evaluated across sex and age strata using parameter-constraint comparisons and practical fit-change diagnostics (Putnick and Bornstein 2016; Vandenberg and Lance 2000; Woods 2009).

A6 IRT Harmonization and Threshold Derivation

Harmonization uses graded response models (GRM) implemented in mirt (Samejima 1969; Chalmers 2012). For item \(j\) and category threshold \(k\):

\[ \Pr(X_{ij} \ge k \mid \theta_i) = \operatorname{logit}^{-1}\left[a_j(\theta_i - b_{jk})\right]. \]

Parameter definitions (Eq. A6.1):

\(\Pr(X_{ij} \ge k \mid \theta_i)\): cumulative probability of endorsing category \(k\) or higher for item \(j\).
\(X_{ij}\): ordinal response of participant \(i\) on item \(j\).
\(\theta_i\): latent depression severity for participant \(i\).
\(a_j\): GRM discrimination parameter for item \(j\).
\(b_{jk}\): GRM threshold/location parameter for item \(j\) at threshold \(k\).
\(\operatorname{logit}^{-1}(\cdot)\): inverse-logit link mapping linear predictors to probabilities.
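
To make Eq. A6.1 concrete, the following sketch evaluates the cumulative probabilities and derives category probabilities as differences of adjacent cumulative terms. It assumes categories scored 0 to \(m\) with increasing thresholds; the parameter values are hypothetical, and the code is independent of the mirt implementation used in the study.

```python
import math
from typing import List, Sequence


def inv_logit(x: float) -> float:
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def grm_cumulative(theta: float, a: float, b: Sequence[float]) -> List[float]:
    """P(X >= k | theta) for k = 1..len(b), following Eq. A6.1."""
    return [inv_logit(a * (theta - b_k)) for b_k in b]


def grm_category_probs(theta: float, a: float, b: Sequence[float]) -> List[float]:
    """P(X = k | theta) for k = 0..len(b), from adjacent cumulative probabilities."""
    cum = [1.0] + grm_cumulative(theta, a, b) + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(b) + 1)]


# Hypothetical four-category item (scored 0-3) evaluated at theta = 0.
probs = grm_category_probs(theta=0.0, a=1.5, b=[-1.0, 0.0, 1.0])
```

With thresholds symmetric around zero, the category probabilities at \(\theta = 0\) are symmetric as well, which is a quick sanity check on any implementation.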

Joint PHQ-SSQ calibration maps both instruments to a common latent continuum. The anchored rule selects \(\tau^\ast\) by aligning the PHQ-9 test characteristic curve with the clinical PHQ-9 benchmark near 10. ROC/Youden and sensitivity-prioritized alternatives are retained for robustness comparison.

A7 Performance Metrics and Uncertainty

Endpoint performance is summarized by:

sensitivity: \(\Pr(Y^H=1 \mid Y^{PHQ}=1)\)
specificity: \(\Pr(Y^H=0 \mid Y^{PHQ}=0)\)
PPV, NPV, accuracy, AUC, and Cohen’s \(\kappa\)
prevalence drift \(\Delta_p\)

Metric definitions: sensitivity is the probability that the harmonized endpoint is positive among PHQ-positive participants; specificity is the probability that the harmonized endpoint is negative among PHQ-negative participants; PPV and NPV denote positive and negative predictive value, respectively; AUC is the area under the receiver-operating-characteristic curve; and Cohen’s \(\kappa\) summarizes agreement beyond chance.
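
The threshold-dependent metrics above all follow from the 2x2 cross-classification of \(Y^H\) against \(Y^{PHQ}\). A minimal sketch, with hypothetical function names and toy data, is given below; AUC is omitted because it requires the continuous latent score rather than the dichotomized endpoint.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_h: Sequence[int], y_phq: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) with the PHQ-referenced endpoint as reference."""
    tp = fp = fn = tn = 0
    for h, p in zip(y_h, y_phq):
        if h == 1 and p == 1:
            tp += 1
        elif h == 1 and p == 0:
            fp += 1
        elif h == 0 and p == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def _ratio(num: float, den: float) -> float:
    """Division that returns NaN when the denominator is zero."""
    return num / den if den else float("nan")


def endpoint_metrics(y_h: Sequence[int], y_phq: Sequence[int]) -> Dict[str, float]:
    tp, fp, fn, tn = confusion_counts(y_h, y_phq)
    n = tp + fp + fn + tn
    sens, spec = _ratio(tp, tp + fn), _ratio(tn, tn + fp)
    accuracy = _ratio(tp + tn, n)
    # Chance agreement from the marginal totals, as used by Cohen's kappa.
    p_e = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": accuracy,
        "kappa": _ratio(accuracy - p_e, 1.0 - p_e),
        "youden_j": sens + spec - 1.0,
        "prevalence_drift": _ratio(tp + fp, n) - _ratio(tp + fn, n),
    }


# Toy data for ten participants (hypothetical).
y_phq = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_h = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = endpoint_metrics(y_h, y_phq)
```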

Interval estimation uses resampling-based uncertainty summaries where appropriate (Efron and Tibshirani 1994). Threshold selection is governed by a joint decision criterion balancing discrimination, calibration, and prevalence distortion rather than discrimination alone.
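
A resampling-based interval can be sketched as a participant-level percentile bootstrap, in which each resample keeps a participant's two endpoints paired. The number of replicates, the seed, and the percentile method itself are assumptions for illustration; the SAP does not prescribe a specific interval type.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def prevalence_drift(y_h: Sequence[int], y_phq: Sequence[int]) -> float:
    return sum(y_h) / len(y_h) - sum(y_phq) / len(y_phq)


def percentile_bootstrap_ci(
    y_h: Sequence[int],
    y_phq: Sequence[int],
    statistic: Callable[[Sequence[int], Sequence[int]], float],
    n_boot: int = 2000,   # assumed number of replicates
    alpha: float = 0.05,
    seed: int = 1,        # assumed seed
) -> Tuple[float, float]:
    """Percentile interval from resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(y_h)
    estimates: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        # Resample paired endpoints so within-participant agreement is preserved.
        estimates.append(statistic([y_h[i] for i in idx], [y_phq[i] for i in idx]))
    estimates.sort()
    lower = estimates[math.floor((alpha / 2) * (n_boot - 1))]
    upper = estimates[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


# Toy sample of 200 participants (hypothetical counts).
y_phq = [1] * 60 + [0] * 140
y_h = [1] * 50 + [0] * 10 + [1] * 15 + [0] * 125
ci_low, ci_high = percentile_bootstrap_ci(y_h, y_phq, prevalence_drift)
```

Passing the statistic as a function lets the same routine serve sensitivity, specificity, or \(\kappa\) without changing the resampling logic.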

A8 Planned Outputs and Reporting

Planned outputs are:

Item and structural diagnostics (EFA/CFA, invariance, DIF).
IRT calibration tables and threshold-mapping plots.
Comparative endpoint performance tables across threshold strategies.
Final harmonized endpoint definition for downstream Objectives 3-5.

A9 Symbol-to-Implementation Crosswalk (Depression Harmonization SAP)

To align the Depression Harmonization SAP (Objective 2) notation with implementation evidence in the public repository, Table A9.1 maps each core quantity to scripts under scripts/psychometric/ and to the corresponding generated output artifacts.

SAP quantity	Mathematical definition in this SAP	Implementation locus (OBJ03 Psychometric)	Primary output artifact(s)
\(\mathbf{x}_i\), retained item structure	Observed ordinal symptom response vector used in EFA/CFA stages	EFA stability and retention pipeline in `scripts/psychometric/programs/utils/efa_stability/` with orchestration via `scripts/psychometric/programs/utils/efa_stability/05_run_pipeline_efa_master.R`	Stability/retention figures in SCAR notebook outputs and EFA artifacts
\(\theta_i\), \(\boldsymbol{\lambda}\), \(\boldsymbol{\tau}\)	CFA latent trait and ordered-threshold measurement parameters	CFA + invariance + ROC workflow under `scripts/psychometric/programs/utils/cfa_roc/` (notably `03_mc_loop.R`, `04_summary.R`, `06_master.R`)	`03_IRT_fitted_models_1factor.rds`, `03_IRT_invariance_models_1factor.rds`, fit/invariance PDFs
\(a_j\), \(b_{jk}\)	GRM discrimination and threshold/location parameters	IRT calibration engine in `scripts/psychometric/programs/utils/irt/full_itr.R` and batch execution via `scripts/psychometric/programs/analysis/run_irt_batch.R`	`03_IRT_Harmonization_Results.rds`, batch model objects in `output/batch/`
\(\tau^\ast\) and operating-point scenarios	Applied latent threshold used for harmonized endpoint assignment	Threshold scenario comparison in `scripts/psychometric/programs/analysis/irt_roc_compare.R` with harmonization rules consolidated by batch outputs	`SSQ10_Harmonized_Scoring_Engine.rds`, ROC metric figures/tables
\(Y_i^H\) and prevalence alignment \(\Delta_p\)	Harmonized depression endpoint and drift vs PHQ-referenced prevalence	Endpoint performance summaries from IRT + ROC comparison outputs and Monte Carlo result summaries	`main_results_boot_10000_Isisekelo_Sempilo.rds`, `boot_10000_Isisekelo_Sempilo_PHQ-09_(Original)_results.rds`, ROC/performance figures

This crosswalk is scoped to Objective 2 implementation evidence in the public script bundle at scripts/psychometric/. Retired scripts are intentionally excluded from active SAP traceability in public release.

Section B: Objective 3 - Social Determinant Trajectories SAP

B1 Background and Rationale

Objective 3 models developmental trajectories of childhood social determinants and evaluates their longitudinal associations with harmonized depression outcomes. Trajectory methods are used to represent heterogeneity in exposure evolution across childhood windows, with explicit distinction between univariate and joint trajectory structures (Andruff et al. 2009; Nagin 2005; Muthén 2001).

B3 Objectives and Estimands

Let \(E_{it}\) denote the exposure measure for participant \(i\) at developmental time \(t\), and \(C_i\) the latent trajectory class of participant \(i\).

Primary estimands:

\[ \pi_c = \Pr(C_i = c), \quad c = 1,\dots,K, \]

Parameter definitions (Eq. B3.1):

\(\pi_c\): population proportion assigned to trajectory class \(c\).
\(C_i\): latent class membership for participant \(i\).
\(K\): total number of retained classes.

and class-conditional depression risk contrasts:

\[ \text{OR}_c = \frac{\Pr(Y_i^H=1 \mid C_i=c)/\Pr(Y_i^H=0 \mid C_i=c)} {\Pr(Y_i^H=1 \mid C_i=c_{ref})/\Pr(Y_i^H=0 \mid C_i=c_{ref})}. \]

Parameter definitions (Eq. B3.2):

\(\text{OR}_c\): odds ratio for depression in class \(c\) relative to reference class \(c_{ref}\).
\(Y_i^H\): harmonized depression endpoint.
\(C_i\): latent trajectory class membership.
\(c_{ref}\): pre-specified reference class.
\(\Pr(Y_i^H=1 \mid C_i=c)\): conditional probability of depression in class \(c\).
\(\Pr(Y_i^H=0 \mid C_i=c)\): conditional probability of no depression in class \(c\).
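
As a small worked instance of Eq. B3.2, the odds ratio can be computed directly from two class-conditional depression probabilities. The probabilities used here are invented for illustration.

```python
def odds(p: float) -> float:
    """Odds p / (1 - p) for a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def odds_ratio(p_class: float, p_ref: float) -> float:
    """Eq. B3.2: odds of depression in class c relative to the reference class."""
    return odds(p_class) / odds(p_ref)


# Hypothetical risks: 30% in class c versus 15% in the reference class.
or_c = odds_ratio(0.30, 0.15)
```

Here the risk ratio is 2 while the odds ratio is larger, a reminder that the two contrasts diverge when the outcome is not rare.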

Secondary estimands include posterior class-assignment quality and cross-window pathway consistency.

B4 Exposure Definitions and Time Windows

Exposure histories are organized into pre-specified developmental windows (0-5 years and 6-12 years), with harmonized coding of social determinants across cohorts. The objective is to preserve developmental ordering and minimize post-hoc window redefinition.

For the current Objective 3 analytical package, the active trajectory specifications are:

Univariate: ULWOM, ULWOF, UHSES
Joint: JPA, JPAES

The following specifications are intentionally retired from the active Objective 3 analysis set and are not part of current SAP implementation checks: UHC14, JPAC, JPACES.

B5 LCGA and Joint-Trajectory Strategy

Univariate LCGA is fitted per window and determinant family, followed by joint-trajectory integration where scientifically justified.

For participant \(i\) in class \(c\), the class-specific growth model is:

\[ E_{it} = \eta_{0c} + \eta_{1c}t + \eta_{2c}t^2 + \varepsilon_{it}, \quad \varepsilon_{it} \sim \mathcal{N}(0,\sigma_c^2). \]

Parameter definitions (Eq. B5.1):

\(E_{it}\): observed exposure value for participant \(i\) at time \(t\).
\(\eta_{0c}\): class-specific intercept (baseline level) for class \(c\).
\(\eta_{1c}\): class-specific linear slope.
\(\eta_{2c}\): class-specific quadratic slope.
\(\varepsilon_{it}\): residual term for participant \(i\) at time \(t\).
\(\sigma_c^2\): class-specific residual variance.
\(\mathcal{N}(0,\sigma_c^2)\): normal distribution with mean 0 and class-specific variance \(\sigma_c^2\).
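
Eq. B5.1 implies a class-specific mean curve and a Gaussian log-likelihood for each participant's exposure series. The sketch below evaluates both for two hypothetical classes. The time coding, parameter values, and observed series are assumptions chosen only to show the mechanics.

```python
import math
from typing import Sequence, Tuple

Eta = Tuple[float, float, float]  # (eta_0c, eta_1c, eta_2c)


def class_mean(t: float, eta0: float, eta1: float, eta2: float) -> float:
    """Class-specific quadratic mean trajectory at time t."""
    return eta0 + eta1 * t + eta2 * t ** 2


def class_loglik(times: Sequence[float], exposures: Sequence[float],
                 eta: Eta, sigma2: float) -> float:
    """Log-likelihood of one exposure series under class c (Eq. B5.1)."""
    ll = 0.0
    for t, e in zip(times, exposures):
        resid = e - class_mean(t, *eta)
        ll += -0.5 * (math.log(2.0 * math.pi * sigma2) + resid ** 2 / sigma2)
    return ll


times = [0, 1, 2, 3]                  # assumed time coding within a window
stable_low: Eta = (1.0, 0.0, 0.0)     # hypothetical flat class
rising: Eta = (1.0, 0.5, 0.1)         # hypothetical accelerating class
observed = [1.1, 1.5, 2.5, 3.3]       # hypothetical participant series

ll_stable = class_loglik(times, observed, stable_low, sigma2=0.25)
ll_rising = class_loglik(times, observed, rising, sigma2=0.25)
```

The class with the larger log-likelihood is the one whose mean curve sits closer to the observed series; combined with the class proportions, these likelihoods determine posterior class probabilities.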

Class-membership probabilities are estimated via multinomial logits:

\[ \Pr(C_i=c) = \frac{\exp(\alpha_c)}{\sum_{h=1}^{K}\exp(\alpha_h)}. \]

Parameter definitions (Eq. B5.2):

\(\Pr(C_i=c)\): probability that participant \(i\) belongs to class \(c\).
\(\alpha_c\): class-membership logit intercept for class \(c\).
\(K\): number of candidate classes in the fitted model.
\(\exp(\cdot)\): exponential function used to map logit-scale class parameters onto probabilities.
\(\sum_{h=1}^{K}\exp(\alpha_h)\): normalizing denominator summing over all candidate classes \(h\).
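
Eq. B5.2 is the softmax transformation of the class logits. A brief sketch with hypothetical logits follows; it also shows that adding a constant to every \(\alpha_c\) leaves the probabilities unchanged, which is why one class logit is conventionally fixed (for example at zero) for identification.

```python
import math
from typing import List, Sequence


def class_proportions(alpha: Sequence[float]) -> List[float]:
    """Map class logits to probabilities (Eq. B5.2)."""
    shift = max(alpha)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(a - shift) for a in alpha]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-class model with the first class as reference (alpha = 0).
pi = class_proportions([0.0, 0.7, -0.4])
pi_shifted = class_proportions([1.0, 1.7, 0.6])  # same logits plus a constant
```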

Model fitting is aligned with established mixture-model workflows, including three-step distal-outcome linkage where needed to reduce class-assignment bias (Asparouhov and Muthén 2014; Bakk, Oberski, and Vermunt 2016).

B6 Class Enumeration and Model Selection

Class enumeration combines information criteria and interpretability criteria:

BIC and sample-size adjusted BIC (Schwarz 1978)
entropy and posterior separation
bootstrap likelihood ratio test evidence (Nylund, Asparouhov, and Muthén 2007)
minimum class prevalence and substantive plausibility
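
Several of these criteria can be computed from a fitted model's log-likelihood and its matrix of posterior class probabilities. The sketch below uses the standard definitions of BIC, sample-size adjusted BIC (with the usual \((n+2)/24\) adjustment), relative entropy, and average posterior probability by modal class. The function names and inputs are hypothetical, and the bootstrap likelihood ratio test is not covered.

```python
import math
from typing import List, Sequence


def bic(loglik: float, n_params: int, n: int) -> float:
    return -2.0 * loglik + n_params * math.log(n)


def sabic(loglik: float, n_params: int, n: int) -> float:
    """Sample-size adjusted BIC: n is replaced by (n + 2) / 24."""
    return -2.0 * loglik + n_params * math.log((n + 2) / 24.0)


def relative_entropy(post: Sequence[Sequence[float]]) -> float:
    """1 minus normalized classification entropy; 1 means perfect separation."""
    n, k = len(post), len(post[0])
    h = -sum(p * math.log(p) for row in post for p in row if p > 0)
    return 1.0 - h / (n * math.log(k))


def average_posterior_probability(post: Sequence[Sequence[float]]) -> List[float]:
    """Mean posterior probability of the assigned class, by modal class."""
    k = len(post[0])
    sums, counts = [0.0] * k, [0] * k
    for row in post:
        modal = max(range(k), key=row.__getitem__)
        sums[modal] += row[modal]
        counts[modal] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Hypothetical posterior probabilities for four participants, two classes.
posterior = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.05, 0.95]]
entropy = relative_entropy(posterior)
avepp = average_posterior_probability(posterior)
```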

Selection is finalized only when statistical fit and scientific interpretability converge.

B7 Association Modeling and Robustness

Depression association models take the harmonized endpoint \(Y_i^H\) as the outcome and adjust for pre-specified confounders:

\[ \operatorname{logit}\Pr(Y_i^H=1) = \beta_0 + \sum_{c \ne c_{ref}}\beta_c\mathbb{1}(C_i=c) + \boldsymbol{\gamma}^\top\mathbf{Z}_i. \]

Parameter definitions (Eq. B7.1):

\(\Pr(Y_i^H=1)\): probability of harmonized depression for participant \(i\).
\(\beta_0\): model intercept.
\(\beta_c\): class-effect coefficient for class \(c\) relative to \(c_{ref}\).
\(\mathbb{1}(C_i=c)\): indicator for class membership in class \(c\).
\(\mathbf{Z}_i\): confounder vector for participant \(i\).
\(\boldsymbol{\gamma}\): confounder-effect coefficient vector.
\(\operatorname{logit}(\cdot)\): log-odds transformation, \(\log\{p/(1-p)\}\).
\(\sum_{c \ne c_{ref}}\): sum across non-reference trajectory classes.
\(\boldsymbol{\gamma}^\top \mathbf{Z}_i\): linear predictor contribution of the confounder vector.
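
The link between the coefficients in Eq. B7.1 and the class contrast in Eq. B3.2 can be made explicit with a short derivation. It assumes the contrast is evaluated at a fixed confounder value \(\mathbf{z}\), a symbol introduced here only for the derivation:

\[ \frac{\Pr(Y_i^H=1 \mid C_i=c, \mathbf{Z}_i=\mathbf{z})/\Pr(Y_i^H=0 \mid C_i=c, \mathbf{Z}_i=\mathbf{z})} {\Pr(Y_i^H=1 \mid C_i=c_{ref}, \mathbf{Z}_i=\mathbf{z})/\Pr(Y_i^H=0 \mid C_i=c_{ref}, \mathbf{Z}_i=\mathbf{z})} = \frac{\exp(\beta_0 + \beta_c + \boldsymbol{\gamma}^\top\mathbf{z})}{\exp(\beta_0 + \boldsymbol{\gamma}^\top\mathbf{z})} = \exp(\beta_c). \]

Under this model the confounder-adjusted odds ratio for class \(c\) is therefore \(\exp(\beta_c)\) at every value of \(\mathbf{z}\). It need not equal the unadjusted \(\text{OR}_c\) of Eq. B3.2, because odds ratios are not collapsible over covariates.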

Robustness analyses evaluate class-number alternatives, coding variants, and assignment-uncertainty sensitivity.

B8 Planned Outputs and Reporting

Planned outputs include trajectory-profile plots, class diagnostics, class-assignment summaries, and adjusted depression-risk tables for both univariate and joint models.

B9 Symbol-to-Implementation Crosswalk (Objective 3)

To ensure direct traceability from the mathematical SAP definitions to executable analysis components, Table B9.1 maps each core symbol or equation block to its implementation locus and output artifact family.

SAP quantity	Mathematical definition in this SAP	Implementation locus (Objective 3 pipeline)	Primary output artifact(s)
\(\pi_c\)	\(\Pr(C_i = c)\)	LCGA class-enumeration and class-proportion extraction in `run_lcga.R`; class diagnostics assembled in `report_univariate_lcga.R` and `report_joint_lcga.R`	`01_fit_summary_.html`, `06a_Final_Solution_Summary_.html`
\(C_i\)	latent trajectory class for participant \(i\)	LCGA solution object (`lcga_results_time_*.rds`) and posterior-assignment extraction in reporting scripts	`02_AvePP_.html`, `06a_Final_Solution_Summary_.html`
\(\eta_{0c},\eta_{1c},\eta_{2c}\)	class-specific intercept and slope parameters	growth-factor estimation from selected LCGA model in reporting scripts	`04_Estimates_*.html`
Posterior classification quality	posterior separation and assignment precision	posterior-class probability summaries and overlap routines in reporting scripts	`02_AvePP_.html`, `06c_Class_Overlap_Summary_.html` (joint models)
Class-selection decision (\(K\))	model-enumeration choice under fit and plausibility criteria	selection logic in `lcga_selection.R`; execution loop in `run_lcga.R`	`01_fit_summary_*.html`, model-selection row highlighted in tables
Sensitivity diagnostics	misspecification and reporting-standard checks	sensitivity/checklist builders in reporting scripts	`07_Misspecification_Sensitivity_.html`, `08_GRoLTS_Checklist_.html`
\(Y_i^H\) contrast model	\(\operatorname{logit}\Pr(Y_i^H=1)=\beta_0+\sum_{c\ne c_{ref}}\beta_c\mathbb{1}(C_i=c)+\boldsymbol{\gamma}^\top\mathbf{Z}_i\)	class-linked outcome association workflow using selected class structures and confounder adjustment	Objective 3 association tables/figures in trajectory deliverables and SAP-linked outputs

The crosswalk above is restricted to Objective 3 trajectory components and does not include Objective 5 mediation decomposition modules.

B10 Consistency Stamp (Objective 3)

Consistency stamp date: 2026-03-30
Scope: Objective 3 (Social Determinant Trajectories) only.

This consistency stamp records that, at this release checkpoint:

Supplementary glossary definitions and Section B3 estimands are aligned with the active trajectory implementation path in the Objective 3 pipeline.
The active model family set is explicitly constrained to ULWOM, ULWOF, UHSES, JPA, and JPAES; retired specifications (UHC14, JPAC, JPACES) are excluded from active SAP compliance checks.
Class-selection, posterior-quality diagnostics, sensitivity outputs, and trajectory reporting artifacts are generated in a consistent naming framework (01_ through 08_ table series per active model-window combination).
Remaining Objective 3 close-out work is documentation-depth only (parameter-level traceability and versioned sign-off), not redesign of the analytical model family.

Section C: Objective 4 - Behavioral Mediators SAP

C1 Background and Rationale

Objective 4 identifies behavioral pathways through which childhood social determinant trajectories may influence depression risk. This section defines mediator-screening and mediator-structuring analyses that precede full causal mediation decomposition.

C3 Objectives and Estimands

Primary estimands are adjusted mediator-outcome associations conditional on exposure trajectory class and confounders.

For mediator \(j\), with participant-level value \(M_{ij}\):

\[ \operatorname{logit}\Pr(Y_i^H=1) = \alpha_0 + \alpha_1 M_{ij} + \alpha_2 C_i + \boldsymbol{\alpha}_3^\top \mathbf{Z}_i. \]

Parameter definitions (Eq. C3.1):

\(\Pr(Y_i^H=1)\): probability of harmonized depression for participant \(i\).
\(\alpha_0\): model intercept.
\(\alpha_1\): association parameter for mediator \(M_{ij}\).
\(M_{ij}\): value of mediator \(j\) for participant \(i\).
\(\alpha_2\): effect parameter for trajectory representation \(C_i\).
\(C_i\): trajectory class/exposure representation.
\(\mathbf{Z}_i\): confounder vector.
\(\boldsymbol{\alpha}_3\): confounder-effect coefficient vector.
\(\operatorname{logit}(\cdot)\): log-odds transformation, \(\log\{p/(1-p)\}\).
\(\boldsymbol{\alpha}_3^\top \mathbf{Z}_i\): linear predictor contribution of measured confounders.

Secondary estimands include mediator inter-association structure and ranking stability across adjustment sets.

C4 Mediator Definitions and Coding

Candidate mediators are operationalized from harmonized adolescent behavioral domains (e.g., school attachment, social support, violence exposure, food insecurity, and related behavioral indicators). Recoding, scaling, and missingness handling follow pre-specified rules from the governed data-preparation pipeline.

C5 Modeling Strategy

Three layers are prespecified:

unadjusted models (\(Y_i^H \sim M_{ij}\)),
confounder-adjusted models (\(Y_i^H \sim M_{ij} + \mathbf{Z}_i\)),
jointly adjusted mediator models (\(Y_i^H \sim \mathbf{M}_i + \mathbf{Z}_i\)).

Model-term definitions: in these compact model expressions, \(Y_i^H\) is the harmonized depression endpoint, \(M_{ij}\) is a single mediator for participant \(i\), \(\mathbf{M}_i\) is the vector of mediators entered jointly, \(\mathbf{Z}_i\) is the confounder vector, and the symbol \(\sim\) indicates the set of predictors included in the working regression model.
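
The three layers translate into a fixed set of working models: two per mediator plus one joint model. The sketch below enumerates the predictor sets. The variable names are hypothetical placeholders loosely based on the domains listed in C4 and do not reflect the governed variable names.

```python
from typing import List, Sequence, Tuple

Spec = Tuple[str, str, List[str]]  # (layer, focal mediator, predictor list)


def layer_specifications(mediators: Sequence[str],
                         confounders: Sequence[str]) -> List[Spec]:
    """Enumerate predictor sets for the three prespecified layers."""
    specs: List[Spec] = []
    for m in mediators:
        specs.append(("unadjusted", m, [m]))
        specs.append(("confounder_adjusted", m, [m] + list(confounders)))
    # A single joint model enters all mediators together with the confounders.
    specs.append(("jointly_adjusted", "all", list(mediators) + list(confounders)))
    return specs


# Hypothetical variable names, for illustration only.
mediators = ["school_attachment", "social_support",
             "violence_exposure", "food_insecurity"]
confounders = ["age", "sex"]
specs = layer_specifications(mediators, confounders)
```

Enumerating the specifications up front makes the number of fitted models explicit (here \(2 \times 4 + 1 = 9\)) before any estimates are inspected.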

Given potential mediator collinearity, interpretation emphasizes effect-direction consistency, precision, and stability under alternate adjustment sets rather than single-model significance alone.

C6 Mediator Prioritization and Phenotype Planning

Mediators are prioritized for Objective 5 if they show:

stable adjusted associations with \(Y_i^H\),
coherent epidemiologic interpretation,
acceptable overlap structure for mediation decomposition.

Mediator phenotype summaries are treated as supportive representation tools and not substitutes for prespecified individual mediator effects.

C7 Sensitivity and Robustness

Sensitivity analyses include alternate coding schemes, complete-case versus missing-data-aware pipelines, and subgroup checks by age and sex strata. E-value-style bias-sensitivity summaries may be reported for key associations where decision-relevant (Mathur et al. 2022; VanderWeele and Ding 2017).
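
For reference, the E-value for a risk ratio \(RR \ge 1\) is \(RR + \sqrt{RR(RR-1)}\), with protective estimates inverted first (VanderWeele and Ding 2017). Because the models in this section yield odds ratios, the sketch below also includes the square-root approximation used when the outcome is common; for a rare outcome the odds ratio can be passed to the risk-ratio function directly. The function names are hypothetical, and this is not the interface of any particular package.

```python
import math


def e_value_rr(rr: float) -> float:
    """E-value for a point estimate on the risk-ratio scale."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective estimates are inverted before applying the formula
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_or_common_outcome(odds_ratio: float) -> float:
    """E-value for an odds ratio when the outcome is common (RR ~ sqrt(OR))."""
    return e_value_rr(math.sqrt(odds_ratio))


ev_rr = e_value_rr(2.0)
ev_or = e_value_or_common_outcome(4.0)
```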

C8 Planned Outputs and Reporting

Outputs include mediator-association tables (unadjusted and adjusted), mediator inter-association summaries, and a transparent shortlist of mediators carried into Objective 5 causal mediation modeling.

Supplementary: Glossary of Symbols and Parameters

This supplementary glossary consolidates symbol and parameter notation for quick reference. Equation-specific parameter definitions remain directly below each model/equation in Sections A-C.

Objective 2 Glossary

\(i\): participant index.
\(j\): item index.
\(k\): ordered-category threshold index for item \(j\).
\(Y_i^{PHQ}\): PHQ-referenced probable-depression indicator for participant \(i\).
\(Y_i^H\): harmonized depression indicator for participant \(i\).
\(\hat{\theta}_i\): participant-specific estimated latent depression score (EAP score from joint calibration).
\(\tau^\ast\): selected latent threshold used to classify \(Y_i^H\).
\(\Delta_p\): prevalence drift between harmonized and PHQ-referenced endpoint prevalence.
\(\mathbf{x}_i\): vector of observed ordinal item responses for participant \(i\).
\(\theta_i\): latent depression factor in structural models.
\(\boldsymbol{\lambda}\): factor-loading parameter vector (or matrix).
\(\boldsymbol{\tau}\): item-threshold parameter vector (or matrix) in CFA parameterization.
\(a_j\): GRM discrimination parameter for item \(j\).
\(b_{jk}\): GRM category-threshold (difficulty/location) parameter for item \(j\), threshold \(k\).

Objective 3 Glossary

\(i\): participant index.
\(t\): time index within developmental window.
\(c, h\): latent-class indices.
\(K\): total number of latent classes under a candidate model.
\(E_{it}\): observed exposure measure for participant \(i\) at time \(t\).
\(C_i\): latent trajectory class membership for participant \(i\).
\(\pi_c\): marginal class proportion for class \(c\).
\(Y_i^H\): harmonized binary depression outcome from Objective 2.
\(c_{ref}\): pre-specified reference class for class-contrast models.
\(\eta_{0c}\): class-specific intercept (initial level) parameter.
\(\eta_{1c}\): class-specific linear time-slope parameter.
\(\eta_{2c}\): class-specific quadratic time-slope parameter.
\(\varepsilon_{it}\): residual term for participant \(i\) at time \(t\).
\(\sigma_c^2\): class-specific residual variance.
\(\alpha_c\): multinomial logit intercept parameter governing class-membership probability.
\(\beta_0\): outcome-model intercept.
\(\beta_c\): class-effect coefficient for class \(c\) relative to \(c_{ref}\).
\(\mathbf{Z}_i\): confounder vector for participant \(i\).
\(\boldsymbol{\gamma}\): confounder-effect coefficient vector.
\(\text{OR}_c\): odds-ratio contrast in depression risk for class \(c\) versus \(c_{ref}\).

Objective 4 Glossary

\(i\): participant index.
\(j\): mediator index.
\(Y_i^H\): harmonized binary depression outcome.
\(M_{ij}\): value of mediator \(j\) for participant \(i\).
\(C_i\): trajectory-class summary or exposure-trajectory representation for participant \(i\).
\(\mathbf{Z}_i\): confounder vector.
\(\alpha_0\): model intercept.
\(\alpha_1\): regression coefficient for mediator \(M_{ij}\).
\(\alpha_2\): regression coefficient for trajectory-class representation \(C_i\).
\(\boldsymbol{\alpha}_3\): coefficient vector for confounders \(\mathbf{Z}_i\).
\(\mathbf{M}_i\): vector of candidate mediators in jointly adjusted models.

Remaining Sections

Section D: Objective 5 - Causal Mediation SAP (in progress). This section will formalize natural direct and indirect effect estimands, identification assumptions, and sensitivity procedures for unmeasured mediator-outcome confounding (Imai, Keele, and Yamamoto 2010; Valeri and VanderWeele 2013; VanderWeele and Hernán 2012; VanderWeele et al. 2016).

References

Andruff, H., N. Carraro, A. Thompson, P. Gaudreau, and B. Louvet. 2009. “Latent Class Growth Modelling: A Tutorial.” Tutorials in Quantitative Methods for Psychology 5: 11–24. https://doi.org/10.20982/tqmp.05.1.p011.

Asparouhov, T., and B. Muthén. 2014. “Auxiliary Variables in Mixture Modeling: Three-Step Approaches Using Mplus.” Structural Equation Modeling: A Multidisciplinary Journal 21 (3): 329–41. https://doi.org/10.1080/10705511.2014.915181.

Bakk, Z., D. L. Oberski, and J. K. Vermunt. 2016. “Relating Latent Class Membership to Continuous Distal Outcomes: Improving the LTB Approach and a Modified Three-Step Implementation.” Structural Equation Modeling: A Multidisciplinary Journal 23 (2): 278–89. https://doi.org/10.1080/10705511.2015.1049698.

Chalmers, R. Philip. 2012. “mirt: A Multidimensional Item Response Theory Package for the R Environment.” Journal of Statistical Software 48 (6): 1–29. https://doi.org/10.18637/jss.v048.i06.

Chibanda, Dixon et al. 2019. “Validation of the 8-Item Shona Symptom Questionnaire (SSQ-8) in Zimbabwe.” In AIDSImpact 2019 Conference. https://www.aidsimpact.com/abstracts/-KoQrAipsu5F2bWuNDBA.

Efron, Bradley, and Robert J Tibshirani. 1994. An Introduction to the Bootstrap. CRC Press. https://doi.org/10.1201/9780429246593.

Haney, E., K. Singh, C. Nyamukapa, S. Gregson, L. Robertson, L. Sherr, and C. T. Halpern. 2014. “One Size Does Not Fit All: Psychometric Properties of the Shona Symptom Questionnaire (SSQ) Among Adolescents and Young Adults in Zimbabwe.” Journal of Affective Disorders 167: 358–67. https://doi.org/10.1016/j.jad.2014.06.015.

Holgado-Tello, Francisco Pablo, Salvador Chacón-Moscoso, Isabel Barbero-García, and Enrique Vila-Abad. 2010. “Polychoric Versus Pearson Correlations in Exploratory and Confirmatory Factor Analysis of Ordinal Variables.” Quality & Quantity 44: 153–66. https://doi.org/10.1007/s11135-008-9190-y.

Hu, Li-tze, and Peter M Bentler. 1999. “Cutoff Criteria for Fit Indexes in Covariance Structure Analysis: Conventional Criteria Versus New Alternatives.” Structural Equation Modeling: A Multidisciplinary Journal 6 (1): 1–55. https://doi.org/10.1080/10705519909540118.

Imai, K., L. Keele, and T. Yamamoto. 2010. “Identification, Inference and Sensitivity Analysis for Causal Mediation Effects.” Statistical Science 25 (1): 51–71. https://doi.org/10.1214/10-STS321.

Kroenke, K., R. L. Spitzer, and J. B. Williams. 2001. “The PHQ-9: Validity of a Brief Depression Severity Measure.” Journal of General Internal Medicine 16 (9): 606–13. https://doi.org/10.1046/j.1525-1497.2001.016009606.x.

MacCallum, Robert C, Shaobo Zhang, Kristopher J Preacher, and Derek D Rucker. 2002. “On the Practice of Dichotomization of Quantitative Variables.” Psychological Methods 7 (1): 19–40. https://doi.org/10.1037/1082-989X.7.1.19.

Mathur, M. B., L. H. Smith, K. Yoshida, P. Ding, and T. J. VanderWeele. 2022. “E-Values for Effect Heterogeneity and Approximations for Causal Interaction.” International Journal of Epidemiology 51 (4): 1268–75. https://doi.org/10.1093/ije/dyac073.

Muthén, Bengt O. 2001. “Growth Curve Modeling.” International Encyclopedia of the Social & Behavioral Sciences, 6281–87.

Nagin, D. S. 2005. Group-Based Modeling of Development. Harvard University Press. https://doi.org/10.4159/9780674041318.

Nylund, Karen L, Tihomir Asparouhov, and Bengt O Muthén. 2007. “Deciding on the Number of Classes in Latent Class Analysis and Growth Mixture Modeling: A Monte Carlo Simulation Study.” Structural Equation Modeling: A Multidisciplinary Journal 14 (4): 535–69.

Patel, Vikram, Essie Simunyu, Fungisai Gwanzura, Glyn Lewis, and Anthony H Mann. 1997. “The Shona Symptom Questionnaire: The Development of an Indigenous Measure of Common Mental Disorders in Harare.” Acta Psychiatrica Scandinavica 95 (6): 469–75.

Putnick, Diane L, and Marc H Bornstein. 2016. “Measurement Invariance Conventions and Reporting: The State of the Art and Future Directions for Psychological Research.” Developmental Review 41: 71–90. https://doi.org/10.1016/j.dr.2016.06.004.

Samejima, Fumiko. 1969. “Estimation of Latent Ability Using a Response Pattern of Graded Scores.” Psychometrika Monograph Supplement 17 (4): 1–100. https://doi.org/10.1007/BF02290599.

Schwarz, Gideon. 1978. “Estimating the Dimension of a Model.” The Annals of Statistics 6 (2): 461–64.

Valeri, L., and T. J. VanderWeele. 2013. “Mediation Analysis Allowing for Exposure-Mediator Interactions and Causal Interpretation: Theoretical Assumptions and Implementation with SAS and SPSS Macros.” Psychological Methods 18 (2): 137–50. https://doi.org/10.1037/a0031034.

Vandenberg, R. J., and C. E. Lance. 2000. “A Review and Synthesis of the Measurement Invariance Literature: Suggestions, Practices, and Recommendations for Organizational Research.” Organizational Research Methods 3 (1): 4–70. https://doi.org/10.1177/109442810031002.

VanderWeele, T. J., and P. Ding. 2017. “Sensitivity Analysis in Observational Research: Introducing the E-Value.” Annals of Internal Medicine 167 (4): 268–74. https://doi.org/10.7326/M16-2607.

VanderWeele, T. J., Y. Li, A. C. Tsai, and I. Kawachi. 2016. “Outcome-Wide Longitudinal Designs for Causal Inference: A New Template for Empirical Studies in Public Health.” Statistical Science 31 (6): 2016. https://doi.org/10.1214/16-STS581.

VanderWeele, Tyler J, and Miguel A Hernán. 2012. “Invited Commentary: Causal Diagrams and Measurement Bias.” American Journal of Epidemiology 175 (12): 1303–10.

Woods, Carol M. 2009. “Empirical Histograms in Item Response Theory with Differential Item Functioning.” Educational and Psychological Measurement 69 (1): 102–25. https://doi.org/10.1177/0013164408318760.

Wu, Hao, and Ryne Estabrook. 2016. “Identification of Confirmatory Factor Analysis Models of Different Levels of Invariance for Ordered Categorical Outcomes.” Psychometrika 81: 1014–45. https://doi.org/10.1007/s11336-016-9506-0.