Project Metadata

Metadata Source

This page is extracted and adapted from the dataset preparation source index.

Current scope is limited to two data dictionaries:

  • Childhood social exposure panel
  • Integrated multi-study mental health panel

All derived R objects for this page are generated locally during private site builds and are not distributed through the public website.

Dataset Summary

Field | Value
Project | CO-LUMINATE
Population | South African youth aged 13-24
Linked sample size | 5,226
Data source backbone | AHRI HDSS
Core mental health outcomes | Depression symptoms (SSQ-14, PHQ-9)
Priority social determinants | SES and caregiver co-residency
Cohorts nested in AHRI HDSS | DREAMS, Multilevel, Isisekelo Sempilo, TasP, Thetha Nami
Youth Co-Creators | 11

Study Design Extract

The CO-LUMINATE linked dataset connects individual-level mental health outcomes from cohort studies to household-level longitudinal records from the Health and Demographic Surveillance System (HDSS). The objective is to estimate the direct and mediated longitudinal relationships between social determinants and depressive symptoms among youth aged 13-24.

The analytical design combines:

  1. HDSS longitudinal household exposure histories.
  2. Cohort-based depression outcomes measured with SSQ-14 and PHQ-9.
  3. Linkage across five cohorts nested within AHRI HDSS.
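
The linkage step can be illustrated with a minimal sketch. It assumes both sources share the USUBJID identifier listed in the data dictionaries; the sample rows, the COHORT column, and the function name are hypothetical and do not reflect the private pipeline.

```python
from collections import defaultdict

# Hypothetical HDSS exposure rows: one row per participant per exposure age.
exposure_rows = [
    {"USUBJID": 48, "EXPAGE": 1, "HHSES": "Low", "LIVEDWOM": "Yes"},
    {"USUBJID": 48, "EXPAGE": 0, "HHSES": "Low", "LIVEDWOM": "No"},
    {"USUBJID": 112, "EXPAGE": 0, "HHSES": "High", "LIVEDWOM": "No"},
]

# Hypothetical cohort outcome rows (COHORT is an illustrative column name).
outcome_rows = [
    {"USUBJID": 48, "COHORT": "DREAMS", "DPBN": "Yes"},
    {"USUBJID": 305, "COHORT": "TasP", "DPBN": "No"},
]


def link_outcomes_to_exposures(outcomes, exposures):
    """Attach each participant's age-ordered exposure history to their outcome row."""
    history = defaultdict(list)
    for row in exposures:
        history[row["USUBJID"]].append(row)

    linked = []
    for outcome in outcomes:
        rows = sorted(history.get(outcome["USUBJID"], []), key=lambda r: r["EXPAGE"])
        if rows:  # keep only participants present in both sources
            linked.append({**outcome, "EXPOSURE_HISTORY": rows})
    return linked


linked = link_outcomes_to_exposures(outcome_rows, exposure_rows)
```

In this sketch only participant 48 is linked, because 305 has no exposure history and 112 has no outcome row.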

Cohorts Included

  • DREAMS
  • Multilevel
  • Isisekelo Sempilo
  • Treatment as Prevention (TasP)
  • Thetha Nami

Object Snapshot

Dataset | Rows | Columns
Childhood Social Exposure Panel | 52,463 | 31
Integrated Multi-study Mental Health Panel | 5,616 | 33

CONSORT Flow

The CONSORT flow is participant-focused: a participant is retained only if they have both a depression endpoint and social determinants of health (SDoH) trajectory coverage.
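
The retention rule can be expressed as a simple two-condition filter. The sketch below is illustrative only: the function name, the ID sets, and the exclusion-reason wording are assumptions, not the project's actual CONSORT code.

```python
def consort_retain(participants, has_endpoint, has_trajectory):
    """Split participant IDs into retained IDs and excluded IDs with reasons."""
    retained = []
    excluded = {}
    for pid in participants:
        reasons = []
        if pid not in has_endpoint:
            reasons.append("no depression endpoint")
        if pid not in has_trajectory:
            reasons.append("no SDoH trajectory coverage")
        if reasons:
            excluded[pid] = reasons
        else:
            retained.append(pid)
    return retained, excluded


# Hypothetical participant IDs for illustration.
retained, excluded = consort_retain(
    participants=[48, 112, 305, 410],
    has_endpoint={48, 305, 410},
    has_trajectory={48, 112, 410},
)
```

Recording the reason for each exclusion keeps the counts needed to draw the flow diagram alongside the retained sample.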

The row and column counts in the Object Snapshot above reflect the inputs to the current metadata object build. The dictionary definitions below are extracted from the dataset preparation master dictionary specification.

Data Dictionary: Childhood Social Exposure Panel (Variables of Interest by Role)

Variable | Label | Type | Value Labels
Role: Identifier
USUBJID | Unique Subject Identifier | numeric | Min-Max: 48 to 260880
Role: Time Variable
EXPAGE | Exposure Age | numeric | Min-Max: 0 to 12
Role: Exposure (SDoH Trajectory Input)
LIVEDWOM | Lived Without Mother | factor | No; Yes
LIVEDWOF | Lived Without Father | factor | No; Yes
HHCHLD14O | Overcrowded (0–14y children > 3) | factor | No; Yes
HHSES | Household Socioeconomic Status | factor | Low; Middle; High
Role: Baseline Factor
MAGE | Mother's Age | numeric | Min-Max: 11 to 82
MEDU | Mother's Education | factor | None/Primary; Secondary; Completed Matric
MHIV | Mother's HIV Status | factor | Error; Negative; Positive
MPTS | Mother's Partner Status | factor | Missing/Refused; No partner; Married; Regular; Casual; Widowed/Separated
MMIG | Mother's Ever Migrated (Previous 1 Year) | character |
FAGE | Father's Age | numeric | Min-Max: 10 to 80
FEDU | Father's Education | factor | None/Primary; Secondary; Completed Matric
FHIV | Father's HIV Status | factor | Error; Negative; Positive
FPTS | Father's Partner Status | factor | Missing/Refused; No partner; Married; Regular; Casual; Widowed/Separated
FMIG | Father's Ever Migrated (Previous 1 Year) | character |
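
A dictionary like the one above can double as a validation specification. The following sketch encodes a few of the listed variables and checks one row against them. It assumes the Min-Max entries can be treated as allowed ranges, which the source does not state; the `DICTIONARY` structure and `validate_row` function are hypothetical.

```python
# Subset of the dictionary above, encoded as a validation spec.
# Assumption: Min-Max values are treated as allowed bounds.
DICTIONARY = {
    "EXPAGE": {"type": "numeric", "min": 0, "max": 12},
    "LIVEDWOM": {"type": "factor", "levels": ["No", "Yes"]},
    "LIVEDWOF": {"type": "factor", "levels": ["No", "Yes"]},
    "HHSES": {"type": "factor", "levels": ["Low", "Middle", "High"]},
}


def validate_row(row, dictionary=DICTIONARY):
    """Return a list of human-readable problems found in one data row."""
    problems = []
    for name, spec in dictionary.items():
        if name not in row:
            problems.append(f"{name}: missing")
            continue
        value = row[name]
        if spec["type"] == "numeric":
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not spec["min"] <= value <= spec["max"]:
                problems.append(f"{name}: {value!r} outside {spec['min']} to {spec['max']}")
        elif value not in spec["levels"]:
            problems.append(f"{name}: {value!r} not in {spec['levels']}")
    return problems


good = validate_row({"EXPAGE": 4, "LIVEDWOM": "No", "LIVEDWOF": "Yes", "HHSES": "Middle"})
bad = validate_row({"EXPAGE": 13, "LIVEDWOM": "No", "LIVEDWOF": "Maybe"})
```

Here `good` is empty, while `bad` reports an out-of-range EXPAGE, an unknown LIVEDWOF level, and a missing HHSES.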

Data Dictionary: Integrated Multi-study Mental Health Panel (Variables of Interest by Role)

Variable | Label | Type | Value Labels
Role: Identifier
USUBJID | Unique Subject Identifier | numeric | Min-Max: 48 to 260880
Role: Outcome
DPBN | Depressed | factor | No; Yes
Role: Mediator
DNKA | Drank Alcohol | factor | No; Yes
ESXC | Ever Had Sex | factor | No; Yes
FDSC | Food Insecure | factor | No; Yes
GOVG | No Government Grant | factor | No; Yes
VLNC | Experienced Violence | factor | No; Yes
SCSP | Has Social Support | factor | No; Yes
SCHL | In School | factor | No; Yes
Role: Mediator-Outcome Confounder
SEX | Sex | factor | Male; Female
AGE | Age (Years) | numeric | Min-Max: 13 to 24
HIVS | HIV Positive | factor | Negative; Positive
ORPH | Orphanhood Status | factor | Both Parents Alive; One Parent Deceased; Both Parents Deceased
EXTMG | External Migration | factor | No; Yes
RURAL | Reside in Rural Area | factor | No; Yes
CLYR | Calendar Year | numeric | Min-Max: 0 to 10

Data Dictionary: Integrated Multi-study Mental Health Panel (Derived Variables)

Univariate and joint latent class growth analysis (LCGA) models were fitted in the private analytical pipeline. The derived-variable summary below presents those model outputs in a transposed format (one field per line) for readability.

LCGA Model Inventory

Model Family | Time Window | Derived Output | Public Label
Univariate LCGA | Early childhood (0-5 years) | Early-window class assignments | ELCDV
Univariate LCGA | Middle-late childhood (6-12 years) | Middle-late-window class assignments | MDCDV
Joint LCGA | 0-5 years | Joint trajectory class assignments for early childhood window | ELCDV (joint)
Joint LCGA | 6-12 years | Joint trajectory class assignments for middle-late childhood window | MDCDV (joint)

Derived Variable Summary

Field | Value

DPBN
Label | Depressed
Type | factor
Value Labels | No; Yes
Source Variables | PHQBIN, irt_joint_models.rds
Derivation | PHQBIN >= 10
Status | Implemented in dataset preparation merge

ELCDV
Label | Early Childhood Exposure Trajectory Class (Ages 0-5)
Type | character
Value Labels | Character class label from the selected trajectory solution
Source Variables | LCGA trajectory solution (0-5 years): parental absence + household SES
Derivation | Assigned from the selected trajectory model for early childhood
Status | Implemented in the childhood class-derivation workflow

MDCDV
Label | Middle-Late Childhood Exposure Trajectory Class (Ages 6-12)
Type | character
Value Labels | Character class label from the selected trajectory solution
Source Variables | LCGA trajectory solution (6-12 years): parental absence + household SES
Derivation | Assigned from the selected trajectory model for middle-late childhood
Status | Implemented in the childhood class-derivation workflow
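
The DPBN derivation rule (PHQBIN >= 10) amounts to a single threshold comparison. The sketch below assumes PHQBIN holds a numeric score on a PHQ-9-type scale and that missing scores should stay missing; neither assumption is confirmed by the source, and the function name is hypothetical.

```python
DEPRESSION_CUTOFF = 10  # taken from the dictionary entry "PHQBIN >= 10"


def derive_dpbn(phqbin):
    """Map a numeric PHQBIN score to the DPBN factor levels "No" / "Yes".

    Assumption: a missing score (None) yields a missing DPBN value.
    """
    if phqbin is None:
        return None
    return "Yes" if phqbin >= DEPRESSION_CUTOFF else "No"


# Hypothetical scores for illustration.
scores = [3, 9, 10, 17, None]
dpbn = [derive_dpbn(s) for s in scores]
```

Because the rule uses >=, a score exactly at the cut-off is classified as "Yes".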

Dataset Name Reference

Dataset Name | Readable Name
dt_childhood_exposure | Childhood Social Exposure Panel
dt_multistudies | Integrated Multi-study Mental Health Panel
dt_psychometric | Psychometric Bridge Source Data
dt_dreams_multi_ssq | SSQ-only Cohort Response Data