Butterfly Effect

Institutional variable shortlist

The field-level handoff to send after a positive first reply.

This is the operational version of the data requirements pack: what to request first, what can be substituted safely, and which tables matter most for the current Butterfly Effect bottlenecks.

Use order

The project does not need every modality on day one. It needs a clean participant-night core, repeated burden outcomes, and just enough context to avoid false interpretation.

  • Tier 1: route-ready
  • Tier 2: transport-ready
  • Tier 3: physiology-rich

  • Highest-value table: `night_sleep`
  • Highest-value outcome: `stress_0_10`
  • Main unlock: night transport

Send first

  • `participant.csv`
  • `night_sleep.csv`
  • one repeated outcome table
  • `day_confounders.csv` if available
  • raw sidecars after join validation
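
The join validation mentioned in the last item could look roughly like the sketch below. It checks that every `night_sleep` row points at a known participant and that no participant-night key appears twice. The column names `participant_id` and `night_date_local` and the sample rows are assumptions made for illustration, not the project's actual schema or data.

```python
import csv
import io
from collections import Counter

# Hypothetical column names; substitute whatever the institution actually uses.
ID_COL = "participant_id"
NIGHT_COL = "night_date_local"

def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def validate_join(participant_rows, night_rows):
    """Report night rows that cannot join cleanly to the participant table."""
    known_ids = {row[ID_COL] for row in participant_rows}
    # Night rows whose participant ID is missing from participant.csv.
    orphan_ids = sorted({row[ID_COL] for row in night_rows if row[ID_COL] not in known_ids})
    # Participant-night keys that occur more than once.
    key_counts = Counter((row[ID_COL], row[NIGHT_COL]) for row in night_rows)
    duplicate_keys = sorted(key for key, count in key_counts.items() if count > 1)
    return {"orphan_ids": orphan_ids, "duplicate_keys": duplicate_keys}

# Made-up sample data with one duplicate night and one unknown participant.
participants = read_rows("participant_id,age,sex\nP001,34,F\nP002,41,M\n")
nights = read_rows(
    "participant_id,night_date_local,sleep_duration_min\n"
    "P001,2024-03-01,412\n"
    "P001,2024-03-01,398\n"
    "P003,2024-03-01,450\n"
)
report = validate_join(participants, nights)
print(report)
```

If both lists come back empty, the core tables join cleanly and the raw sidecars can follow.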

Tables

The smallest defensible institutional handoff

These are the tables that matter most if the current goal is one serious pilot rather than a broad exploratory data grab.

`participant`

  • stable pseudonymized ID
  • source lineage
  • age and sex
  • timezone if possible

`night_sleep`

  • participant-night key
  • sleep duration
  • continuity
  • HR or HRV
  • movement

`day_emotion`

  • local report date
  • stress rating preferred (`stress_0_10`)
  • GAD-7 or PHQ-9 if available
  • major stressor flag if available

`day_symptoms`

  • fatigue
  • pain
  • muscle tension
  • rest quality

`day_confounders`

  • caffeine
  • alcohol
  • exercise
  • naps
  • illness or medication changes

Rules

What can be substituted safely in a first pilot

These substitutions are acceptable only when they preserve participant-night linkage and keep the interpretation problem honest.

Acceptable substitutions

  • `sleep_midpoint_local` in place of a missing onset time, provided offset time or total sleep time exists
  • `heart_rate_mean_bpm` when raw IBI is not shareable
  • `fragmentation_index` when raw movement is not shareable
  • `gad7_total` and `phq9_total` instead of free-text questionnaire dumps
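
To illustrate why the midpoint substitution needs offset or total sleep time alongside it, here is a minimal sketch that recovers an approximate onset from `sleep_midpoint_local`. It assumes the midpoint is defined as halfway between onset and offset, and, in the total-sleep-time branch, that sleep was roughly continuous (wake after sleep onset is ignored). The function name `estimate_onset` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta

def estimate_onset(midpoint, offset=None, total_sleep_minutes=None):
    """Approximate sleep onset from the midpoint plus one other field.

    Assumption: midpoint = onset + (offset - onset) / 2.
    """
    if offset is not None:
        # Onset is as far before the midpoint as offset is after it.
        return midpoint - (offset - midpoint)
    if total_sleep_minutes is not None:
        # Treats total sleep time as the full onset-to-offset span.
        return midpoint - timedelta(minutes=total_sleep_minutes / 2)
    raise ValueError("need offset or total sleep time to recover onset")

midpoint = datetime(2024, 3, 2, 3, 0)
onset_from_offset = estimate_onset(midpoint, offset=datetime(2024, 3, 2, 7, 0))
onset_from_tst = estimate_onset(midpoint, total_sleep_minutes=480)
print(onset_from_offset, onset_from_tst)
```

With neither field present the midpoint alone cannot locate onset, which is why the substitution is conditional.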

Do not compromise these

  • stable pseudonymized IDs
  • participant-night rows in `night_sleep`
  • local date logic
  • explicit next-day linkage
  • documented units and device source
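
One way the local date logic and explicit next-day linkage could be made concrete is sketched below. It converts a UTC wake timestamp to the participant's local calendar date and attaches the report filed on that date. Every field name here (`wake_utc`, `utc_offset_hours`, `report_date_local`) is an assumption, as is the convention that the "next day" is the local date of waking. A fixed UTC offset is used for simplicity and ignores daylight saving changes.

```python
from datetime import datetime, timedelta, timezone

def local_date(utc_iso, utc_offset_hours):
    """Convert a naive UTC ISO timestamp to a local calendar date."""
    utc_dt = datetime.fromisoformat(utc_iso).replace(tzinfo=timezone.utc)
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return utc_dt.astimezone(local_tz).date()

def link_next_day(nights, reports):
    """Attach to each night the report filed on the local wake date."""
    by_key = {(r["participant_id"], r["report_date_local"]): r for r in reports}
    linked = []
    for night in nights:
        wake_day = local_date(night["wake_utc"], night["utc_offset_hours"]).isoformat()
        report = by_key.get((night["participant_id"], wake_day))
        linked.append({
            "participant_id": night["participant_id"],
            "wake_date_local": wake_day,
            # None marks a night with no linked outcome, so gaps stay visible.
            "stress_0_10": report["stress_0_10"] if report else None,
        })
    return linked

# Made-up rows: the second wake time is still the previous local day at UTC-5.
nights = [
    {"participant_id": "P001", "wake_utc": "2024-03-02T11:30:00", "utc_offset_hours": -5},
    {"participant_id": "P001", "wake_utc": "2024-03-03T04:30:00", "utc_offset_hours": -5},
]
reports = [{"participant_id": "P001", "report_date_local": "2024-03-02", "stress_0_10": 6}]
linked = link_next_day(nights, reports)
print(linked)
```

The second night shows why local dates matter: a wake time of 04:30 UTC on 3 March is still 2 March locally at UTC-5, so keying on the UTC date would look for a report on the wrong day.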

Templates

Header-only CSVs are ready to send

These are useful when an institution wants a direct example of expected table shape before any transfer starts. For the current `raw nightly stress` lane, the raw physiology templates and mapping JSONs are the important additions.
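
For readers unfamiliar with the idea, a header-only CSV is a file containing the column row and nothing else. The sketch below shows how one could be generated. The column list is illustrative only: `sleep_midpoint_local`, `heart_rate_mean_bpm`, and `fragmentation_index` appear in this document, while the remaining names are hypothetical and are not taken from the project's actual templates.

```python
import csv
import io

# Illustrative columns for a night_sleep template.
NIGHT_SLEEP_COLUMNS = [
    "participant_id",       # hypothetical name
    "night_date_local",     # hypothetical name
    "sleep_duration_min",   # hypothetical name
    "sleep_midpoint_local",
    "heart_rate_mean_bpm",
    "fragmentation_index",
]

def header_only_csv(columns):
    """Return CSV text consisting of a single header row and no data."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(columns)
    return buffer.getvalue()

template = header_only_csv(NIGHT_SLEEP_COLUMNS)
print(template, end="")
```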