Validation

Validation has to stay explicit, layered, and scoped.

This page separates what the current evidence can support from what it cannot, and keeps the main bottlenecks visible before any interpretive claim is made.

Claim boundary

The first job of validation is to fence the claim.

These three panels define the boundary conditions of the project before you look at any deeper metric: what is defensible, what is out of scope, and what still blocks transport.

Claims we can defend

  • Nightly sleep routes are usable within cohort on selected stress, fatigue, and pain-like targets when governance marks them accepted.
  • Portable event stress is real and reproducible across open challenge cohorts.
  • Daily temporal memory improves within-cohort prediction of anxiety and depression.
  • Anxiety is the strongest non-stress portable family, but it remains research-only.

Claims we should not make

  • Universal nightly cross-cohort clinical prediction.
  • Portable depression as a stable cross-cohort family.
  • Clinical physical pain prediction from the current somatic-burden proxy family.
  • Any diagnostic or treatment claim.

Main bottlenecks

  • Nightly cross-cohort transport remains the main bottleneck.
  • Non-stress families are limited more by transport and outcome equivalence than by infrastructure.
  • Event and nightly layers must stay separated in communication and governance.

Filter

Read validation by stress, non-stress, or portability.

This is a reading aid, not a claim change. It lets the public view isolate the strongest lane, the weaker non-stress lane, or the transport problem without mixing them together.

Capability matrix

Within-cohort strength, transport, and deployment are different questions.

The matrix keeps those dimensions separate on purpose. A layer can be strong within cohort and still fail cross-cohort transport or remain unsuitable for deployment.

| Layer | Representative signal | Within cohort | Cross-cohort transport | Deployment stance |
| --- | --- | --- | --- | --- |
| Night Layer | Novartis D0 fatigue contextual interpretation | High · Held-out R² 0.4350 | Limited | Operational |
| Event Layer | Portable family stress event | High · R² 0.3164 | Portable | Separate stack |
| Daily Challenge Layer | Boston anxiety D0 temporal | Moderate · R² 0.1685 | Challenge only | Challenge |
| Non-Stress Daily Layer | Novartis depression D0 temporal | High · R² 0.3454 | Challenge only | Mixed support |
| Non-Stress Portable Families | Anxiety | Low · Portable mean R² 0.0202 | Emerging | Research only |
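
The separation the matrix enforces can be sketched as a small data structure in which within-cohort strength, transport, and deployment stance are independent fields. This is only an illustration: the `LayerCapability` class and the `strong_but_not_portable` helper are hypothetical names, not part of the repo; the row values are copied from the matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerCapability:
    """One matrix row; the three ratings are deliberately separate fields."""
    layer: str
    within_cohort: str   # e.g. "High", "Moderate", "Low"
    transport: str       # e.g. "Portable", "Limited", "Challenge only"
    deployment: str      # e.g. "Operational", "Research only"


# Ratings copied from the capability matrix on this page.
MATRIX = [
    LayerCapability("Night Layer", "High", "Limited", "Operational"),
    LayerCapability("Event Layer", "High", "Portable", "Separate stack"),
    LayerCapability("Daily Challenge Layer", "Moderate", "Challenge only", "Challenge"),
    LayerCapability("Non-Stress Daily Layer", "High", "Challenge only", "Mixed support"),
    LayerCapability("Non-Stress Portable Families", "Low", "Emerging", "Research only"),
]


def strong_but_not_portable(matrix):
    """Layers rated High within cohort whose transport rating is not Portable."""
    return [
        row.layer
        for row in matrix
        if row.within_cohort == "High" and row.transport != "Portable"
    ]


print(strong_but_not_portable(MATRIX))
```

Because the ratings are never collapsed into a single score, a query like this one surfaces layers that are strong within cohort yet still blocked on transport.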

Signal surface

The visible signal should be legible before the narrative begins.

These bars summarize a representative visible metric on each layer. They are for fast reading and orientation, not for pretending that all metric families are directly comparable.

  • Night Layer (Novartis D0 fatigue contextual interpretation): held-out R² 0.4350
  • Event Layer (Portable family stress event): R² 0.3164
  • Daily Challenge Layer (Boston anxiety D0 temporal): R² 0.1685
  • Non-Stress Daily Layer (Novartis depression D0 temporal): R² 0.3454
  • Non-Stress Portable Families (Anxiety): portable mean R² 0.0202

Portable-family update

The nightly portable stress layer moved, but not enough to change the claim boundary.

The new blended contract improved `stress_d1` transport while keeping replay positive. That is a real methodological improvement. It is not a promotion event, because the route still sits below the held-out signal floor required for stronger public claims.

Stress D+1 portable family

  • Artifact: Now deployed as `weighted_pipeline_blend` under the governed `portable_family_blend_v1` contract.
  • Transport: Portable mean R² now reads `0.0685`, with replay held-out R² `0.0291`.
  • Status: Still `experimental`; only `heldout_signal_floor` remains as a blocking gate.
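
A minimal sketch of how gate-based promotion of this kind might work, under stated assumptions. The page does not give the floor value or say which metric the floor applies to, so `HYPOTHETICAL_FLOOR`, the choice of replay held-out R² as the gated metric, and the `replay_positive` gate name are all placeholders for illustration. Only the two metric values and the `heldout_signal_floor` gate name come from this page.

```python
def blocking_gates(metrics, gates):
    """Return the names of gates whose check fails for these metrics."""
    return [name for name, check in gates.items() if not check(metrics)]


def route_status(metrics, gates):
    """A route is promoted only when no gate blocks it."""
    return "accepted" if not blocking_gates(metrics, gates) else "experimental"


# Values reported on this page for the stress_d1 portable family.
stress_d1 = {"portable_mean_r2": 0.0685, "replay_heldout_r2": 0.0291}

# Placeholder only: the real floor is not stated on this page.
HYPOTHETICAL_FLOOR = 0.10

GATES = {
    # Hypothetical gate name: replay must stay positive.
    "replay_positive": lambda m: m["replay_heldout_r2"] > 0.0,
    # Gate name from this page; the threshold and gated metric are assumptions.
    "heldout_signal_floor": lambda m: m["replay_heldout_r2"] >= HYPOTHETICAL_FLOOR,
}

print(blocking_gates(stress_d1, GATES))
print(route_status(stress_d1, GATES))
```

Under these assumptions the route passes the replay check but stays `experimental`, which mirrors the distinction drawn above between a methodological improvement and a promotion event.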

Stress D0 portable family

  • Probe result: The new blend probe did not improve the currently deployed route.
  • Decision: `stress_d0` remains on its current single-model regime.
  • Interpretation: The blend contract is now standard, but adoption is decided per route on evidence rather than applied symmetrically to `stress_d0` and `stress_d1`.
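
The evidence-driven adoption rule can be illustrated with a toy sketch: blend the candidate models' predictions with fixed weights, then switch routes only if the blend beats the currently deployed predictions on held-out data. The functions `weighted_blend` and `choose_route`, the `min_gain` parameter, and the toy numbers are hypothetical; the internals of `weighted_pipeline_blend` are not described on this page.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def weighted_blend(predictions, weights):
    """Weighted average of several prediction vectors, element by element."""
    total = sum(weights)
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions)) / total
        for i in range(len(predictions[0]))
    ]


def choose_route(y_true, current_pred, blend_pred, min_gain=0.0):
    """Adopt the blend only when it improves held-out R² by more than min_gain."""
    gain = r2_score(y_true, blend_pred) - r2_score(y_true, current_pred)
    return "blend" if gain > min_gain else "single_model"


# Toy held-out targets and two toy models (not project data).
y = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 1.5, 3.5, 3.5]
model_b = [0.5, 2.5, 2.5, 4.5]
blend = weighted_blend([model_a, model_b], [0.5, 0.5])

# Case 1: the blend beats the deployed single model, so it is adopted.
print(choose_route(y, model_a, blend))

# Case 2: the deployed model is already better than the blend, so nothing changes.
already_good = [1.0, 2.0, 3.0, 4.0]
weaker_blend = weighted_blend([already_good, model_b], [0.5, 0.5])
print(choose_route(y, already_good, weaker_blend))
```

The two cases correspond to the asymmetric outcome described above: one route adopts the blend, the other keeps its single-model regime because the probe showed no gain.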

Layers

Validation should stay tied to the live analytical layers.

Each layer below carries its own accepted, experimental, or support status, together with the strongest visible signal currently exposed in the repo.

night

Night Layer

4 accepted · 7 experimental · status: accepted

Nightly routes are the core operational layer. They carry held-out signal, interval calibration, governance, and deployment status.

Best visible signal: Novartis D0 fatigue contextual interpretation

  • heldout_r2: 0.4350
  • heldout_mae: 1.7972
  • heldout_rmse: 2.2791
  • coverage_pct: 80.2118
  • interval_width: 5.1123
  • transport_r2: -0.1106
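
For readers unfamiliar with these fields, the sketch below shows the standard definitions of MAE, RMSE, R², interval coverage, and mean interval width on toy data. It assumes the repo uses these conventional definitions, which this page does not confirm; the function names are hypothetical. Under the standard definition, R² turns negative whenever a model's squared error exceeds that of always predicting the target mean, which is how a transport R² can fall below zero.

```python
import math


def point_metrics(y_true, y_pred):
    """Standard MAE, RMSE, and R² for point predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
    }


def interval_metrics(y_true, lower, upper):
    """Share of targets inside their interval (percent) and mean interval width."""
    n = len(y_true)
    covered = sum(1 for t, lo, hi in zip(y_true, lower, upper) if lo <= t <= hi)
    return {
        "coverage_pct": 100.0 * covered / n,
        "interval_width": sum(hi - lo for lo, hi in zip(lower, upper)) / n,
    }


# Toy data, not project data.
y_true = [2.0, 4.0, 6.0, 8.0, 10.0]
y_pred = [3.0, 4.0, 5.0, 8.0, 12.0]
lower = [1.5, 2.5, 3.5, 6.5, 10.5]
upper = [4.5, 5.5, 6.5, 9.5, 13.5]

print(point_metrics(y_true, y_pred))
print(interval_metrics(y_true, lower, upper))
```

On this toy data four of the five targets fall inside their intervals, so coverage is 80% with a mean width of 3.0. Coverage and width are best read together, since wider intervals raise coverage without adding information.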

event

Event Layer

3 accepted · 2 experimental · status: accepted

Event models remain separate from nightly interpretation. They are portable for acute physiological stress, not for next-day sleep claims.

Best visible signal: Portable family stress event

  • mae: 0.6775
  • rmse: 0.8182
  • r2: 0.3164

daily

Daily Challenge Layer

2 accepted · 4 experimental · status: accepted_challenge

Daily challenge cohorts show where temporal memory helps within cohort, even when transport still fails across cohorts.

Best visible signal: Boston anxiety D0 temporal

  • mae: 1.3178
  • rmse: 1.6673
  • r2: 0.1685
  • rows: 20112
  • participants: 925
  • feature_count: 91
  • best_model: extra_trees

nonstress_daily

Non-Stress Daily Layer

6 accepted · 4 experimental · status: accepted_challenge

This layer keeps anxiety, depression, and somatic support visible without overstating clinical portability.

Best visible signal: Novartis depression D0 temporal

  • mae: 1.2722
  • rmse: 1.6985
  • r2: 0.3454
  • rows: 4818
  • participants: 196
  • feature_count: 292
  • best_model: hist_gbm

nonstress_family

Non-Stress Portable Families

0 accepted · 1 experimental · status: experimental

Portable non-stress families remain research references. Anxiety is closest to usable; depression and somatic burden are still below transport readiness.

Best visible signal: Anxiety

  • portable_mean_r2: 0.0202
  • route_replay_r2: -0.0466
  • coverage_pct: 79.8680
  • rows: 303
  • participants: 103
  • feature_count: 10
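
One plausible reading of a "portable mean R²" is the average of per-cohort R² values obtained when a route is evaluated on cohorts it was not fitted on. That reading is an assumption, not something this page states, and `portable_mean_r2` below is a hypothetical helper run on toy data. The sketch shows why such a mean can sit near zero even when one cohort transports well.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def portable_mean_r2(cohorts):
    """Unweighted mean of per-cohort R² (assumed definition, see lead-in)."""
    scores = {name: r2_score(y, pred) for name, (y, pred) in cohorts.items()}
    return sum(scores.values()) / len(scores), scores


# Toy cohorts: (targets, predictions from a model fitted elsewhere).
toy_cohorts = {
    "cohort_a": ([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5]),  # transports well
    "cohort_b": ([1.0, 2.0, 3.0, 4.0], [3.0, 3.0, 3.0, 3.0]),  # worse than the mean
}

mean_r2, per_cohort = portable_mean_r2(toy_cohorts)
print(per_cohort)
print(mean_r2)
```

Here one cohort scores about 0.8 and the other about -0.2, so the mean lands near 0.3. Reporting per-cohort values alongside the mean keeps a single failing cohort from being hidden by the average.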