naluma-app-editor-copilot — E2E Test Report

Date: 2026-06-09 · Plugin: naluma-app-editor-copilot@naluma-ai · Method: one real content item produced end-to-end through every shipped skill, each handoff checked against its authoritative oracle (Directus schema, DB CHECK constraints, editorial standard, ai-patterns). Subject: insight session "The Habituation Model" (PSY-03) — chosen for real source material, the longest skill chain, and an illustrate dependency. Batch label: copilot-test-2026-06-09.

Outcome in one line

The text production chain works well (research → write-session → review → humanize produce schema-valid, well-grounded, on-voice content); the publish/media path is architecturally broken (images and audio must go direct to R2, not via Directus); and the offline gates under-validate (editorial placement fields are not checked against the DB constraints, which produces a false green). No dev rows were written: the run stopped at the publish gate by choice.

Per-step results

| # | Skill | Result | Notes |
| --- | --- | --- | --- |
| 1 | naluma-context | ✅ PASS | paths resolve; --verify green |
| 2 | research | ✅ PASS | dossier grounded in real PMIDs (Umashankar 2025, Gold 2021, Henry 2023, Thompson 2017); no fabrication; flagged the PMC8632517 gap |
| 3 | write-session | ✅ PASS (content) | content payload validates vs insight.schema.json; 4 cards in-band; correct flags. But editorial placement fields wrong (see I/J) |
| 4 | illustrate | ✅ PASS | enforced category gate; 2 distinct scenes; in-palette square WebP; provenance written |
| 5 | review | ✅ PASS | advisory, read-only, dossier-traced; factuality 88 / style 95; sidecar written |
| 6 | humanize | ✅ PASS | idempotent (0 banned patterns; prose byte-identical) |
| 7 | pipeline | ⚠️ PARTIAL | routing correct; --item path bug (G) crashes documented usage |
| 8 | publish | ⛔ BLOCKED | offline content-validation passed, but parent-row vocab unchecked (K), program_week/pain_points invalid as drafted (I/J), and media→R2 transport is wrong (M) |

Findings

| ID | Finding | Class | Repo | Sev | Issue |
| --- | --- | --- | --- | --- | --- |
| A | research SKILL.md says naluma-context "not yet shipped" (it shipped, PR #116) | stale-instruction | marketplace | Low | folded into cleanups |
| B+C | session-insight.md mandates grounding in PMC8632517/evidence-base; research educational mode forbids non-article sources → conflicting canon | cross-repo inconsistency | app-content ↔ marketplace | Med | filed |
| D | illustrate correctly refused on unconfirmed category | - | | | (positive) |
| E | OPENAI_API_KEY has no documented local provisioning (Fly-only); blocks illustrate locally | env/doc gap | naluma-root | Med | filed |
| F | Flat assets/illustrations/ + split research/+sessions/ scatter one item across 3 trees → colocate per-item folders | IA improvement | app-content (+marketplace) | Med | filed |
| G | pipeline.py --item not resolved relative to --root → documented usage crashes (absolute path works) | bug | marketplace | Low-Med | filed |
| H | publish accepts status: humanized, bypassing the final editor sign-off the lifecycle mandates | consistency | marketplace | Low | folded into cleanups |
| I | write-session emits program_week: null; schema is NOT NULL DEFAULT 1 (1 = always available) → should default to 1 | schema-drift | marketplace | Med | filed (w/ J) |
| J | write-session/dossier emit pain_points+segment_affinity as free-text, not the DB controlled vocab → CHECK rejects the upsert | bug/schema-drift | marketplace | Med-High | filed (w/ I) |
| K | publish validate.py doesn't validate parent-row editorial fields vs DB CHECK constraints → false green, fails server-side | gate-gap | marketplace | Med-High | filed |
| L | publish curl uses $DIRECTUS_URL/$DIRECTUS_TOKEN; actual env names are NALUMA_DIRECTUS_MCP_URL/_TOKEN | doc bug | marketplace | Low | folded into cleanups |
| M | publish media transport wrong: image and audio must be S3-PUT direct to the R2 bucket + a content.assets/audio_assets row (r2_key), not Directus /files. Reference: naluma-app/tools/content/seed-dummy | architecture | marketplace (+docs) | High | filed |
| N | Directus MCP role can't read schema (directus_collections FORBIDDEN) → publish must rely on committed snapshot.yaml/JSON schemas, not MCP introspection | infra/doc | marketplace/directus | Low | folded into cleanups |
| | App-side: relax content.sessions.pain_points so unmapped content can be unset | content-model | naluma-app | Med | #624 (filed) |
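
Finding G comes down to how a relative --item value is joined to --root. A minimal sketch of the resolution the documented usage seems to expect is shown here; the helper name resolve_item and its exact behaviour are assumptions for illustration, not the code in pipeline.py:

```python
from pathlib import Path


def resolve_item(root: str, item: str) -> Path:
    """Resolve an --item argument against --root (hypothetical helper).

    Absolute paths pass through unchanged, which matches the observation
    that absolute paths already work. Relative paths are joined to the root
    instead of being interpreted against the current working directory.
    """
    item_path = Path(item)
    if item_path.is_absolute():
        return item_path
    return Path(root) / item_path
```

With this shape, the same relative --item value resolves to the same file regardless of the directory the command is launched from.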

What we did NOT do

  • No dev write — stopped at the publish gate; no content.sessions rows, no R2 objects, nothing to clean up in admin.
  • No coach conversation run (the second, gotcha-heavy case) — deferred; the session loop came first.
  • No real R2 upload — that's the publish-skill redesign (Issue M), not something to improvise on the paid bucket.

Artifacts produced (working files, uncommitted)

  • research/insight/the-habituation-model.md — the dossier
  • sessions/insight/the-habituation-model.md — a valid, publishable insight draft (status: humanized, scores 88/95)
  • sessions/insight/the-habituation-model.review.md — review sidecar (gitignored)
  • assets/illustrations/sessions/the-habituation-model.webp — hero image (gitignored)
  • ~/.naluma-env/openai.env — local OpenAI key (created during the run; provisions illustrate)

Recommendation

The text chain is production-ready. Prioritise the publish redesign (M, High), mirroring seed-dummy's direct-R2 + content.assets pattern for image and audio. Next, fix the validation gap (K) and write-session placement (I/J), which together currently cause publish to produce a row the DB rejects. The colocation refactor (F) is a good quality-of-life follow-up once publish is sound.
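
The offline check that findings I, J and K call for could look roughly like the sketch below. It assumes the parent row is available as a dict before the upsert. The vocabulary values are placeholders invented for illustration (the real lists live in the DB CHECK constraints), and check_parent_row is a hypothetical name, not a function in validate.py:

```python
# Placeholder vocabularies: the authoritative values are the DB CHECK
# constraints, which this report does not list.
CONTROLLED_VOCAB = {
    "pain_points": {"sleep", "concentration", "anxiety"},
    "segment_affinity": {"new_onset", "long_term"},
}


def check_parent_row(row: dict) -> list[str]:
    """Return a list of problems with the editorial placement fields."""
    problems = []

    # Finding I: the column is NOT NULL DEFAULT 1, so an explicit null fails.
    if row.get("program_week") is None:
        problems.append("program_week: null not allowed (NOT NULL DEFAULT 1)")

    # Finding J: values must come from the controlled vocabulary.
    for field, allowed in CONTROLLED_VOCAB.items():
        values = row.get(field) or []
        if isinstance(values, str):
            values = [values]  # free text arrives as a single string
        unknown = sorted(v for v in values if v not in allowed)
        if unknown:
            problems.append(f"{field}: not in controlled vocab: {unknown}")

    return problems
```

An empty list would mean the row passes this gate. Any entry should turn the offline result red before the server-side constraint gets a chance to reject the upsert.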

Decisions (2026-06-09 discussion) → issues adjusted

  • Q1 — grounding. Insights (educational mode only) use article-as-ground-truth: ground in the finished, quality_gates_passed article, inherit its citations verbatim, no PMID re-verification; review factuality becomes article-fidelity. Technique mode unchanged. → naluma-app-content#19 (standard rewrite) + naluma-ai-marketplace#128 (research+review contracts).
  • Q2 — card guidance. Adopt 3–8 cards (was 3–5); harden in insight.schema.json (minItems 3 / maxItems 8) + review.py card-count check; validate the 40–80 word band on iPhone SE. → naluma-app-content#21.
  • Q3 — read-more link. Add a conditional "Read the full article" link as a learn-more/footer affordance (in-app browser), not a terminal card; needs a source-article field on the content model + app UI + copilot carry-forward + analytics event. → naluma-app#625.
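
The Q2 decision can be sketched as a small card check. This assumes cards arrive as plain strings, that words are whitespace-separated, and that the 40–80 word band applies per card. The function name check_cards is hypothetical and is not the check in review.py:

```python
MIN_CARDS, MAX_CARDS = 3, 8    # Q2: adopt 3-8 cards (was 3-5)
MIN_WORDS, MAX_WORDS = 40, 80  # Q2: word band, assumed here to be per card


def check_cards(cards: list[str]) -> list[str]:
    """Return a list of card-count and word-band problems."""
    problems = []
    if not MIN_CARDS <= len(cards) <= MAX_CARDS:
        problems.append(
            f"card count {len(cards)} outside {MIN_CARDS}-{MAX_CARDS}"
        )
    for index, text in enumerate(cards, start=1):
        word_count = len(text.split())
        if not MIN_WORDS <= word_count <= MAX_WORDS:
            problems.append(
                f"card {index}: {word_count} words outside {MIN_WORDS}-{MAX_WORDS}"
            )
    return problems
```

The count limits mirror the minItems 3 / maxItems 8 bounds planned for insight.schema.json, so the schema and the review check would reject the same drafts.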