Sound library: positioning, value props & sourcing strategy

Date: 2026-06-15

Scope: The research-driven definition of Naluma's ambient sound catalog: how it is structured (albums), what each album is for (value prop), how it is positioned against competitors, and how the audio gets sourced. The catalog itself is now the per-item sounds/<album>/album.md + <track>/sound.md files (the old sounds-backlog.csv roadmap was retired in #131 → sounds-backlog.superseded.csv); this doc holds the strategy those files can't. Built from three parallel research passes (competitor sound libraries; YouTube demand; ElevenLabs/generative sourcing) plus the EarnoiseCare catalog as a positioning reference. Complements docs/sound-library-research.md (clinical evidence by category) and the docs/research/{notched,fractal}-*.md dossiers.

1. The wedge: how Naluma's library beats the field

Competitor scan (ReSound Relief, Widex Zen, Kalmeda, Oto, myNoise, EarnoiseCare; plus Calm/Headspace/Endel/Portal/BetterSleep/Noisli/Dark Noise) surfaced five openings:

Spatial-by-default (planned, not yet shipped). The architecture is designed to bake binaural spatialization into the primary master (decision #687 Option A) so every sound can ship in calming 3D — but the HRTF bake is not yet implemented (spatial.render: baked raises NotImplementedError; MVP renders spatial.render: none, i.e. plain stereo). It is a near-term differentiator to build, not a current capability. No tinnitus competitor ships spatial as standard; the spatial apps (Endel, Portal, Headspace) aren't tinnitus-focused and treat it as a premium novelty — so the opening is real, but realizing it depends on naluma-app#687.
Honest, cited tinnitus framing. Name both mechanisms truthfully — masking (cover the contrast) and habituation/retraining (partial masking so the brain stops flagging the signal) — in plain language with real sources. Portal markets to tinnitus with zero evidence; commodity apps just disclaim; EarnoiseCare uses the right dual register but no citations. We can do it honestly (matches our per-session citation discipline).
Depth where it counts. Own Rain (a rain for every preference — the #1 sought sound and a whole family) and Noise Colours (brown flagship for focus/ADHD, violet for high-pitched tinnitus) rather than a shallow-broad catalog. EarnoiseCare's strongest album is depth (7 graded white-noise variants); depth, not breadth, is the differentiator.
Intent-split naming (the myNoise model): literal where utility matters (noise colours, any future notch/frequency tooling), warm-and-concrete elsewhere.
Layering, framed for tinnitus (planned — see §6): "build your masking blend," not just a sleep mixer. Combination scenes outperform single sounds in the sleep market, and no tinnitus app frames layering around partial masking for habituation.

Avoid the crowded/over-claimed lanes: generic "relax/sleep/focus," self-published neuroscience claims (Endel, Brain.fm), and "mix your own sound" as a bare feature. Lead with spatial + evidence-honesty + tinnitus-true masking/habituation.

2. Catalog architecture (albums → value props)

Organized by source (the model used by the mixing/utility apps and EarnoiseCare), with multiple albums per library category for depth. Library categories are the album-level editorial tag (library_category ∈ water · nature · noise_colours · urban_calm · tone_therapy). Full track lists + metadata: the per-item sounds/<album>/album.md + <track>/sound.md files (historical roadmap: sounds/sounds-backlog.superseded.csv).

Album	Category	Value prop (job-to-be-done)	Tracks	Source	Wave
Noise Colours	noise_colours	Broadband masking across the spectrum (deep→bright); brown flagship, violet for high-pitched tinnitus; graded variants of each colour	11	Synth	1
Rain	water	The #1 sought masking + sleep sound; we own depth	8	CC0	1
Ocean & Shore	water	Rhythmic surf for sleep + relaxation	5	CC0	1
Fire	nature	Cozy crackle for sleep + comfort (197M demand)	4	CC0	1
Forest & Birdsong	nature	Daytime calm + relaxation; dawn chorus	5	CC0	2
Streams & Falls	water	Moving water for focus + calm	4	CC0	2
Focus & Places	urban_calm	Concentration ambiences (cafe/library/fan/train)	5	CC0	2
Night & Meadow	nature	Night sounds for sleep (crickets/evening)	4	CC0	3
Wind	nature	Airy masking + calm	3	CC0	3
Tone & Resonance	tone_therapy	Meditative tones for sleep/relaxation (bowls/chimes/drone)	5	CC0/synth	3

~10 albums / ~54 tracks. The count exceeds a flat "30" deliberately — the surplus is depth, the research's clearest differentiation lever, not breadth for its own sake.
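As a quick consistency check, the album table above can be held as plain data and validated against the category set. This is only an illustrative sketch: the `Album` type, its field names, and `catalog_summary` are assumptions made for this example and do not mirror the real album.md schema.

```python
from dataclasses import dataclass

# Category set as listed in this section.
LIBRARY_CATEGORIES = {"water", "nature", "noise_colours", "urban_calm", "tone_therapy"}


@dataclass(frozen=True)
class Album:  # hypothetical shape, for illustration only
    name: str
    category: str
    tracks: int
    wave: int


# Rows copied from the album table in this section.
ALBUMS = [
    Album("Noise Colours", "noise_colours", 11, 1),
    Album("Rain", "water", 8, 1),
    Album("Ocean & Shore", "water", 5, 1),
    Album("Fire", "nature", 4, 1),
    Album("Forest & Birdsong", "nature", 5, 2),
    Album("Streams & Falls", "water", 4, 2),
    Album("Focus & Places", "urban_calm", 5, 2),
    Album("Night & Meadow", "nature", 4, 3),
    Album("Wind", "nature", 3, 3),
    Album("Tone & Resonance", "tone_therapy", 5, 3),
]


def catalog_summary(albums):
    """Validate each album's category and return (album count, total tracks)."""
    for album in albums:
        if album.category not in LIBRARY_CATEGORIES:
            raise ValueError(f"unknown library_category: {album.category!r}")
    return len(albums), sum(album.tracks for album in albums)


print(catalog_summary(ALBUMS))
```

The totals it reports are the 10 albums and 54 tracks summarized above.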

Noise is one "Noise Colours" album with character-named spectral-variant tracks. This follows the EarnoiseCare model (their strongest album is graded white-noise variants), but as one album rather than a separate album per colour, since the niche colours don't warrant a standalone album each. Our synth_noise shapes the spectrum by a single continuous exponent β (amplitude ∝ f^β: brown −1.0 → pink −0.5 → white 0 → blue +0.5 → violet +1.0), so the tilt variants are just points along that axis and are trivial to generate. 11 tracks, ordered deep→bright with Green (mid-band emphasis rather than a β tilt) listed last: Deep Brown · Brown · Soft Brown · Soft Pink · Pink · Warm White · White · Bright White · Blue · Violet · Green. Depth goes to the flagships (brown/white; "smoothed/deep brown" is demand-backed), with single entries for the niche colours. All synthesized, so the whole album is produced in Wave 1 (no sourcing bottleneck). Needs a small engine + contract change (see §4).
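To make the β scale concrete: if amplitude ∝ f^β, then each doubling of frequency changes the level by 20·β·log10(2) ≈ 6.02·β dB. The sketch below just does that arithmetic for the five exponents quoted above. `COLOUR_BETA` and `slope_db_per_octave` are names invented for this example, not the pipeline's own.

```python
import math

# Amplitude exponents quoted in the text (amplitude ∝ f^β).
# Illustrative copy only; this is not the pipeline's own table.
COLOUR_BETA = {"brown": -1.0, "pink": -0.5, "white": 0.0, "blue": 0.5, "violet": 1.0}


def slope_db_per_octave(beta: float) -> float:
    """Level change per doubling of frequency when amplitude ∝ f^beta."""
    return 20.0 * beta * math.log10(2.0)


for name, beta in COLOUR_BETA.items():
    print(f"{name:>6}: beta={beta:+.2f} -> {slope_db_per_octave(beta):+.2f} dB/octave")
```

Brown comes out near −6 dB/octave and pink near −3 dB/octave, matching the usual definitions of those colours. An in-between variant such as β ≈ −1.25 lands at roughly −7.5 dB/octave.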

Use-case coverage

Every primary job is covered: sleep (rain, ocean, fire, brown noise, night), focus (brown noise, cafe, stream, library), relaxation (birdsong, waterfall, tone), tinnitus masking (noise colours incl. violet, rain, waterfall, fan). Each track carries a primary use_case in the backlog.

3. Naming convention

Warm-concrete, and as short as the app UI allows. Names must fit tight sound-tile/player space, so favor the shortest clear form — e.g. Rain on Tent (not "Rain on a Tent"), in German Regen auf Zelt (not "Regen auf ein Zelt").

Noise colours stay literal — White/Pink/Brown/Blue/Violet Noise (utility + findability).
Everything else is warm-concrete, not mystical — "Gentle Rain", "Cozy"-adjacent fire, "Dawn Chorus". We deliberately do not copy EarnoiseCare's poetic style ("Zauberquelle" / "Magic Well") — it clashes with Naluma's "plain before clinical" voice — but we are warmer than bare "Rain 01".
Per-locale brevity matters. Localized titles (the *.de.md/*.es.md sibling step) should re-optimize for brevity in each language, not translate literally.

4. Sourcing strategy (tiered, rights-first)

Confirmed by the generative-audio research: there is no single generator that cleanly produces a shippable standalone library. Use a tiered mix:

Synthesized noise colours (numpy FFT) — keep. Zero rights risk, zero cost, already in the audio_pipeline. Covers the Noise Colours album's spectral-variant tracks and a possible soft drone. Engine + contract change needed for the spectral variants: synth_noise currently accepts only the 5 fixed colour names (NOISE_BETA dict) and sound_lint hard-errors any colour outside those 5 — so variant β values (e.g. Deep Brown β≈−1.25, Warm White β≈−0.25) require (a) synth_noise to accept a numeric β or new named variants, and (b) sound_lint + docs/authoring/standards/sound-production.md to allow them. Green/Mid noise additionally needs a band-pass filter (it is mid-band emphasis, not a β tilt) — not in the pipeline today. These are small, well-scoped additions, but they make Wave 1 include a pipeline task, not just content.
CC0 field recordings — the primary source for naturalistic ambient (rain, ocean, fire, forest, birdsong, etc.). Cleanest commercial clearance, genuinely realistic; our seamless-loop crossfade pipeline already turns short CC0 sources into long loops. Sources: Freesound with the license:cc0 filter only, Free To Use Sounds, Pixabay Audio, OpenGameArt CC0 packs. Sourcing is the bottleneck — ~40 CC0 files must be found, license-vetted, and downloaded by a human (or a tightly-scoped sourcing pass), then run through write-sound + render + publish.
ElevenLabs SFX — NOT the library source. ⚠️ Its ToS forbids using generated sound effects "on a standalone basis … as isolated files, audio samples, sound libraries, or collections of sounds" — which is exactly a browsable sound library. Use ElevenLabs only for ambient beds baked under a guided session/narration (not an isolated library file) or for throwaway prototyping while real CC0 audio is sourced. Verify the exact ToS clause before relying on it even for beds.
Stable Audio Pro — evaluate as a generative fallback only if CC0 sourcing proves too slow; its commercial terms for generated tracks are cleaner than ElevenLabs' and clips are longer (≤90 s), but confirm its library-distribution terms first. (Meta AudioCraft/AudioGen is the best topical fit but CC-BY-NC — non-commercial — so ruled out for a shipped app.)
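The engine change described for the synthesized tier (accepting a numeric β or new named variants) could look roughly like the sketch below. It is an assumption-laden illustration, not the current synth_noise code: `resolve_beta`, `BASE_BETA`, `VARIANT_BETA`, the variant key spelling, and the numeric bounds are all hypothetical. Only the five base exponents and the two variant values (Deep Brown ≈ −1.25, Warm White ≈ −0.25) come from this document.

```python
# The five colour names and exponents quoted in this document (amplitude ∝ f^β).
BASE_BETA = {"brown": -1.0, "pink": -0.5, "white": 0.0, "blue": 0.5, "violet": 1.0}

# Hypothetical variant keys; only these two β values appear in the text.
VARIANT_BETA = {"deep_brown": -1.25, "warm_white": -0.25}

# Assumed safety bounds for a numeric β; the real limits are a design decision.
BETA_MIN, BETA_MAX = -1.5, 1.5


def resolve_beta(colour):
    """Return the spectral exponent for a colour name or a numeric β."""
    if isinstance(colour, bool):
        raise TypeError("colour must be a name or a number, not a bool")
    if isinstance(colour, (int, float)):
        beta = float(colour)
        if not BETA_MIN <= beta <= BETA_MAX:
            raise ValueError(f"beta {beta} outside [{BETA_MIN}, {BETA_MAX}]")
        return beta
    key = str(colour).strip().lower().replace(" ", "_")
    if key == "green":
        # Green is mid-band emphasis, so it needs a band-pass stage instead.
        raise ValueError("green noise is not a β tilt; it needs a band-pass filter")
    if key in BASE_BETA:
        return BASE_BETA[key]
    if key in VARIANT_BETA:
        return VARIANT_BETA[key]
    raise ValueError(f"unknown noise colour: {colour!r}")
```

A linter rule could then call the same resolver, so the synthesizer and the lint step would share one definition of what counts as a valid colour.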

Per-track source_type in the backlog records the intended tier (synthesized / cc0 / cc0-or-synth).
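For the CC0 tier, the seamless-loop step can be pictured as blending the end of a recording into its own beginning. The sketch below shows one common approach, an equal-power (sine/cosine) crossfade on a plain list of samples. It is not a description of the actual pipeline, whose fade curve and parameters are not stated here, and `make_seamless_loop` is a hypothetical name.

```python
import math


def make_seamless_loop(samples, fade_len):
    """Return a loop whose end crossfades into its own beginning.

    The last `fade_len` samples are blended with the first `fade_len` samples
    using equal-power gains, so the result is `fade_len` samples shorter.
    """
    n = len(samples)
    if fade_len <= 0 or 2 * fade_len > n:
        raise ValueError("fade_len must be positive and at most half the clip length")
    head = samples[:fade_len]                # fades in during the blend
    tail = samples[n - fade_len:]            # fades out during the blend
    body = samples[fade_len:n - fade_len]    # untouched middle section
    blend = []
    for i in range(fade_len):
        theta = (i / fade_len) * (math.pi / 2)
        blend.append(tail[i] * math.cos(theta) + head[i] * math.sin(theta))
    # When the loop wraps, the blend (ending near the head) is followed by the
    # sample that originally came right after the head, so playback continues.
    return list(body) + blend
```

Equal-power gains suit material where the start and end are uncorrelated, as is typical for field recordings. For strongly correlated material a linear fade would avoid a level bump in the middle of the blend.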

5. Sequencing (waves)

Wave 1 — launch core (~4 albums, ~22 tracks): the full Noise Colours album (all ~11 variants — synthesized, no sourcing bottleneck; includes the engine/contract change above + reorganizing the existing noise-colours/ draft), plus Rain (core variants), Ocean & Shore, Fire. Highest-demand sounds + our synthesized strength; also the set that proves the CC0 sourcing → render → publish path (only synthesized noise has been proven end-to-end so far).
Wave 2: Forest & Birdsong, Streams & Falls, Focus & Places, + Rain depth variants (tent/roof/tropical/forest).
Wave 3: Night & Meadow, Wind, Tone & Resonance, niche extensions.

Each wave is its own production batch (research/source → write-sound → render → publish), gated like every other content batch.

6. Layering & combination scenes (planned feature)

Demand research is clear that combination scenes outperform single sounds ("thunderstorm + ocean", "fireplace + rain", "stream + birdsong"), and layering is table-stakes in the sleep market — but no tinnitus app frames layering around partial masking for habituation. That is an active, high-value opportunity, tracked as a feature (not merely deferred):

App: a "build your masking blend" mixer — pick 2–4 sounds, per-layer volume, save named blends. Frame it for tinnitus (partial masking → habituation), not just sleep.
Content/contract: the spatial contract already anticipates per-layer stems (spatial_stems, the deferred dynamic-spatial path) — layering and dynamic spatial are related and should be designed together.
Curated starter scenes: ship a few pre-mixed blends (e.g. Storm at Sea, Cabin Fire + Rain) as defaults.

Tracked as naluma-app#694 (Feature / High priority / High effort). This catalog (single sounds) is the substrate layering builds on; Wave 1 ships single sounds first.
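The mixer described above (2 to 4 sounds, per-layer volume, saved named blends) can be sketched as a small data model plus a gain-weighted sum. This is an illustration under assumptions, not the planned app design: `Layer`, `Blend`, `mix`, the slug strings, and the 0.0 to 1.0 linear volume range are all hypothetical.

```python
from dataclasses import dataclass

MIN_LAYERS, MAX_LAYERS = 2, 4  # "pick 2-4 sounds"


@dataclass(frozen=True)
class Layer:
    sound_slug: str  # hypothetical identifier of a catalog sound
    volume: float    # assumed linear gain between 0.0 and 1.0


@dataclass(frozen=True)
class Blend:
    name: str
    layers: tuple

    def __post_init__(self):
        if not MIN_LAYERS <= len(self.layers) <= MAX_LAYERS:
            raise ValueError(f"a blend needs {MIN_LAYERS} to {MAX_LAYERS} layers")
        for layer in self.layers:
            if not 0.0 <= layer.volume <= 1.0:
                raise ValueError(f"volume out of range for {layer.sound_slug!r}")


def mix(blend, sources):
    """Sum each layer's samples scaled by its volume.

    `sources` maps a sound slug to a list of float samples; the output is
    truncated to the shortest layer.
    """
    length = min(len(sources[layer.sound_slug]) for layer in blend.layers)
    out = [0.0] * length
    for layer in blend.layers:
        samples = sources[layer.sound_slug]
        for i in range(length):
            out[i] += layer.volume * samples[i]
    return out
```

A curated starter scene would then be a stored `Blend`, for example `Blend("Cabin Fire + Rain", (Layer("fire", 0.5), Layer("rain", 1.0)))`. A real mixer would also need headroom or limiting so that summed layers do not clip.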

7. Competitive positioning summary

Competitor	What they do	Where Naluma wins
EarnoiseCare	7 by-source albums, evocative bilingual names, dual masking+retraining framing, one-time €57 pack, no app	In-app + spatial (planned, #687) + real citations + layering; not a static download
myNoise	Huge catalog, intent-split naming, notch/peak/neuromod tools, PWYW	Curated depth + a guided programme around the sounds (myNoise is a tool, not a journey); spatial planned
ReSound Relief / Beltone	Masking + 5-sound layering, habituation framing, literal names	Evidence-honesty, depth in rain/noise, spatial (planned); not gated to hearing-aid owners
Kalmeda (DiGA)	CBT-first, minimal utilitarian sounds (Wind/Wasser/Wald/Stadt)	A real, deep sound library (sounds are first-class, not an afterthought)
Endel / Portal	Generative/spatial, premium escape, no tinnitus evidence	Tinnitus-true framing + masking depth; spatial without the novelty-premium gate
Calm / Headspace / BetterSleep	Big sleep libraries, layering, evocative names	Tinnitus-specific positioning + spatial + honest mechanism framing

8. Open questions

CC0 sourcing ownership/process: who sources the ~40 CC0 files (Jens manually vs. a scoped sourcing pass producing a vetted download list per write-sound's Gate-1)? This is the Wave-1 critical path.
Resolved (2026-06-17) by the sourcing-research pass: per-track candidate files are in docs/sound-library-sourcing-candidates.md (acceptance requirements: docs/authoring/standards/sound-sourcing.md).
Re-ranked duration-first (2026-06-18), see that doc's §"⭐ Picks v2": the original picks skewed short (loop-fatigue risk), so an open per-scene re-search added longer masters (e.g. rain-on-tent 17:22, distant-thunder 34:17, distant-ocean 27:12, summer-night 16:04, evening-meadow 20:40). A follow-up sweep converted the few CC-BY upgrades back to CC0 and solved the open Tone/meadow slots, so all primary v2 picks are CC0 (no attribution); only city-rain remains unsolved in clean CC0 (no long urban rain without speech/horns/thunder).
Paid libraries were ruled out as a source: the consumer royalty-free industry (Pond5, Envato, Artlist, Epidemic, Storyblocks, Boom Library, Epic Stock Media, Listening Earth, …) almost universally forbids end-user file extraction (which offline caching requires) and/or demands the audio be synced with other media. This is the same structural reason Listening Earth was rejected, not a one-off. Only a custom/enterprise license (e.g. Epic Stock Media's Custom Application License) can fit, reserved for a deliberate marquee track.
Sourcing order: CC0 → CC-BY (in-app credits) → ship fewer tracks.
Spatial bake for naturalistic ambient: noise colours are omnidirectional (no spatial); the CC0 nature sounds are where spatial-by-default actually applies — confirm the bake approach for stereo field recordings (relates to naluma-app#687).
Noise restructure is a Wave-1 prerequisite: the spectral-variant model needs the synth_noise + sound_lint + sound-production.md change (§2/§4); the existing noise-colours/ album draft stays, but its white/pink/brown track dirs are renamed (→ white/pink/brown) and the 8 new variant tracks are added under the same album. Plan it as a small pipeline + content task ahead of the Wave-1 build.
Publish target: dev-only today (r2_upload.py refuses prod buckets); a real prod publish needs prod R2/Directus creds — a prerequisite before Wave 1 goes live.
EarnoiseCare assets: the local example files cannot be used (licensing); positioning reference only.
Backlog vs production files: the per-item album.md/sound.md are the catalog source of truth. The old sounds-backlog.csv planning/roadmap was retired in #131 (→ sounds-backlog.superseded.csv; see sounds/SUPERSEDED.md) — the coach-schedule validator now derives valid sound slugs directly from the per-item files, not the CSV.
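Deriving valid sound slugs from the per-item files, as the last point describes, could be as small as a directory scan. The sketch below assumes the sounds/<album>/<track>/sound.md layout given in this document and further assumes that a track's slug is its directory name. Neither that slug rule nor the `valid_sound_slugs` name is confirmed by the real validator.

```python
from pathlib import Path


def valid_sound_slugs(sounds_root):
    """Collect track slugs from <album>/<track>/sound.md files under sounds_root.

    Assumptions: the slug is the track directory name, and a track only counts
    when its album directory also contains an album.md file.
    """
    root = Path(sounds_root)
    slugs = {
        sound_file.parent.name
        for sound_file in root.glob("*/*/sound.md")
        if (sound_file.parent.parent / "album.md").is_file()
    }
    return sorted(slugs)
```

Because the pattern matches only files named exactly sound.md, localized siblings such as the *.de.md files are ignored and do not produce duplicate slugs.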

Sources

docs/sound-library-research.md (clinical evidence by category; competitor scan v1)
docs/research/notched-sound-therapy.md, docs/research/fractal-generative-sound-therapy.md
Competitor research (2026-06-15): EarnoiseCare (https://soundcloud.com/earnoisecare, https://tinnitushelfer.de/tinnitus-retraining-klaenge/), myNoise (https://mynoise.net), ReSound Relief, Widex Zen, Kalmeda, Oto, Calm, Headspace, Endel, Portal, BetterSleep, Noisli, Dark Noise.
YouTube demand research (2026-06-15): ranked soothing-sound view data (https://hrnews.co.uk/the-worlds-most-preferred-soothing-sounds-ranked/), Relaxing White Noise, the brown-noise trend, myNoise popularity statements, rain-variant + fan + train niches.
Generative/sourcing research (2026-06-15): ElevenLabs Sound Effects API + ToS (https://elevenlabs.io/docs/api-reference/text-to-sound-effects/convert, https://elevenlabs.io/terms-of-use), Stable Audio, Meta AudioCraft (CC-BY-NC).