Sound library: positioning, value props & sourcing strategy
Date: 2026-06-15

Scope: The research-driven definition of Naluma's ambient sound catalog: how it is structured (albums), what each album is for (value prop), how it is positioned against competitors, and how the audio gets sourced. The catalog itself is now the per-item `sounds/<album>/album.md` + `<track>/sound.md` files (the old `sounds-backlog.csv` roadmap was retired in #131 → `sounds-backlog.superseded.csv`); this doc holds the strategy those files can't. Built from three parallel research passes (competitor sound libraries; YouTube demand; ElevenLabs/generative sourcing) plus the EarnoiseCare catalog as a positioning reference. Complements `docs/sound-library-research.md` (clinical evidence by category) and the `docs/research/{notched,fractal}-*.md` dossiers.
1. The wedge: how Naluma's library beats the field
Competitor scan (ReSound Relief, Widex Zen, Kalmeda, Oto, myNoise, EarnoiseCare; plus Calm/Headspace/Endel/Portal/BetterSleep/Noisli/Dark Noise) surfaced five openings:
- Spatial-by-default (planned, not yet shipped). The architecture is designed to bake binaural spatialization into the primary master (decision #687 Option A) so every sound can ship in calming 3D, but the HRTF bake is not yet implemented (`spatial.render: baked` raises `NotImplementedError`; MVP renders `spatial.render: none`, i.e. plain stereo). It is a near-term differentiator to build, not a current capability. No tinnitus competitor ships spatial as standard; the spatial apps (Endel, Portal, Headspace) aren't tinnitus-focused and treat it as a premium novelty. The opening is real, but realizing it depends on naluma-app#687.
- Honest, cited tinnitus framing. Name both mechanisms truthfully, masking (cover the contrast) and habituation/retraining (partial masking so the brain stops flagging the signal), in plain language with real sources. Portal markets to tinnitus with zero evidence; commodity apps just disclaim; EarnoiseCare uses the right dual register but no citations. We can do it honestly (matches our per-session citation discipline).
- Depth where it counts. Own Rain (a rain for every preference — the #1 sought sound and a whole family) and Noise Colours (brown flagship for focus/ADHD, violet for high-pitched tinnitus) rather than a shallow-broad catalog. EarnoiseCare's strongest album is depth (7 graded white-noise variants); depth, not breadth, is the differentiator.
- Intent-split naming (the myNoise model): literal where utility matters (noise colours, any future notch/frequency tooling), warm-and-concrete elsewhere.
- Layering, framed for tinnitus (planned — see §6): "build your masking blend," not just a sleep mixer. Combination scenes outperform single sounds in the sleep market, and no tinnitus app frames layering around partial masking for habituation.
Avoid the crowded/over-claimed lanes: generic "relax/sleep/focus," self-published neuroscience claims (Endel, Brain.fm), and "mix your own sound" as a bare feature. Lead with spatial + evidence-honesty + tinnitus-true masking/habituation.
2. Catalog architecture (albums → value props)
Organized by source (the model used by the mixing/utility apps and EarnoiseCare), with
multiple albums per library category for depth. Library categories are the album-level
editorial tag (library_category ∈ water · nature · noise_colours · urban_calm · tone_therapy).
Full track lists + metadata: the per-item sounds/<album>/album.md + <track>/sound.md files
(historical roadmap: sounds/sounds-backlog.superseded.csv).
| Album | Category | Value prop (job-to-be-done) | Tracks | Source | Wave |
|---|---|---|---|---|---|
| Noise Colours | noise_colours | Broadband masking across the spectrum (deep→bright); brown flagship, violet for high-pitched tinnitus; graded variants of each colour | 11 | Synth | 1 |
| Rain | water | The #1 sought masking + sleep sound; we own depth | 8 | CC0 | 1 |
| Ocean & Shore | water | Rhythmic surf for sleep + relaxation | 5 | CC0 | 1 |
| Fire | nature | Cozy crackle for sleep + comfort (197M demand) | 4 | CC0 | 1 |
| Forest & Birdsong | nature | Daytime calm + relaxation; dawn chorus | 5 | CC0 | 2 |
| Streams & Falls | water | Moving water for focus + calm | 4 | CC0 | 2 |
| Focus & Places | urban_calm | Concentration ambiences (cafe/library/fan/train) | 5 | CC0 | 2 |
| Night & Meadow | nature | Night sounds for sleep (crickets/evening) | 4 | CC0 | 3 |
| Wind | nature | Airy masking + calm | 3 | CC0 | 3 |
| Tone & Resonance | tone_therapy | Meditative tones for sleep/relaxation (bowls/chimes/drone) | 5 | CC0/synth | 3 |
~10 albums / ~54 tracks. The count exceeds a flat "30" deliberately — the surplus is depth, the research's clearest differentiation lever, not breadth for its own sake.
Noise is one "Noise Colours" album with character-named spectral-variant tracks (the
EarnoiseCare model — their strongest album is graded white-noise variants — but as one album, not
a separate album per colour, since the niche colours don't warrant a standalone album each). Our
synth_noise shapes the spectrum by a single continuous exponent β (amplitude ∝ f^β: brown −1.0 →
pink −0.5 → white 0 → blue +0.5 → violet +1.0), so the variants are just points along that tilt
and are trivial to generate. 11 tracks ordered deep→bright: Deep Brown · Brown · Soft Brown ·
Soft Pink · Pink · Warm White · White · Bright White · Blue · Violet · Green — depth on the
flagships (brown/white; "smoothed/deep brown" is demand-backed), single entries for the niche
colours. All synthesized, so the whole album is produced in Wave 1 (no sourcing bottleneck).
Needs a small engine + contract change (see §4).
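As a worked illustration of the β tilt described above: doubling the frequency multiplies the amplitude by 2^β, so each colour corresponds to a fixed slope of 20·log10(2^β) ≈ 6.02·β dB per octave. The sketch below only restates the five exponents from this section; `SECTION_BETAS`, `tilt_db_per_octave`, `relative_gain`, and the 1 kHz reference are illustrative assumptions, not part of `synth_noise`.

```python
import math

# Amplitude exponents quoted in this section (amplitude ∝ f**beta).
# Illustrative copy of the values in the text, not the pipeline's own table.
SECTION_BETAS = {"brown": -1.0, "pink": -0.5, "white": 0.0, "blue": 0.5, "violet": 1.0}


def tilt_db_per_octave(beta: float) -> float:
    """Amplitude change in dB when the frequency doubles, for amplitude ∝ f**beta."""
    return 20.0 * math.log10(2.0 ** beta)


def relative_gain(freq_hz: float, beta: float, ref_hz: float = 1000.0) -> float:
    """Linear amplitude gain at freq_hz relative to ref_hz (ref_hz is an arbitrary choice)."""
    return (freq_hz / ref_hz) ** beta


for name, beta in SECTION_BETAS.items():
    print(f"{name:>6}: beta={beta:+.2f}  tilt={tilt_db_per_octave(beta):+.2f} dB/octave")
```

Under this reading, an intermediate variant is just an intermediate slope: a β of −1.25 tilts about 7.5 dB per octave downward, slightly steeper than plain brown.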
Use-case coverage
Every primary job is covered: sleep (rain, ocean, fire, brown noise, night), focus (brown
noise, cafe, stream, library), relaxation (birdsong, waterfall, tone), tinnitus masking
(noise colours incl. violet, rain, waterfall, fan). Each track carries a primary use_case in
the backlog.
3. Naming convention
Warm-concrete, and as short as the app UI allows. Names must fit tight sound-tile/player space, so favor the shortest clear form — e.g. Rain on Tent (not "Rain on a Tent"), in German Regen auf Zelt (not "Regen auf ein Zelt").
- Noise colours stay literal — White/Pink/Brown/Blue/Violet Noise (utility + findability).
- Everything else is warm-concrete, not mystical — "Gentle Rain", "Cozy"-adjacent fire, "Dawn Chorus". We deliberately do not copy EarnoiseCare's poetic style ("Zauberquelle" / "Magic Well") — it clashes with Naluma's "plain before clinical" voice — but we are warmer than bare "Rain 01".
- Per-locale brevity matters. Localized titles (the `*.de.md`/`*.es.md` sibling step) should re-optimize for brevity in each language, not translate literally.
4. Sourcing strategy (tiered, rights-first)
Confirmed by the generative-audio research: there is no single generator that cleanly produces a shippable standalone library. Use a tiered mix:
- Synthesized noise colours (numpy FFT): keep. Zero rights risk, zero cost, already in the `audio_pipeline`. Covers the Noise Colours album's spectral-variant tracks and a possible soft drone. Engine + contract change needed for the spectral variants: `synth_noise` currently accepts only the 5 fixed colour names (`NOISE_BETA` dict) and `sound_lint` hard-errors any `colour` outside those 5, so variant β values (e.g. Deep Brown β≈−1.25, Warm White β≈−0.25) require (a) `synth_noise` to accept a numeric β or new named variants, and (b) `sound_lint` + `docs/authoring/standards/sound-production.md` to allow them. Green/Mid noise additionally needs a band-pass filter (it is mid-band emphasis, not a β tilt), which is not in the pipeline today. These are small, well-scoped additions, but they make Wave 1 include a pipeline task, not just content.
- CC0 field recordings: the primary source for naturalistic ambient (rain, ocean, fire, forest, birdsong, etc.). Cleanest commercial clearance, genuinely realistic; our seamless-loop crossfade pipeline already turns short CC0 sources into long loops. Sources: Freesound with the `license:cc0` filter only, Free To Use Sounds, Pixabay Audio, OpenGameArt CC0 packs. Sourcing is the bottleneck: ~40 CC0 files must be found, license-vetted, and downloaded by a human (or a tightly-scoped sourcing pass), then run through `write-sound` + render + publish.
- ElevenLabs SFX: NOT the library source. ⚠️ Its ToS forbids using generated sound effects "on a standalone basis … as isolated files, audio samples, sound libraries, or collections of sounds", which is exactly a browsable sound library. Use ElevenLabs only for ambient beds baked under a guided session/narration (not an isolated library file) or for throwaway prototyping while real CC0 audio is sourced. Verify the exact ToS clause before relying on it even for beds.
- Stable Audio Pro — evaluate as a generative fallback only if CC0 sourcing proves too slow; its commercial terms for generated tracks are cleaner than ElevenLabs' and clips are longer (≤90 s), but confirm its library-distribution terms first. (Meta AudioCraft/AudioGen is the best topical fit but CC-BY-NC — non-commercial — so ruled out for a shipped app.)
Per-track source_type in the backlog records the intended tier (synthesized / cc0 /
cc0-or-synth).
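The engine + contract change described for the spectral variants could look roughly like the sketch below: one resolver that accepts a base colour name, a named variant, or a numeric β, and rejects everything else. This is an assumption-laden illustration, not the actual `synth_noise` or `sound_lint` code; `resolve_beta`, `BASE_BETA`, `VARIANT_BETA`, and the `BETA_RANGE` bounds are hypothetical names and values. Only the five base exponents and the two variant values are taken from this document.

```python
from typing import Union

# The five fixed colours and their exponents, as stated in this document.
BASE_BETA = {"brown": -1.0, "pink": -0.5, "white": 0.0, "blue": 0.5, "violet": 1.0}

# Hypothetical named variants; the two values are the examples quoted above.
VARIANT_BETA = {"deep_brown": -1.25, "warm_white": -0.25}

# Assumed sanity bounds for a numeric beta (not taken from the pipeline).
BETA_RANGE = (-1.5, 1.5)


def resolve_beta(colour: Union[str, float, int]) -> float:
    """Map a colour name, a named variant, or a numeric beta to a spectral exponent."""
    if isinstance(colour, bool):
        raise ValueError("colour must be a name or a number, not a bool")
    if isinstance(colour, (int, float)):
        beta = float(colour)
        low, high = BETA_RANGE
        if not low <= beta <= high:  # also rejects NaN
            raise ValueError(f"beta {beta} is outside {BETA_RANGE}")
        return beta
    key = colour.strip().lower().replace(" ", "_")
    table = {**BASE_BETA, **VARIANT_BETA}
    if key not in table:
        # Green/Mid noise lands here on purpose: it needs a band-pass, not a tilt.
        raise ValueError(f"unknown colour {colour!r}")
    return table[key]
```

A lint rule built on the same resolver would accept `"Deep Brown"` or `-0.75` and still hard-error on `"green"`, which keeps the band-pass case visibly separate from the β-tilt family.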
5. Sequencing (waves)
- Wave 1, launch core (~4 albums, ~22 tracks): the full Noise Colours album (all ~11 variants; synthesized, no sourcing bottleneck; includes the engine/contract change above + reorganizing the existing `noise-colours/` draft), plus Rain (core variants), Ocean & Shore, Fire. Highest-demand sounds + our synthesized strength; also the set that proves the CC0 sourcing → render → publish path (only synthesized noise has been proven end-to-end so far).
- Wave 2: Forest & Birdsong, Streams & Falls, Focus & Places, + Rain depth variants (tent/roof/tropical/forest).
- Wave 3: Night & Meadow, Wind, Tone & Resonance, niche extensions.
Each wave is its own production batch (research/source → write-sound → render → publish),
gated like every other content batch.
6. Layering & combination scenes (planned feature)
Demand research is clear that combination scenes outperform single sounds ("thunderstorm + ocean", "fireplace + rain", "stream + birdsong"), and layering is table-stakes in the sleep market — but no tinnitus app frames layering around partial masking for habituation. That is an active, high-value opportunity, tracked as a feature (not merely deferred):
- App: a "build your masking blend" mixer — pick 2–4 sounds, per-layer volume, save named blends. Frame it for tinnitus (partial masking → habituation), not just sleep.
- Content/contract: the spatial contract already anticipates per-layer stems (`spatial_stems`, the deferred dynamic-spatial path); layering and dynamic spatial are related and should be designed together.
- Curated starter scenes: ship a few pre-mixed blends (e.g. Storm at Sea, Cabin Fire + Rain) as defaults.
Tracked as naluma-app#694 (Feature / High priority / High effort). This catalog (single sounds) is the substrate layering builds on; Wave 1 ships single sounds first.
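To make the "2–4 sounds, per-layer volume, save named blends" idea concrete, here is a minimal data-model sketch. Everything in it is hypothetical: `Layer`, `Blend`, `mix`, the linear 0–1 volume scale, and the peak normalization are assumptions for illustration, since the feature is still being designed in naluma-app#694.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Layer:
    sound_slug: str  # hypothetical identifier for a single catalog sound
    volume: float    # assumed linear gain in the range 0.0..1.0


@dataclass(frozen=True)
class Blend:
    name: str
    layers: Sequence[Layer]

    def validate(self) -> None:
        """Enforce the 2-4 layer limit and the assumed volume range."""
        if not 2 <= len(self.layers) <= 4:
            raise ValueError("a blend needs 2 to 4 layers")
        for layer in self.layers:
            if not 0.0 <= layer.volume <= 1.0:
                raise ValueError(f"volume out of range for {layer.sound_slug}")


def mix(blend: Blend, sources: Dict[str, Sequence[float]]) -> List[float]:
    """Sum the layers sample by sample, then scale down only if the peak exceeds 1.0."""
    blend.validate()
    length = min(len(sources[layer.sound_slug]) for layer in blend.layers)
    out = [0.0] * length
    for layer in blend.layers:
        samples = sources[layer.sound_slug]
        for i in range(length):
            out[i] += layer.volume * samples[i]
    peak = max((abs(s) for s in out), default=0.0)
    if peak > 1.0:
        out = [s / peak for s in out]
    return out


starter = Blend("Cabin Fire + Rain", [Layer("fire", 0.5), Layer("rain", 0.5)])
print(mix(starter, {"fire": [1.0, 0.0], "rain": [1.0, 1.0]}))  # [1.0, 0.5]
```

Keeping a blend as a named list of (sound, volume) pairs means curated starter scenes and user-saved blends can share one format; a real mixer would stream audio buffers rather than Python lists.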
7. Competitive positioning summary
| Competitor | What they do | Where Naluma wins |
|---|---|---|
| EarnoiseCare | 7 by-source albums, evocative bilingual names, dual masking+retraining framing, one-time €57 pack, no app | In-app + spatial (planned, #687) + real citations + layering; not a static download |
| myNoise | Huge catalog, intent-split naming, notch/peak/neuromod tools, PWYW | Curated depth + a guided programme around the sounds (myNoise is a tool, not a journey); spatial planned |
| ReSound Relief / Beltone | Masking + 5-sound layering, habituation framing, literal names | Evidence-honesty, depth in rain/noise, spatial (planned); not gated to hearing-aid owners |
| Kalmeda (DiGA) | CBT-first, minimal utilitarian sounds (Wind/Wasser/Wald/Stadt) | A real, deep sound library (sounds are first-class, not an afterthought) |
| Endel / Portal | Generative/spatial, premium escape, no tinnitus evidence | Tinnitus-true framing + masking depth; spatial without the novelty-premium gate |
| Calm / Headspace / BetterSleep | Big sleep libraries, layering, evocative names | Tinnitus-specific positioning + spatial + honest mechanism framing |
8. Open questions
- CC0 sourcing ownership/process: who sources the ~40 CC0 files (Jens manually vs. a scoped sourcing pass producing a vetted download list per `write-sound`'s Gate-1)? This is the Wave-1 critical path.
  - Resolved (2026-06-17) by the sourcing-research pass: per-track candidate files are in `docs/sound-library-sourcing-candidates.md` (acceptance requirements: `docs/authoring/standards/sound-sourcing.md`). Re-ranked duration-first (2026-06-18), see that doc's §"⭐ Picks v2": the original picks skewed short (loop-fatigue risk), so an open per-scene re-search added longer masters (e.g. rain-on-tent 17:22, distant-thunder 34:17, distant-ocean 27:12, summer-night 16:04, evening-meadow 20:40). A follow-up sweep converted the few CC-BY upgrades back to CC0 and solved the open Tone/meadow slots, so all primary v2 picks are CC0 (no attribution); only city-rain remains unsolved in clean CC0 (no long urban rain without speech/horns/thunder). Paid libraries were ruled out as a source: the consumer royalty-free industry (Pond5, Envato, Artlist, Epidemic, Storyblocks, Boom Library, Epic Stock Media, Listening Earth, …) almost universally forbids end-user file extraction (which offline caching requires) and/or demands the audio be synced with other media. This is the same structural reason Listening Earth was rejected, not a one-off. Only a custom/enterprise license (e.g. Epic Stock Media's Custom Application License) can fit, reserved for a deliberate marquee track. Sourcing order: CC0 → CC-BY (in-app credits) → ship fewer tracks.
- Spatial bake for naturalistic ambient: noise colours are omnidirectional (no spatial); the CC0 nature sounds are where spatial-by-default actually applies. Confirm the bake approach for stereo field recordings (relates to naluma-app#687).
- Noise restructure is a Wave-1 prerequisite: the spectral-variant model needs the `synth_noise` + `sound_lint` + `sound-production.md` change (§2/§4); the existing `noise-colours/` album draft stays, but its white/pink/brown track dirs are renamed (→ white/pink/brown) and the 8 new variant tracks are added under the same album. Plan it as a small pipeline + content task ahead of the Wave-1 build.
- Publish target: dev-only today (`r2_upload.py` refuses prod buckets); a real prod publish needs prod R2/Directus creds, a prerequisite before Wave 1 goes live.
- EarnoiseCare assets: the local example files cannot be used (licensing); positioning reference only.
- Backlog vs production files: the per-item `album.md`/`sound.md` are the catalog source of truth. The old `sounds-backlog.csv` planning/roadmap was retired in #131 (→ `sounds-backlog.superseded.csv`; see `sounds/SUPERSEDED.md`); the coach-schedule validator now derives valid sound slugs directly from the per-item files, not the CSV.
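Deriving valid slugs from the per-item files, as the last point describes, might be sketched as follows. The directory layout `sounds/<album>/album.md` + `<track>/sound.md` comes from this document; treating the track directory name as the slug, and the function names `discover_sound_slugs` and `unknown_slugs`, are assumptions for illustration and may not match the real coach-schedule validator.

```python
from pathlib import Path
from typing import Iterable, List, Set


def discover_sound_slugs(sounds_root: Path) -> Set[str]:
    """Collect track directory names that sit under an album with an album.md.

    Assumption: the slug is the track directory name; the real validator may
    read it from the sound.md metadata instead.
    """
    slugs: Set[str] = set()
    for sound_file in sounds_root.glob("*/*/sound.md"):
        album_dir = sound_file.parent.parent
        if (album_dir / "album.md").is_file():
            slugs.add(sound_file.parent.name)
    return slugs


def unknown_slugs(referenced: Iterable[str], sounds_root: Path) -> List[str]:
    """Return referenced slugs that have no per-item sound.md, sorted for stable output."""
    return sorted(set(referenced) - discover_sound_slugs(sounds_root))
```

Reading the tree directly means a retired CSV cannot drift out of sync with the catalog: a slug is valid exactly when its files exist.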
Sources
- `docs/sound-library-research.md` (clinical evidence by category; competitor scan v1)
- `docs/research/notched-sound-therapy.md`, `docs/research/fractal-generative-sound-therapy.md`
- Competitor research (2026-06-15): EarnoiseCare (https://soundcloud.com/earnoisecare, https://tinnitushelfer.de/tinnitus-retraining-klaenge/), myNoise (https://mynoise.net), ReSound Relief, Widex Zen, Kalmeda, Oto, Calm, Headspace, Endel, Portal, BetterSleep, Noisli, Dark Noise.
- YouTube demand research (2026-06-15): ranked soothing-sound view data (https://hrnews.co.uk/the-worlds-most-preferred-soothing-sounds-ranked/), Relaxing White Noise, the brown-noise trend, myNoise popularity statements, rain-variant + fan + train niches.
- Generative/sourcing research (2026-06-15): ElevenLabs Sound Effects API + ToS (https://elevenlabs.io/docs/api-reference/text-to-sound-effects/convert, https://elevenlabs.io/terms-of-use), Stable Audio, Meta AudioCraft (CC-BY-NC).