# Alphabet Cast — Two Tiers, Letter-forms + Citizens

One character per Latin letter, in two casts rather than one. The split came out of a course-correction in conversation: the original work produced humanoids with letter-themed accessories, when what was wanted was figures whose body shape IS the letter. Both casts now exist:

**Tier 1 — Letter-forms.** Characters whose body IS the letter glyph. The letter A is an A-shaped being with eyes and a mouth on the glyph itself. Each letter reduces to strokes drawn from exactly four primitives: **vertical, horizontal, diagonal, curve** (`V/H/D/C`). All primitives share the same stroke thickness so curves anchor to straights at exact shared endpoints. This is the cosmetic instance of the SporeOS letters-as-runtime substrate; when the user supplies the operative letter operators, the same primitive set becomes the visual realization of the fold-parameter combinators.

- Code: `tv/sprites/mint_letterform.py`
- YAMLs: `tv/characters/letterforms/{a..z}.yaml`
- Assets: `tv/assets/letterforms/{letter}/letter_{letter}_form{,_mouth_*}.png`
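
As an illustration of the four-primitive idea, here is a minimal sketch of how a letter-form could be encoded and checked. It is not taken from `mint_letterform.py`; the `Stroke` type, the coordinates, and the `unanchored_curve_ends` helper are all hypothetical names introduced for this example. It only covers the curve-to-straight anchoring rule described above.

```python
from dataclasses import dataclass
from enum import Enum


class Primitive(Enum):
    """The four stroke primitives (V/H/D/C)."""
    VERTICAL = "V"
    HORIZONTAL = "H"
    DIAGONAL = "D"
    CURVE = "C"


Point = tuple[int, int]


@dataclass(frozen=True)
class Stroke:
    """One primitive with its two endpoints (hypothetical representation)."""
    kind: Primitive
    start: Point
    end: Point


def unanchored_curve_ends(strokes: list[Stroke]) -> list[Point]:
    """Return curve endpoints that do not coincide with any straight-stroke endpoint."""
    straight_ends = {
        point
        for stroke in strokes
        if stroke.kind is not Primitive.CURVE
        for point in (stroke.start, stroke.end)
    }
    return [
        point
        for stroke in strokes
        if stroke.kind is Primitive.CURVE
        for point in (stroke.start, stroke.end)
        if point not in straight_ends
    ]


# Hypothetical encoding of "D": one vertical stem plus one curve closing it.
LETTER_D = [
    Stroke(Primitive.VERTICAL, (0, 0), (0, 8)),
    Stroke(Primitive.CURVE, (0, 0), (0, 8)),
]

signature = "".join(stroke.kind.value for stroke in LETTER_D)
print(signature, unanchored_curve_ends(LETTER_D))
```

An empty result means every curve end lands exactly on a straight-stroke endpoint. Letters made only of curves would need a different rule, which this sketch does not attempt.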

**Tier 2 — Citizens of the Letters.** 26 humanoid characters, one associated with each letter's domain: The Apex Scribe (A), The Bell Founder (B), The Cradle Watcher (C), and so on. Useful as ensemble extras, as side-cast in letter-domain scenes, and as named characters with backstories. Rendered via the procedural humanoid mint.

- Code: `tv/sprites/generate_alphabet_cast.py` + `tv/sprites/mint_alphabet.py`
- YAMLs: `tv/characters/alphabet/{a..z}.yaml`
- Assets: `tv/assets/alphabet/{letter}/letter_{letter}{,_mouth_*}.png`
- Roster: `tv/characters/alphabet/README.md`

For both casts, the 2D pixel art has shipped; the 3D realization (`blender_headless`) and the voice profiles and bridge libraries are deferred to Box A overnight compute. Both pull from the institute palette for visual cohesion.

This document records the scope, the architectural fit, the dependency on the letter operative-definitions, and the phased build plan.

---

## Why this is the right move

**1. The architecture is built for it.** The procedural humanoid mint (`tv/sprites/mint_humanoid.py`) reads a character YAML and emits the full sprite set in roughly 1 to 2 seconds at $0.00 cost. Twenty-six characters cost the same as four. Adding them is a YAML-authoring task, not an engineering task.

**2. It tests the Track-A Blender backend at scale.** Today there is one mushroom_creature 3D shape and one humanoid 3D MVP. Twenty-six 3D realizations exercise the backend across 26 distinct silhouette/proportion/palette combinations and shake out edge cases the small cast can't.

**3. It compounds the voice work.** Each letter has a phonetic identity (A / B / C voice their own initial). The voice fold engine has 32 parameters per character. Twenty-six × 32 = 832 voice-parameter slots searched against reference. That's a lot of fold-engine surface for the optimizer to map.

**4. It unblocks letter-personification content directly.** "Letters argue about grammar," "the alphabet does a top-10 list," "M reviews a Birkin bag" — entire video types open up that the four-reporter cast doesn't reach.

**5. It seeds the SporeOS letters-as-runtime work.** Per `project_sporeos_letters_as_runtime.md`, the SporeOS thesis is that letters are bytecode at every layer. Today that work is blocked on user-provided operative definitions of the 26 Latin letters as fold-parameter combinators. The alphabet cast is the visual/audible *instance* of those combinators — once the operative definitions land, each letter-character's parameters become canonically derived rather than hand-authored.

---

## Architectural fit

### YAMLs live here

```
tv/characters/
  alphabet/
    a.yaml
    b.yaml
    ...
    z.yaml
```

Each YAML follows the existing humanoid template (`engine/templates/humanoid.yaml`) — identity, silhouette, palette, voice 32-param block, sprite variants list. No new schema needed.
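
A minimal sketch of a structural check for one character document, assuming the YAML has already been loaded into a plain dict. The section key names (`identity`, `silhouette`, `palette`, `voice`, `sprite_variants`) and the placeholder values are guesses made for illustration; the authoritative names are whatever `engine/templates/humanoid.yaml` defines.

```python
# Assumed top-level key names; check them against engine/templates/humanoid.yaml.
REQUIRED_SECTIONS = ("identity", "silhouette", "palette", "voice", "sprite_variants")
VOICE_PARAM_COUNT = 32  # 16 timbre + 16 intonation, per the voice pipeline section


def validate_character(doc: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the shape looks right."""
    problems = [f"missing section: {name}" for name in REQUIRED_SECTIONS if name not in doc]
    if "voice" in doc and len(doc["voice"]) != VOICE_PARAM_COUNT:
        problems.append(
            f"voice block has {len(doc['voice'])} params, expected {VOICE_PARAM_COUNT}"
        )
    return problems


# Placeholder document standing in for a parsed a.yaml (all values are illustrative).
example = {
    "identity": {"id": "letter_a"},
    "silhouette": {"build": "placeholder"},
    "palette": {"primary": "placeholder"},
    "voice": {f"param_{i:02d}": 0.0 for i in range(VOICE_PARAM_COUNT)},
    "sprite_variants": ["mouth_closed", "mouth_half", "mouth_wide", "mouth_oh", "mouth_ee"],
}

print(validate_character(example))
```

Running a check like this over all 26 files before minting would catch a missing block early, without requiring any new schema.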

### 2D pipeline (pixel_pillow, shipped today)

Already works for any humanoid YAML:

```bash
python -m tv.sprites.mint_humanoid tv/characters/alphabet/a.yaml \
    --out tv/assets/alphabet/a/
```

Emits `a_mouth_closed.png`, `a_mouth_half.png`, `a_mouth_wide.png`, `a_mouth_oh.png`, `a_mouth_ee.png` + walk-cycle variants. Same convention as the four shipped humanoids.

**Estimated time per letter:** 2 seconds wall-clock. Whole alphabet: ~1 minute.
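
To run the whole alphabet, the single-letter command above can be looped from Python. This is a sketch rather than an existing script: `mint_command` and `mint_all` are hypothetical helpers, and the only CLI surface used is the `--out` flag shown above. It defaults to a dry run that just returns the commands.

```python
import string
import subprocess
import sys


def mint_command(letter: str) -> list[str]:
    """Build the mint_humanoid invocation for one letter, mirroring the command above."""
    return [
        sys.executable, "-m", "tv.sprites.mint_humanoid",
        f"tv/characters/alphabet/{letter}.yaml",
        "--out", f"tv/assets/alphabet/{letter}/",
    ]


def mint_all(dry_run: bool = True) -> list[list[str]]:
    """Return the 26 commands; execute them sequentially only when dry_run is False."""
    commands = [mint_command(letter) for letter in string.ascii_lowercase]
    if not dry_run:
        for command in commands:
            # check=True stops the batch at the first failing letter.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in mint_all(dry_run=True):
        print(" ".join(command))
```

Calling `mint_all(dry_run=False)` from the repository root would perform the actual mint.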

### 3D pipeline (blender_headless, MVP)

The Blender backend (`backends/blender_headless/`) reads the same YAML and renders 3D. Per `render_registry.yaml` it is at MVP: 7-head proportions, segmented limbs, face geometry, per-shape mouth swap, linear color. Twenty-six characters is the right stress test.

**Estimated time per letter (3D):** ~30 seconds for a portrait turntable on CPU, <5 seconds on GPU. Whole alphabet: ~13 min CPU, ~2 min GPU.

### Voice pipeline (Phase 2 voice stack)

Each letter gets:
- 16 timbre params (f0, formants, breathiness, etc.)
- 16 intonation params (declination, emphatic-rise, etc.)
- A library of phonetic bridges built from a per-letter sample script

Library construction at the rate Box A handled the four shipped characters: ~30 min per character × 26 = ~13 hours of compute. Run overnight.
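
The arithmetic behind these voice figures, written out as a quick check. The constants are the numbers quoted in this document; nothing here is measured.

```python
TIMBRE_PARAMS = 16
INTONATION_PARAMS = 16
LETTERS = 26
MINUTES_PER_LIBRARY = 30  # rate quoted for the four shipped characters on Box A

params_per_letter = TIMBRE_PARAMS + INTONATION_PARAMS
total_slots = params_per_letter * LETTERS
total_hours = MINUTES_PER_LIBRARY * LETTERS / 60

print(f"{params_per_letter} params per letter")
print(f"{total_slots} voice-parameter slots across the cast")
print(f"{total_hours:.0f} hours of library construction")
```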

---

## The dependency — letter operative definitions

The "real" version of this work is blocked on the same artifact that blocks SporeOS letters-as-runtime: **operative definitions of the 26 Latin letters as fold-parameter combinators**. The user has flagged twice (per memory) that this artifact must come from the user and must not be Claude-generated.

There are two ways to ship without it:

**Path 1 — cosmetic alphabet now, semantic alphabet later.** Author 26 YAMLs with hand-tuned distinguishing parameters (A is tall and angular, B is rounded and short, etc.) using mnemonic visual associations. Sprites and 3D ship. Voice profiles ship. When the operative definitions land later, the parameter tables get rewritten and all 26 characters re-mint deterministically — no asset re-export pipeline needed because the YAMLs are the source of truth.

**Path 2 — wait for operative definitions.** Block until the user supplies the 26 letter operators. Ship 26 characters that are correctly derived from day one. Slower but more grounded.

**Recommendation: Path 1.** Cosmetic-first lets the studio test the 26-cast scaling behavior, exercises the 3D backend, gives the voice optimizer 832 parameter slots to search, and produces visible content that drives the operative-definition work forward. The semantic upgrade is a YAML rewrite later — the asset pipeline doesn't need to change.

This matches the user's build-first methodology (`feedback_methodology_build_first.md`): substance conceptually first, formalize later. Cosmetic alphabet IS the substance instance.

---

## Phasing

### Phase 1 — YAML cast scaffold (1 day)

- Create `tv/characters/alphabet/{a..z}.yaml`
- Each gets a hand-tuned distinguishing parameter block (silhouette, palette, voice f0/range)
- A short identity sentence per letter (lore hook)
- Mnemonic visual choices: A = pointed/tall, B = rounded/double-bodied, C = curved/open, D = solid/squared, etc.

### Phase 2 — 2D pixel-art mint (½ day)

- Run `mint_humanoid` across all 26 YAMLs
- Output to `tv/assets/alphabet/{letter}/`
- Visual audit: are all 26 distinguishable as alphabet glyphs? If too many collide, retune YAMLs.
- Smoke test: render all 26 in one panel image (`tv/output/alphabet_panel.png`) for instant cohesion check
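
For the panel smoke test, the layout step can be sketched independently of any imaging library. The helper below only computes where each of the 26 tiles would go; the tile size, gap, and column count are assumptions, and actually pasting sprites into `tv/output/alphabet_panel.png` is left to whatever image library the 2D backend already uses.

```python
import math
import string


def panel_layout(names, columns=7, tile=(64, 64), gap=4):
    """Return (panel_size, {name: top_left_xy}) for a simple row-major grid."""
    tile_w, tile_h = tile
    rows = math.ceil(len(names) / columns)
    positions = {
        name: ((i % columns) * (tile_w + gap), (i // columns) * (tile_h + gap))
        for i, name in enumerate(names)
    }
    panel_size = (
        columns * tile_w + (columns - 1) * gap,
        rows * tile_h + (rows - 1) * gap,
    )
    return panel_size, positions


size, positions = panel_layout(list(string.ascii_lowercase))
print(size, positions["a"], positions["z"])
```

With seven columns the 26 letters fill three full rows and five cells of a fourth, so the last row is visibly short; a different column count is an equally reasonable choice.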

### Phase 3 — 3D Blender mint (1 day)

- Run blender_headless against all 26 YAMLs
- Output to `tv/assets/alphabet_3d/{letter}/` (turntable + portrait + walk-loop)
- Cohesion check: panel render for 3D consistency

### Phase 4 — Voice profile + bridge libraries (~13 hrs compute, overnight)

- Build per-letter voice profile (32 params from YAML)
- Sample script: each letter speaks a sentence containing words starting with that letter
- Run the JIT phonetic bridge engine to build per-letter library
- Output to `store/cold/phonetic_bridges/alphabet/{letter}/`

### Phase 5 — Integration into video types (1 day)

- Update `tv/studio/brief_to_project.py` cast-selection logic to allow `cast: alphabet` (auto-rotate through 26)
- Update `engine/parameter_vocab.yaml` with `humanoid(letter_a)` … `humanoid(letter_z)` slot list refinements
- Add a demo project: `letters_argue_about_grammar.project.yaml` — 26-character ensemble cameo
- Render in clean + brainrot variants
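
One way the `cast: alphabet` selection and its auto-rotation might look, as a sketch only. `resolve_cast` and `rotate_cast` are hypothetical names, not functions known to exist in `tv/studio/brief_to_project.py`, and the `letter_a` ids assume the default naming convention from the open decisions. Explicit lists are passed through unchanged so existing casts keep working.

```python
import string
from itertools import cycle, islice

# Assumes the `letter_a` ... `letter_z` id convention.
ALPHABET_CAST = [f"letter_{c}" for c in string.ascii_lowercase]


def resolve_cast(cast):
    """Expand the `alphabet` shorthand; pass explicit lists through unchanged."""
    if cast == "alphabet":
        return list(ALPHABET_CAST)
    if isinstance(cast, str):
        raise ValueError(f"unknown cast shorthand: {cast!r}")
    return list(cast)


def rotate_cast(cast, slots: int):
    """Fill `slots` speaking slots by cycling through the resolved cast in order."""
    return list(islice(cycle(resolve_cast(cast)), slots))


print(rotate_cast("alphabet", 28)[-2:])
print(rotate_cast(["letter_a", "letter_m"], 3))
```

Cycling in order means a project with more than 26 slots wraps back to `letter_a`; a shuffled or weighted rotation would be a different design choice.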

### Phase 6 — Semantic upgrade (BLOCKED — awaiting user operative definitions)

When the user provides the 26 letter operators:
- Replace hand-tuned YAML blocks with derived blocks (parameters computed from each letter's fold-operator signature)
- Re-mint sprites + 3D + voice profiles deterministically
- Validate that the new assets fold-converge under each letter's own central-tree attractor
- This is where the alphabet cast STOPS being cosmetic and starts being a semantic instance of SporeOS letters-as-runtime

---

## Integration with the video template matrix

The alphabet cast adds two things to [docs/video_template_matrix.md](video_template_matrix.md):

**1. New cast option** for every video type:
```yaml
cast: alphabet                    # all 26
# or
cast: [letter_a, letter_e, letter_m, letter_z]   # subset
```

**2. New native video types that only the alphabet enables:**

| Preset | What it is | Platform fit |
|---|---|---|
| `letters_argue` | 26-character ensemble debate sketches | TikTok, Reels |
| `alphabet_top_n` | "Top 10 letters" tier-list humor | Shorts, TikTok |
| `letter_review` | "M reviews a Birkin bag" character-as-reviewer | Reels, TikTok |
| `phonics_explainer` | educational alphabet content | YouTube Kids, Shorts |
| `letter_brainrot` | one letter per short, full chaos register | TikTok, Shorts |

These are revenue-targeted: phonics-explainer alone is a viable Gumroad product (educational pack). Letter-brainrot is uniquely defensible — Sora can't produce 26 character-persistent letter shorts without drift.

---

## Scope summary

| Item | Effort | Output |
|---|---|---|
| 26 YAMLs | 1 day | `tv/characters/alphabet/*.yaml` |
| 2D mint (all 26) | ½ day | 26 × 5 mouth + walk-cycle PNGs |
| 3D mint (all 26) | 1 day | 26 × turntable + portrait + walk-loop |
| Voice + bridges | overnight compute | per-letter voice profile + bridge library |
| Integration | 1 day | brief_to_project + 5 demo projects |
| **Total Phase 1–5** | **~3.5 working days + 1 overnight** | **Full alphabet cast, 2D + 3D + voice** |
| Phase 6 (semantic) | BLOCKED | Awaiting user letter operative definitions |

---

## Open decisions

1. **Path 1 (cosmetic now) vs Path 2 (wait for operative definitions)?** Default: Path 1. The user can override.
2. **Naming convention.** `letter_a` vs `alphabet_a` vs `a` for the YAML id and asset folder. Default: `letter_a` (clearer when grepped, matches `humanoid(letter_a)` slot syntax in parameter_vocab).
3. **Should each letter have its own template, or all reuse `humanoid`?** Default: reuse humanoid, since 26 separate templates would be over-abstraction for the cosmetic phase. Phase 6 may justify a `letter` template that inherits from humanoid and adds a `letter_operator` field.
4. **3D first or 2D first?** Default: 2D first because pixel_pillow is shipped and 3D is at MVP — 2D produces the panel-image cohesion check fastest, which de-risks the rest.

---

*Recorded 2026-04-25. Captures user request: "I wanted to add characters for each letter of the alphabet. And have those be 2D and 3D rendered as part of the videos as they're being made."*
