Open source · for builders + agents

The pipeline
as small parts.

Every step of the network ships in the repo. Use the parts stand-alone, fork them, or embed them in your own product. The website is one way to consume the rails; this is the other.

open source · audit-friendly · runs locally on Apple Silicon

Product · voice clone

Your NotebookLM video, in your own voice

Send a 30s recording over chat — keep the video, swap the voice. First job free.

Open guide →

01 · python

publish.py

Flow A — the website pipeline: download → transcribe → diarize → translate → neural TTS → publish to the catalog.

github ↗

$ python scripts/pipeline/publish.py <youtube_url>
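The stage order in Flow A can be pictured as a chain of small functions that each take an episode record and pass it on. The sketch below only illustrates that shape; it is not the code in publish.py. The `Episode` class, `make_stub`, and `run_pipeline` are hypothetical names, and each stage is a placeholder that just records that it ran.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Stage order as listed for Flow A on this page.
STAGES = ["download", "transcribe", "diarize", "translate", "tts", "publish"]


@dataclass
class Episode:
    """Hypothetical record passed from stage to stage."""
    url: str
    artifacts: Dict[str, str] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)


Stage = Callable[[Episode], Episode]


def make_stub(name: str) -> Stage:
    """Return a placeholder stage that only records that it ran."""
    def stage(ep: Episode) -> Episode:
        ep.artifacts[name] = f"<{name} output>"
        ep.log.append(name)
        return ep
    return stage


def run_pipeline(url: str, stages: Optional[Dict[str, Stage]] = None) -> Episode:
    """Run every stage in order; swap in real stages by passing a dict."""
    stages = stages or {name: make_stub(name) for name in STAGES}
    ep = Episode(url=url)
    for name in STAGES:
        ep = stages[name](ep)
    return ep


episode = run_pipeline("https://example.com/watch?v=VIDEO_ID")
print(" -> ".join(episode.log))
```

Keeping each stage behind the same signature is what would let a single part be used stand-alone or replaced, for example by swapping only the `tts` entry.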

02 · python

publish_clone.py

Flow B / Plan B — cross-lingual voice clone. Each speaker's own voice, speaking Mandarin, via Qwen3-TTS in-context cloning.

github ↗

$ python scripts/pipeline/publish_clone.py <url>

03 · python

publish_clone_vc.py

Flow C / Plan C — clean native TTS + seed-vc timbre transfer. Clear, standard pronunciation in the speaker's voice; the hard multi-speaker case.

github ↗

$ python scripts/pipeline/publish_clone_vc.py <vid> --audio <wav>

04 · python

glossary.py

Keeps crypto / AI / web3 / startup proper nouns in English during translation (instruction + source masking + Chinese→English repair).

github ↗

>>> from glossary import repair, instruction
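The three techniques named above (an instruction for the translator, source masking, and Chinese→English repair) can be sketched roughly as follows. This is an illustration of the general idea only, not the actual API of glossary.py: the `GLOSSARY` contents and the functions `build_instruction`, `mask_source`, `unmask`, and `repair_translation` are all hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical glossary: English proper noun -> Chinese renderings to undo.
GLOSSARY: Dict[str, List[str]] = {
    "Ethereum": ["以太坊"],
    "OpenAI": ["开放人工智能"],
}


def build_instruction(glossary: Dict[str, List[str]]) -> str:
    """Prompt line asking the translator to leave the terms untouched."""
    return "Keep these proper nouns in English: " + ", ".join(sorted(glossary))


def mask_source(text: str, glossary: Dict[str, List[str]]) -> Tuple[str, Dict[str, str]]:
    """Replace each known term with a placeholder token before translation."""
    mapping: Dict[str, str] = {}
    # Longest terms first, so a short term never splits a longer one.
    for i, term in enumerate(sorted(glossary, key=len, reverse=True)):
        token = f"[[T{i}]]"
        pattern = re.compile(r"\b" + re.escape(term) + r"\b")
        if pattern.search(text):
            text = pattern.sub(token, text)
            mapping[token] = term
    return text, mapping


def unmask(text: str, mapping: Dict[str, str]) -> str:
    """Put the original terms back after translation."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    return text


def repair_translation(text: str, glossary: Dict[str, List[str]]) -> str:
    """Swap any Chinese rendering that slipped through back to English."""
    for term, renderings in glossary.items():
        for rendering in renderings:
            text = text.replace(rendering, term)
    return text


masked, mapping = mask_source("Ethereum and OpenAI ship fast.", GLOSSARY)
print(masked)
print(unmask(masked, mapping))
```

Masking protects terms the translator can see, while the repair pass is a fallback for terms that were translated anyway; the instruction line reduces how often either is needed.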

For agents

One command, multilingual.

The pipeline is scriptable end-to-end. Point an agent at a URL and it returns a publishable, Chinese-dubbed episode with transcripts and metadata.

# Plan C: clean voice conversion
$ python scripts/pipeline/publish_clone_vc.py <vid> --audio out.wav
→ interviews.js + clones.json updated
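An agent would typically drive that command from code. A minimal sketch, assuming the script path shown in the Plan C card on this page and assuming the script signals failure through a non-zero exit code (not confirmed here); `build_command` and `run_plan_c` are hypothetical helper names.

```python
import subprocess
import sys
from typing import List

# Script path as it appears in the Plan C command on this page.
SCRIPT = "scripts/pipeline/publish_clone_vc.py"


def build_command(video_id: str, audio_path: str) -> List[str]:
    """Assemble the argument list; a list avoids shell-quoting problems."""
    return [sys.executable, SCRIPT, video_id, "--audio", audio_path]


def run_plan_c(video_id: str, audio_path: str) -> int:
    """Run the script from the repo root and return its exit code.

    Assumption: a non-zero exit code means the publish step failed.
    """
    result = subprocess.run(build_command(video_id, audio_path), check=False)
    return result.returncode


# Show the command without executing it.
print(" ".join(build_command("VIDEO_ID", "out.wav")[1:]))
```

Calling `run_plan_c("VIDEO_ID", "out.wav")` from the repository root would execute the job; the agent can then decide whether to retry or report based on the returned code.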

For protocols

Every episode is addressable.

Each episode has a stable URL — /interview/<id> — with audio, bilingual transcript, and metadata. Embed it in your own client.

GET /interview/9nnHC66MBqE → { audio, transcript, summary }
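A client could consume that endpoint along these lines. This is a sketch under assumptions: the host `https://example.com` is a placeholder, and the response is assumed to be a JSON object with the three fields shown above. `episode_url`, `parse_episode`, and `fetch_episode` are hypothetical names.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://example.com"  # placeholder host, not the real site
EXPECTED_FIELDS = {"audio", "transcript", "summary"}


def episode_url(episode_id: str, base: str = BASE_URL) -> str:
    """Build the stable episode URL: /interview/<id>."""
    return f"{base.rstrip('/')}/interview/{quote(episode_id, safe='')}"


def parse_episode(body: str) -> dict:
    """Decode the response and check the expected fields are present.

    Assumption: the endpoint answers with a JSON object.
    """
    data = json.loads(body)
    missing = EXPECTED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return data


def fetch_episode(episode_id: str, base: str = BASE_URL) -> dict:
    """GET the episode and return its parsed payload."""
    with urlopen(episode_url(episode_id, base), timeout=10) as resp:
        return parse_episode(resp.read().decode("utf-8"))


print(episode_url("9nnHC66MBqE"))
```

Separating URL construction and parsing from the network call keeps both easy to test without a live server.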
