Docs
Text-to-speech docs for API, MCP, skills, and secure setup
Use the same speech workflow from the browser UI, your product backend, or an LLM app. Start with scoped keys, validated expression tags, reusable voice templates, async speech jobs, and installable MCP tools.
Use jobs for every production workflow
Speech generation should not depend on a browser waiting for one long request. The docs explain how to create jobs, poll for status, receive audio URLs, and keep longer scripts in the background so your product UI stays responsive while the server handles billing, storage, retries, and delivery.
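The create-then-poll flow described above can be sketched as follows. This is a minimal illustration, not the service's actual client: the status names (`queued`, `processing`, `succeeded`, `failed`), the `audio_url` and `error` fields, and the `get_job` callable are all assumptions made for the example. The fetcher is passed in so the loop can be tried without a network connection.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical status names; the real API may use different ones.
STATUS_SUCCEEDED = "succeeded"
STATUS_FAILED = "failed"


def wait_for_audio(
    get_job: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a speech job until it finishes, then return its audio URL.

    `get_job` stands in for whatever call fetches job status on your backend.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = get_job(job_id)
        status = job.get("status")
        if status == STATUS_SUCCEEDED:
            return job["audio_url"]
        if status == STATUS_FAILED:
            raise RuntimeError(f"speech job {job_id} failed: {job.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"speech job {job_id} did not finish in {timeout_s}s")
        # Still queued or processing: wait before asking again.
        sleep(interval_s)
```

Run the loop on your backend or in a worker rather than in the browser, so the UI only needs to ask your own server whether the audio is ready.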
Keep LLM access narrow
The MCP and skills setup gives LLM apps focused tools for validating markup, listing approved templates, previewing credit use, creating jobs, and returning audio. That is enough for useful automation without giving a chat session broad account access or hidden credentials.
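One way to picture narrow access is a dispatcher that only runs tools on a fixed allowlist. The sketch below is illustrative and does not use any MCP SDK: the tool names mirror the capabilities listed above but are hypothetical, as are `dispatch_tool` and the `handlers` mapping.

```python
from typing import Any, Callable, Dict

# Hypothetical tool names based on the capabilities described above.
ALLOWED_TOOLS = frozenset({
    "validate_markup",
    "list_templates",
    "preview_credits",
    "create_job",
    "get_audio",
})


def dispatch_tool(
    handlers: Dict[str, Callable[..., Any]],
    name: str,
    arguments: Dict[str, Any],
) -> Any:
    """Run a tool only if it is on the allowlist; reject everything else."""
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not exposed to LLM apps")
    handler = handlers.get(name)
    if handler is None:
        raise LookupError(f"tool {name!r} is allowed but has no handler")
    return handler(**arguments)
```

Because the allowlist is checked before any handler lookup, a registered handler for an account-level action still cannot be reached from a chat session.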
Make templates the stable contract
Instead of repeating subjective voice instructions in every API call, your app sends text plus a template ID. The docs cover how templates should be named, versioned, approved, and shared across workspaces so narrators, characters, support voices, and course instructors stay recognizable.
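A request built around a template might look like the sketch below. The field names `text` and `template_id`, and the `name@vN` ID format, are assumptions chosen for illustration; check the template reference for the naming and versioning rules that actually apply.

```python
import re
from typing import Dict

# Assumed ID format for this example only: lowercase slug, "@v", version number.
TEMPLATE_ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*@v[1-9][0-9]*")


def build_speech_request(text: str, template_id: str) -> Dict[str, str]:
    """Build a job payload that names a template instead of describing a voice."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if not TEMPLATE_ID_PATTERN.fullmatch(template_id):
        raise ValueError(f"template ID {template_id!r} is not in name@vN form")
    return {"text": text, "template_id": template_id}
```

Pinning a version in the ID means a later change to the template does not silently alter audio that an existing integration produces.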
Ship with server-side safety
API keys, OAuth, payment state, private audio storage, service routing, and usage ledger updates belong on the backend. The public UI only needs scoped actions and safe configuration, which makes setup easier for users and reduces the chance of accidentally exposing a secret.
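A simple way to keep secrets off the client is to build the UI's configuration from an explicit allowlist rather than by removing known secrets. The sketch below assumes hypothetical setting names (`api_base_url`, `max_script_chars`, `enabled_features`) and a hypothetical `public_config` helper.

```python
from typing import Any, Dict

# Hypothetical setting names; only these are ever sent to the browser.
PUBLIC_KEYS = ("api_base_url", "max_script_chars", "enabled_features")


def public_config(settings: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the allowlisted settings that are safe for the public UI."""
    return {key: settings[key] for key in PUBLIC_KEYS if key in settings}
```

With an allowlist, a new secret added to the backend settings stays private by default, because it is never copied unless someone adds it to `PUBLIC_KEYS`.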