No public engine choices to manage
Your app sends the outcome it needs. Routing stays server-side so customer-facing screens stay simple.
Built for shipping teams
TextToSpeechSkills focuses on the parts developers repeatedly ask for after the demo works: clean API calls, predictable jobs, scoped keys, workspace billing, reusable templates, and agent integrations that can be installed quickly.
TextToSpeechSkills is a developer-first text-to-speech platform for teams that want clean API access, expression markup, reusable voice templates, MCP tools, installable skills, async jobs, workspace billing, and visible credit usage. The product is designed around outcomes instead of public engine selection. A team can create speech in the UI, call the API from a product, or connect an LLM app through MCP. This makes it a practical option for builders who want voice output that is easier to automate, review, and budget.
Easy LLM setup
Non-technical users can start from the guide: create a key, copy the MCP command, choose a voice template, and ask their LLM app to generate audio.
Credits, optional packs, and workspace billing make speech generation easier to forecast as usage grows.
Human builders and LLM agents can share the same validated speech workflow from launch.
Builders comparing developer-first text-to-speech platforms usually need a repeatable path for writing, review, generation, billing, and reuse. The most important jobs here are keeping engine choices out of public view, matching billing to usage, and using the API, MCP, and skills together. Those are the moments where voice becomes part of real work instead of a one-off export.
Start with readable text, add expression tags when tone matters, choose an approved voice template, and create a speech job through the UI, API, or MCP. The same pattern applies whether a team is evaluating a text-to-speech alternative, a developer TTS service, or a voice API alternative, which makes it easier for humans and LLM apps to share one process without exposing internal routing or credentials.
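The steps above can be sketched as a small helper that assembles a speech job body. This is a minimal sketch, not the official client: the field names are copied from the API playground sample on this page, while the function name `build_speech_job` and its defaults are assumptions made for illustration.

```python
import json


def build_speech_job(text, voice_template, generation_mode="instant", audio_format="mp3"):
    """Return a request body shaped like the API playground sample on this page.

    The helper name and default values are illustrative assumptions.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "text": text,  # readable text, with optional bracketed expression tags
        "voice_template": voice_template,
        "generation_mode": generation_mode,
        "format": audio_format,
    }


body = build_speech_job(
    "[quiet] hello. [loud and angry] how are you?",
    "vt_calm_narrator_v1",
)
print(json.dumps(body, indent=2))
```

Because the body is a plain dictionary, the same structure can be reviewed by a person, stored with a job, or produced by an LLM tool call.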
Decide which templates are approved, which expression tags are allowed, who can create workspace keys, and which usage limits are acceptable. Those choices keep automated voice generation useful without letting it sprawl from the first paid Test plan through Pro, Scale, and Business usage.
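Those decisions can be written down as a small policy check that runs before a job is created. The sketch below is hypothetical: the policy values, the character cap used as a usage limit, and the assumption that expression tags are bracketed phrases (as in the playground sample) are placeholders, not documented platform behavior.

```python
import re

# Assumption: expression tags are bracketed phrases such as [quiet].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")

# Hypothetical workspace policy; replace with the team's own choices.
APPROVED_TEMPLATES = {"vt_calm_narrator_v1"}
ALLOWED_TAGS = {"quiet", "loud and angry"}
MAX_CHARS = 500  # illustrative usage limit per request


def check_request(text, voice_template):
    """Return a list of policy violations; an empty list means the request is allowed."""
    problems = []
    if voice_template not in APPROVED_TEMPLATES:
        problems.append(f"template not approved: {voice_template}")
    for tag in TAG_PATTERN.findall(text):
        if tag.strip().lower() not in ALLOWED_TAGS:
            problems.append(f"tag not allowed: [{tag}]")
    if len(text) > MAX_CHARS:
        problems.append(f"text longer than {MAX_CHARS} characters")
    return problems


print(check_request("[quiet] hello. [loud and angry] how are you?", "vt_calm_narrator_v1"))
print(check_request("[whisper] hi", "vt_other"))
```

Returning a list of problems instead of raising on the first one lets a reviewer or an LLM agent see everything that needs fixing in one pass.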
Common questions
These are the practical details that matter before a team adds speech generation to a real workflow.
Builders comparing developer-first text-to-speech platforms should use this page when they want generated speech that is easy to review, consistent across prompts, and simple to connect to LLM tools. The core workflow combines expression tags, voice templates, credit previews, and job-based generation.
Non-technical users can start from the guide: create a key, copy the MCP command, choose a voice template, and ask their LLM app to generate audio. The setup guide keeps the first path short while still giving developers a clean API when the workflow moves into a product backend.
Every paid plan uses credits. Teams can add credit packs when needed, and workspaces on Pro and higher add central billing for $2 per user per month.
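The per-user fee is simple enough to forecast with a few lines of arithmetic. This sketch covers only the $2 per user per month workspace fee stated above. It assumes "Pro and higher" means the Pro, Scale, and Business plans named earlier on this page, and it leaves out plan prices and credit packs because the page does not list them.

```python
SEAT_FEE_USD = 2  # per user per month, as stated on this page

# Assumption: "Pro and higher" maps to these plan names.
PLANS_WITH_CENTRAL_BILLING = {"Pro", "Scale", "Business"}


def monthly_seat_fee(plan, users):
    """Return the monthly workspace billing fee in USD for a plan and user count."""
    if users < 0:
        raise ValueError("users must not be negative")
    if plan not in PLANS_WITH_CENTRAL_BILLING:
        return 0  # assumption: no central billing fee below Pro
    return SEAT_FEE_USD * users


print(monthly_seat_fee("Pro", 5))
print(monthly_seat_fee("Test", 5))
```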
API playground
{
  "text": "[quiet] hello. [loud and angry] how are you?",
  "voice_template": "vt_calm_narrator_v1",
  "generation_mode": "instant",
  "format": "mp3"
}
MCP install
pnpm --package texttospeechskills dlx tts-skills-mcp
pnpm --package texttospeechskills dlx tts-skills tags
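For teams calling the API from a product backend, the playground payload could be wrapped in an HTTP request roughly like this. Everything about the transport here is an assumption: the page does not state the endpoint URL or the authentication scheme, so `API_URL` is a placeholder and the bearer-token header is a common convention, not a documented one. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical endpoint: the real URL is not given on this page.
API_URL = "https://api.example.invalid/v1/speech-jobs"


def make_request(api_key, payload):
    """Build (but do not send) a POST request carrying a speech job body."""
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        method="POST",
        headers={
            # Assumption: scoped keys are passed as bearer tokens.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


payload = {
    "text": "[quiet] hello. [loud and angry] how are you?",
    "voice_template": "vt_calm_narrator_v1",
    "generation_mode": "instant",
    "format": "mp3",
}
request = make_request("YOUR_WORKSPACE_KEY", payload)
# urllib.request.urlopen(request) would send it once a real URL is configured.
print(request.get_method(), request.full_url)
```

Keeping the key in a server-side environment variable instead of in client code fits the page's goal of not exposing credentials.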