For agent builders

Let agents create speech without broad account access

TextToSpeechSkills gives LLM agents a narrow, reviewable speech workflow: validate markup, choose an approved voice template, check credit use, create a job, and return an audio URL.

Who is this for?

TextToSpeechSkills gives AI agents a safe way to create voice output without broad account access. The MCP server exposes focused tools for validating expression markup, selecting approved voice templates, checking credit use, creating speech jobs, and returning audio URLs. This keeps agent behavior easier to review than raw HTTP calls. Teams can start with a no-code LLM app setup, then use the same API from production systems when voice output becomes part of customer-facing workflows.

Easy LLM setup

LLM-ready even for non-technical teams

Install the MCP server, add a scoped key, and tell your LLM app which template names it may use. No custom integration is needed for first tests.

Read setup guide
01 Create a scoped key
02 Install MCP
03 Choose a voice template
04 Generate audio from chat

Tool calls you can review

Agents use clear speech tools instead of improvising HTTP calls or handling hidden settings.

Templates protect quality

Approved voice templates keep agent output consistent across users and workflows.

Usage stays visible

Credit previews, job states, and workspace billing make automated audio easier to manage.

When this helps

Agent builders, automation teams, and AI product teams usually need a repeatable path for writing, review, generation, billing, and reuse. Three things matter most here: tool calls you can review, templates that protect quality, and usage that stays visible. Those are the points where voice becomes part of real work instead of a one-off export.

How the workflow works

Start with readable text, add expression tags when tone matters, choose an approved voice template, and create a speech job through the UI, API, or MCP. The same pattern works for AI agent voice output, LLM text-to-speech tools, and MCP speech generation, so humans and LLM apps can share one process without exposing internal routing or credentials.
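The first steps of that workflow can be sketched in a few lines of Python. This is an illustration only: it assumes expression tags are written in square brackets, as in the sample request on this page, and the allow-list, function names, and error message are hypothetical rather than part of the TextToSpeechSkills API.

```python
import re

# Hypothetical allow-list. The two tags are taken from the page's sample request.
ALLOWED_TAGS = {"quiet", "loud and angry"}

# Assumption: an expression tag is any text wrapped in square brackets.
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def find_tags(text: str) -> list[str]:
    """Return the contents of every [bracketed] tag, in order of appearance."""
    return [tag.strip() for tag in TAG_PATTERN.findall(text)]


def unknown_tags(text: str, allowed: set[str] = ALLOWED_TAGS) -> list[str]:
    """Return the tags in the text that are not on the allow-list."""
    return [tag for tag in find_tags(text) if tag not in allowed]


def build_request(text: str, voice_template: str) -> dict:
    """Build a request body shaped like the page's JSON sample.

    Raises ValueError if the text uses a tag that is not approved.
    """
    bad = unknown_tags(text)
    if bad:
        raise ValueError(f"unapproved expression tags: {bad}")
    return {
        "text": text,
        "voice_template": voice_template,
        "generation_mode": "instant",
        "format": "mp3",
    }
```

Checking tags before a job is created keeps the failure local and readable, which matters when the caller is an agent rather than a person.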

Before you roll it out

Decide which templates are approved, which expression tags are allowed, who can create workspace keys, and which usage limits are acceptable. Those choices keep automated voice generation useful and contained as usage grows from the first paid Test plan through Pro, Scale, and Business.
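One way to make those rollout decisions reviewable is to write them down as data and check each job against them. The sketch below is a hypothetical local policy check, not a product feature: the class, field names, and the per-job credit limit are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutPolicy:
    """Hypothetical record of the decisions a team makes before rollout."""
    approved_templates: frozenset[str]
    allowed_tags: frozenset[str]
    max_credits_per_job: int


def check_job(
    policy: RolloutPolicy,
    voice_template: str,
    tags: list[str],
    estimated_credits: int,
) -> list[str]:
    """Return a list of policy problems; an empty list means the job is allowed."""
    problems = []
    if voice_template not in policy.approved_templates:
        problems.append(f"template not approved: {voice_template}")
    for tag in tags:
        if tag not in policy.allowed_tags:
            problems.append(f"tag not allowed: {tag}")
    if estimated_credits > policy.max_credits_per_job:
        problems.append(
            f"estimated credits {estimated_credits} exceed limit "
            f"{policy.max_credits_per_job}"
        )
    return problems


# Example values only; the template name comes from the page's sample request.
policy = RolloutPolicy(
    approved_templates=frozenset({"vt_calm_narrator_v1"}),
    allowed_tags=frozenset({"quiet"}),
    max_credits_per_job=100,
)
```

Returning every problem at once, rather than stopping at the first, gives a reviewer the full picture of why an agent's request was refused.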

Common questions

What teams usually ask before starting

These are the practical details that matter before a team adds speech generation to a real workflow.

Who should use Voice Output for AI Agents?

Voice Output for AI Agents is for agent builders, automation teams, and AI product teams that want generated speech that is easy to review, consistent across prompts, and simple to connect to LLM tools. The core workflow combines expression tags, voice templates, credit previews, and job-based generation.

Can a non-technical user connect this to an LLM app?

Yes. Install the MCP server, add a scoped key, and tell your LLM app which template names it may use. No custom integration is needed for first tests. The setup guide keeps the first path short while still giving developers a clean API when the workflow moves into a product backend.

How does pricing stay predictable?

Every paid plan uses credits. Teams can add credit packs when needed, and workspaces on Pro and higher add central billing for $2 per user per month.

API playground

Plain JSON in, speech job out

{
  "text": "[quiet] hello. [loud and angry] how are you?",
  "voice_template": "vt_calm_narrator_v1",
  "generation_mode": "instant",
  "format": "mp3"
}
202 queued for polling
200 audio ready
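The two status codes suggest a simple polling loop on the client side. The following is a minimal sketch under stated assumptions: the page does not document the endpoint, the polling interval, or the response body, so the HTTP call is passed in as a function, and the `audio_url` field used in the usage note is hypothetical.

```python
import time
from typing import Callable


def wait_for_audio(
    poll: Callable[[], tuple[int, dict]],
    max_attempts: int = 10,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call poll() until it reports 200 (audio ready) and return the body.

    poll() is supplied by the caller and returns (status_code, body).
    202 means the job is still queued; any other status is treated as an error.
    """
    for attempt in range(max_attempts):
        status, body = poll()
        if status == 200:
            return body
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"job not ready after {max_attempts} attempts")
```

In practice `poll` would wrap whatever HTTP client the team already uses and return its status code and parsed JSON; the caller could then read a field such as `body["audio_url"]`, if the real response provides one. Injecting `poll` and `sleep` also keeps the loop easy to test without network access.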

MCP install

Agent tools included at launch

Claude Desktop: pnpm --package texttospeechskills dlx tts-skills-mcp
Codex: pnpm --package texttospeechskills dlx tts-skills-mcp
Cursor: pnpm --package texttospeechskills dlx tts-skills-mcp
Skills helper: pnpm --package texttospeechskills dlx tts-skills tags
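MCP clients such as Claude Desktop typically register a server as a command plus arguments under an `mcpServers` key in a JSON config file. The sketch below generates such an entry from the install command listed above. The server label and the `TTS_SKILLS_API_KEY` environment variable are assumptions for illustration; this page does not say how the scoped key is passed, so check the setup guide for the real name.

```python
import json


def mcp_server_entry(scoped_key: str) -> dict:
    """Build an mcpServers entry that launches the MCP server through pnpm."""
    return {
        "mcpServers": {
            # The label is arbitrary; clients use it only as a display name.
            "texttospeechskills": {
                "command": "pnpm",
                "args": ["--package", "texttospeechskills", "dlx", "tts-skills-mcp"],
                # Hypothetical variable name; not confirmed by this page.
                "env": {"TTS_SKILLS_API_KEY": scoped_key},
            }
        }
    }


# Print the entry so it can be pasted into the client's config file.
print(json.dumps(mcp_server_entry("<your scoped key>"), indent=2))
```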