Deepgram MCP Server for Claude
Transcribe audio, synthesize speech, and run audio intelligence directly from Claude with the official Deepgram MCP server — dynamic tool discovery fetches new capabilities from Deepgram's API at runtime without requiring package upgrades.
Open the source and read safety notes before installing.
Safety notes
- Audio files and transcription payloads are sent to Deepgram's cloud API for processing — do not transcribe audio containing highly sensitive PII without reviewing Deepgram's data retention policies.
- Text-to-speech outputs are returned as audio data via the API; no files are written to disk unless you explicitly save them.
Privacy notes
- Audio content (speech, recordings) is transmitted to Deepgram's servers for transcription and synthesis — review Deepgram's privacy policy for data handling and retention.
- Your `DEEPGRAM_API_KEY` is passed as an environment variable — treat it as a secret.
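One way to act on the "treat it as a secret" advice is to read the key from the environment and never log it in full. The sketch below is illustrative only: `load_api_key` and `redact` are hypothetical helper names, not part of the `deepgram-mcp` package, and the placeholder check assumes the `your-api-key` value shown in the config examples on this page.

```python
import os

# Placeholder value used in the config examples on this page.
PLACEHOLDER = "your-api-key"


def load_api_key(env=None):
    """Return DEEPGRAM_API_KEY from the given mapping (default: os.environ).

    Raises RuntimeError if the variable is missing, empty, or still the placeholder.
    """
    env = os.environ if env is None else env
    key = env.get("DEEPGRAM_API_KEY", "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError("Set DEEPGRAM_API_KEY to a real key before starting the server")
    return key


def redact(key, visible=4):
    """Mask a secret for log output, keeping only the last `visible` characters."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Calling `redact(load_api_key())` before printing diagnostics keeps the full key out of terminal scrollback and shared chat transcripts.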
Prerequisites
- A Deepgram API key (free tier at console.deepgram.com).
- Python with `pip` available; install the package with `pip install deepgram-mcp`.
- An MCP client such as Claude Code or Claude Desktop.
Schema details
- Install type: cli
- Troubleshooting: No
- Scope: —
- Source repo: —
- Estimated setup: 5 minutes
- Difficulty: beginner
- Website: https://deepgram.com
- Disclosure: Deepgram is a commercial speech AI provider. The MCP server is officially maintained by Deepgram.
Full copyable content
{
  "mcpServers": {
    "deepgram": {
      "command": "deepgram-mcp",
      "env": {
        "DEEPGRAM_API_KEY": "your-api-key"
      }
    }
  }
}
About this resource
Overview
The Deepgram MCP Server is the official Model Context Protocol server from Deepgram, the AI speech API company. It gives Claude access to Deepgram's speech-to-text transcription, text-to-speech synthesis, and audio intelligence capabilities. One notable architectural choice is that the tool list is fetched from Deepgram's API when the server runs, so new capabilities appear as Deepgram releases them without requiring a package upgrade. Licensed under MIT.
Key capabilities
- Speech-to-text — transcribe audio from URLs or uploaded files using Deepgram's Nova and Whisper-based models; supports 30+ languages.
- Text-to-speech — synthesize natural speech from text using Deepgram's Aura voice library.
- Audio intelligence — run summarization, topic detection, sentiment analysis, and intent recognition on audio content.
- Dynamic tool discovery — tools are fetched from Deepgram's API at startup, so the server always exposes the latest capabilities without package upgrades.
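The dynamic tool discovery pattern described above can be sketched in a few lines. This is a generic illustration, not Deepgram's implementation: `ToolSpec`, `ToolRegistry`, `fake_remote_listing`, and the tool names are all hypothetical, and the fetch function is a stand-in for whatever HTTP call the real server makes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class ToolSpec:
    """Minimal description of one tool exposed to the MCP client."""
    name: str
    description: str


class ToolRegistry:
    """Holds whichever tools the remote listing returned at the last refresh."""

    def __init__(self, fetch_tools: Callable[[], Iterable[dict]]):
        self._fetch_tools = fetch_tools
        self._tools: Dict[str, ToolSpec] = {}

    def refresh(self) -> None:
        # Rebuild the table from the remote listing, so tools added
        # upstream appear here without changing this code.
        specs = {}
        for raw in self._fetch_tools():
            spec = ToolSpec(name=raw["name"], description=raw.get("description", ""))
            specs[spec.name] = spec
        self._tools = specs

    def names(self) -> List[str]:
        return sorted(self._tools)


def fake_remote_listing():
    # Stand-in for a network request; these tool names are hypothetical.
    return [
        {"name": "transcribe", "description": "Speech-to-text"},
        {"name": "speak", "description": "Text-to-speech"},
    ]


registry = ToolRegistry(fake_remote_listing)
registry.refresh()  # called once at startup in this sketch
print(registry.names())
```

The trade-off of this design is that the available tools depend on a network call at startup, so the exposed tool set can change between sessions without any local change.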
How it compares
| Server | STT | TTS | Audio intelligence | Dynamic tools | Auth |
|---|---|---|---|---|---|
| Deepgram MCP | Yes | Yes | Yes | Yes | API key |
| Groq MCP | Yes (Whisper) | Yes | No | No | API key |
| OpenAI MCP | Yes (Whisper) | Yes | No | No | API key |
| AssemblyAI MCP | Yes | No | Yes | No | API key |
Because the tool list is fetched from Deepgram's API when the server starts, new API features become available in Claude without a package update. Among the servers compared above, Deepgram's is the only one listed with dynamic tools.
Installation
Install the package
pip install deepgram-mcp
Claude Code
claude mcp add deepgram -e DEEPGRAM_API_KEY=your-api-key -- deepgram-mcp
Claude Desktop
{
  "mcpServers": {
    "deepgram": {
      "command": "deepgram-mcp",
      "env": {
        "DEEPGRAM_API_KEY": "your-api-key"
      }
    }
  }
}
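If the Claude Desktop config already lists other servers, the `deepgram` entry has to be merged in rather than pasted over the file. A minimal sketch of that merge follows, assuming the config has already been loaded into a dictionary; `add_deepgram_server` is a hypothetical helper name, and locating or writing the config file is left out on purpose.

```python
import copy
import json


def add_deepgram_server(config, api_key):
    """Return a copy of an MCP client config with a `deepgram` entry added.

    Existing servers under `mcpServers` are preserved; the input is not mutated.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["deepgram"] = {
        "command": "deepgram-mcp",
        "env": {"DEEPGRAM_API_KEY": api_key},
    }
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
updated = add_deepgram_server(existing, "your-api-key")
print(json.dumps(updated, indent=2))
```

Returning a copy keeps the original dictionary available for comparison before anything is written back to disk.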
SSE / HTTP mode
deepgram-mcp --transport sse --port 8000
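After starting the server in SSE mode, a quick way to confirm that something is listening on the chosen port is a plain TCP connection attempt. The sketch below uses only the standard library; `port_is_open` is a hypothetical helper, and the `127.0.0.1` host is an assumption, since this page does not say which interface the server binds to.

```python
import socket


def port_is_open(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A successful connection only shows that a process accepted the socket; it does not verify that the process is the Deepgram MCP server or that the API key is valid.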
Requirements
- A Deepgram API key (free tier at console.deepgram.com).
- Python with `pip` (`pip install deepgram-mcp`).
- An MCP client (Claude Code or Claude Desktop).
Security
- API key authentication — generate a project-scoped key from the Deepgram console.
- Audio is processed server-side by Deepgram; no audio files are stored locally by default.
Source Verification Notes
Verified on 2026-06-18:
- Official GitHub repository `deepgram/mcp` (MIT) documents the `deepgram-mcp` pip package, `DEEPGRAM_API_KEY` configuration, Claude Code install command, dynamic tool discovery from Deepgram's API, the `--transport sse` HTTP mode, and the Deepgram CLI integration.
- Deepgram documentation at `docs.deepgram.com/docs/mcp` (HTTP 200) covers the MCP server setup, supported models, and audio intelligence capabilities.
- Claude Code MCP documentation at `code.claude.com/docs/en/mcp` describes the stdio connector pattern used above.
How it compares
Deepgram MCP Server for Claude side by side with 3 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.
| Field | Deepgram MCP Server for Claude | ElevenLabs MCP Server | Groq MCP Server for Claude | FunASR MCP Server |
|---|---|---|---|---|
| Description | Transcribe audio, synthesize speech, and run audio intelligence directly from Claude with the official Deepgram MCP server — dynamic tool discovery fetches new capabilities from Deepgram's API at runtime without requiring package upgrades. | Official ElevenLabs MCP server for generating speech, designing voices, cloning voices, transcribing audio, creating sound effects, and working with conversational audio agents through the ElevenLabs API. | Query Groq's ultra-fast inference models from Claude — vision, text-to-speech, speech-to-text, batch processing, and agentic compound-beta tools with web search and code execution — using the official Groq Model Context Protocol server. | MCP server example from FunASR that lets Claude transcribe local audio files with local speech recognition, automatic language handling, timestamps, and speaker labels when available. |
| Trust | | | | |
| Install risk | Review first | Review first | Review first | Review first |
| Notes | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ |
| Category | mcp | mcp | mcp | mcp |
| Source | source-backed | source-backed | source-backed | source-backed |
| Author | Deepgram | ElevenLabs | Groq | FunASR |
| Added | 2026-06-18 | 2026-06-06 | 2026-06-18 | 2026-06-06 |
| Platforms | Claude Code, Codex, Cursor, Claude Desktop | Claude Code, Claude Desktop | Claude Code, Claude Desktop | Claude Code, Claude Desktop |
| Source repo | — | — | — | — |
| Safety notes | ✓ Audio files and transcription payloads are sent to Deepgram's cloud API for processing — do not transcribe audio containing highly sensitive PII without reviewing Deepgram's data retention policies. Text-to-speech outputs are returned as audio data via the API; no files are written to disk unless you explicitly save them. | ✓ ElevenLabs MCP Server can call paid ElevenLabs API endpoints; text-to-speech, voice design, voice cloning, audio isolation, transcription, sound generation, music, and agent workflows can consume account credits. Voice cloning and voice conversion can create realistic synthetic speech, so require documented consent and review before processing a person's voice or publishing generated audio. Generated speech, sound effects, music, transcripts, and conversation-agent configuration can affect public-facing content; review prompts, voice IDs, output format, language, and destination before publishing or sending. File output mode writes generated files to disk under the configured base path; restrict that path to an approved directory and avoid broad home, desktop, or shared folders in production. Use separate API keys or workspaces for test and production clients, monitor credit usage, and disable tools in clients that should not spend credits. Some operations may take longer than normal MCP tool timeouts; do not retry expensive generation calls blindly. | ✓ The `compound-beta` tools include code execution and live web search — code runs in Groq's sandboxed environment but web requests are made to external URLs. Text-to-speech and speech-to-text outputs are saved to `BASE_OUTPUT_PATH` (default: ~/Desktop) — ensure this path has appropriate access controls. | ✓ The MCP server exposes a `transcribe_audio` tool that reads the local file path supplied by the agent. Configure clients so Claude can only request audio files from approved directories; do not expose arbitrary private folders or shared drives. First use can download FunASR model weights and dependencies from upstream model hosts; review network policy, cache location, and disk usage before use in restricted environments. Long recordings and GPU transcription can consume significant CPU, GPU, memory, and disk cache resources. Require confirmation before transcribing meetings, calls, interviews, voice notes, customer audio, regulated recordings, or files containing other people. |
| Privacy notes | ✓ Audio content (speech, recordings) is transmitted to Deepgram's servers for transcription and synthesis — review Deepgram's privacy policy for data handling and retention. Your `DEEPGRAM_API_KEY` is passed as an environment variable — treat it as a secret. | ✓ The MCP client can expose ElevenLabs API keys, voice IDs, text prompts, voice descriptions, uploaded audio samples, generated audio paths, transcripts, diarized speaker labels, and conversational-agent settings. Uploaded audio and generated outputs may contain biometric voice characteristics, names, background sounds, private conversations, or copyrighted material. File, resource, and both output modes can retain generated audio locally, in MCP resources, in logs, or in chat transcripts depending on the client. Treat voice samples and transcripts as sensitive data, and delete generated files or cached resources when they are no longer needed. Review ElevenLabs account, retention, residency, and enterprise data-residency settings before using the server with regulated or customer data. | ✓ Text, images, and audio passed to Groq tools are sent to Groq's API for inference — do not pass sensitive or personally identifiable data. Your `GROQ_API_KEY` is passed as an environment variable — treat it as a secret and store it securely. | ✓ Audio recordings can contain voices, names, accents, speaker identity, background speech, locations, health details, financial details, customer data, credentials spoken aloud, or other sensitive personal information. The upstream MCP example performs local inference and does not require an API key, but MCP clients, model providers, logs, terminal output, transcripts, screenshots, and shared chats can still retain audio paths and transcription text. Generated transcripts, timestamps, and speaker labels may identify individuals or reveal confidential conversations. Model downloads and package installation can contact PyPI, ModelScope, Hugging Face, or other dependency hosts depending on the environment and model configuration. |
| Prerequisites | | | | |
| Install | | | | |
| Config | | | | |
| Citations | | | | |
| Claim | Unclaimed | Unclaimed | Unclaimed | Unclaimed |