FunASR MCP Server

MCP server example from FunASR that lets Claude transcribe local audio files with local speech recognition, automatic language handling, timestamps, and speaker labels when available.

by FunASR · submitted by oktofeesh1 · added 2026-06-06

Platforms: Claude Code, Claude Desktop

Harness: Claude Code, Claude Desktop


Install & copy

pip install funasr

Trust & readiness

Trust: Review first
Source: source-backed
Safety notes: Present
Reviewed: No


Citation facts

Source-backed facts for citing this resource, derived directly from the registry — also available as plain text for AI assistants.

Canonical URL: https://heyclau.de/entry/mcp/funasr-mcp-server
Source URLs: https://raw.githubusercontent.com/modelscope/FunASR/main/examples/mcp_server/README.md, https://github.com/modelscope/FunASR
Brand: FunASR
Brand domain: modelscope.github.io
Brand asset source: brandfetch
Safety notes: The MCP server exposes a `transcribe_audio` tool that reads the local file path supplied by the agent. Configure clients so Claude can only request audio files from approved directories; do not expose arbitrary private folders or shared drives. First use can download FunASR model weights and dependencies from upstream model hosts; review network policy, cache location, and disk usage before use in restricted environments. Long recordings and GPU transcription can consume significant CPU, GPU, memory, and disk cache resources. Require confirmation before transcribing meetings, calls, interviews, voice notes, customer audio, regulated recordings, or files containing other people.
Privacy notes: Audio recordings can contain voices, names, accents, speaker identity, background speech, locations, health details, financial details, customer data, credentials spoken aloud, or other sensitive personal information. The upstream MCP example performs local inference and does not require an API key, but MCP clients, model providers, logs, terminal output, transcripts, screenshots, and shared chats can still retain audio paths and transcription text. Generated transcripts, timestamps, and speaker labels may identify individuals or reveal confidential conversations. Model downloads and package installation can contact PyPI, ModelScope, Hugging Face, or other dependency hosts depending on the environment and model configuration.
Author: FunASR
Submitted by: oktofeesh1
Claim status: unclaimed
Last verified: 2026-06-06

Decision playbook

Review trust signals before you adopt

Signals are present but mixed. Use the checklist below to confirm the source and operational safety for your environment.


Source and provenance checks

Needs review

Confirm ownership and provenance before trusting install instructions.

Source link available (Required): Open the canonical repository and verify ownership. Status: Done.
Source provenance status (Required): Marked as source-backed. Status: Done.
Metadata reviewed: No reviewed flag detected in metadata. Status: Pending.

Safety and privacy checks

Complete

Validate risk disclosures before installation or API wiring.

Safety notes present (Required): Review the listed safety guidance before running commands. Status: Done.
Privacy notes present (Required): Review data handling notes before connecting accounts or secrets. Status: Done.
Trust level risk gate (Required): Trust level does not block evaluation. Status: Done.

Package and install checks

Needs review

Check package metadata and artifact integrity signals.

Install payload available: Install or copy payload is available for review. Status: Done.
Package verification flag: No package verification flag provided. Status: Pending.
Checksum metadata: No checksum provided for downloaded artifact. Status: Pending.

Compare-driven decision checks

Needs review

Use compare context to validate trade-offs before adoption.

Compare tray has multiple entries: Add at least one more entry to compare trust differences. Status: Pending.
Baseline comparison available: No baseline peer selected yet. Status: Pending.
Diverging trust signals identified: No major trust-signal divergence found. Status: Pending.

Setup at a glance

CLI install · 20 minutes
Copy-ready: paste the snippet to get started.
Install command: Provided
Config snippet: Provided
Copy snippet: Provided
Prerequisites: 5 to clear
Platforms: 2 listed
Install type: CLI install

Adoption plan

Balanced adoption plan

Current risk score 24/100. Use staged verification before broader rollout.

Risk 24

Pre-adoption checks

Validate source and review signals before any execution.

Confirm source provenance (Required): Source URL/provenance metadata is present. Status: Done.
Confirm metadata review state: No review metadata found; increase manual validation. Status: Pending.
Verify install payload: Install/config payload exists and can be inspected. Status: Done.

Security checks

Confirm safety, privacy, and package integrity signals.

Review safety notes (Required): Safety notes are present. Status: Done.
Review privacy notes (Required): Privacy notes are present. Status: Done.
Verify package integrity metadata: No package verification/checksum metadata. Status: Pending.

Rollout

Adopt in controlled steps based on the selected plan.

Run in isolated sandbox first (Required): Use a constrained sandbox and observe behavior across multiple tasks. Status: Pending.
Roll out gradually (Required): Roll out to a small cohort before wider usage. Status: Pending.
Set monitoring and fallback: Define rollback path and monitor errors after adoption. Status: Pending.

Evidence readiness

Evidence readiness matrix · balanced

Missing required evidence: Metadata review. Risk score 31.

Risk 31

Source provenance: Present. Source repository/provenance is listed. Required in this preset.
Metadata review: Missing. Review metadata is missing. Required in this preset.
Safety notes: Present. Safety notes are present. Required in this preset.
Privacy notes: Present. Privacy notes are present. Optional in this preset.
Package integrity: Missing. Package integrity metadata is missing. Optional in this preset.
Install payload: Present. Install payload is available. Required in this preset.

Required gaps: Metadata review

Decision timeline

Decision timeline · balanced

Blocking gaps: Check metadata review status. Risk 28.

Risk 28

Triage · Confirm source provenance (Required): Source/provenance metadata is available. Status: Done.
Triage · Check metadata review status (Required): Review metadata is missing. Status: Pending.
Verify · Review safety notes (Required): Safety notes are available. Status: Done.
Verify · Review privacy notes: Privacy notes are available. Status: Done.
Verify · Validate package integrity metadata: Package integrity metadata is missing. Status: Pending.
Rollout · Verify install payload and commands (Required): Install payload is available. Status: Done.

Blockers: Check metadata review status

Prerequisite readiness

5 prerequisites to line up before setup. Includes a review or approval gate.

0/5 ready

Review & approval: 3 · General: 2 · 20 minutes

Safety & privacy surface

5 safety and 4 privacy notes across 5 risk areas. Review closely: credentials & tokens, permissions & scopes, network access.

5 areas

Safety · Local files: The MCP server exposes a `transcribe_audio` tool that reads the local file path supplied by the agent.
Safety · Network access: Configure clients so Claude can only request audio files from approved directories; do not expose arbitrary private folders or shared drives.
Safety · Network access: First use can download FunASR model weights and dependencies from upstream model hosts; review network policy, cache location, and disk usage before use in restricted environments.
Safety · Local files: Long recordings and GPU transcription can consume significant CPU, GPU, memory, and disk cache resources.
Safety · Local files: Require confirmation before transcribing meetings, calls, interviews, voice notes, customer audio, regulated recordings, or files containing other people.
Privacy · Credentials & tokens: Audio recordings can contain voices, names, accents, speaker identity, background speech, locations, health details, financial details, customer data, credentials spoken aloud, or other sensitive personal information.
Privacy · Credentials & tokens: The upstream MCP example performs local inference and does not require an API key, but MCP clients, model providers, logs, terminal output, transcripts, screenshots, and shared chats can still retain audio paths and transcription text.
Privacy · Execution & processes: Generated transcripts, timestamps, and speaker labels may identify individuals or reveal confidential conversations.
Privacy · Permissions & scopes: Model downloads and package installation can contact PyPI, ModelScope, Hugging Face, or other dependency hosts depending on the environment and model configuration.

Disclosure: MIT-licensed FunASR repository with an MCP server example for local speech transcription. Verify model licenses, recording consent, and data-handling requirements before using it with real meeting, call, or customer audio.

Safety notes

The MCP server exposes a `transcribe_audio` tool that reads the local file path supplied by the agent.
Configure clients so Claude can only request audio files from approved directories; do not expose arbitrary private folders or shared drives.
First use can download FunASR model weights and dependencies from upstream model hosts; review network policy, cache location, and disk usage before use in restricted environments.
Long recordings and GPU transcription can consume significant CPU, GPU, memory, and disk cache resources.
Require confirmation before transcribing meetings, calls, interviews, voice notes, customer audio, regulated recordings, or files containing other people.

Privacy notes

Audio recordings can contain voices, names, accents, speaker identity, background speech, locations, health details, financial details, customer data, credentials spoken aloud, or other sensitive personal information.
The upstream MCP example performs local inference and does not require an API key, but MCP clients, model providers, logs, terminal output, transcripts, screenshots, and shared chats can still retain audio paths and transcription text.
Generated transcripts, timestamps, and speaker labels may identify individuals or reveal confidential conversations.
Model downloads and package installation can contact PyPI, ModelScope, Hugging Face, or other dependency hosts depending on the environment and model configuration.

Prerequisites

Python environment with FunASR installed from PyPI or a reviewed source checkout.
Local checkout or copy of `examples/mcp_server/funasr_mcp.py` from the FunASR repository.
Audio files in an approved location and format such as WAV, MP3, FLAC, M4A, or OGG.
Optional GPU, Apple silicon, or CPU device selection through `FUNASR_DEVICE`.
Review of model-download behavior, storage location, compute requirements, and organization policy for processing speech recordings locally.
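
A few of these prerequisites can be checked before wiring the server into a client. The following is a minimal sketch, not part of the upstream example: `preflight` is a hypothetical helper name, and it only reports whether the `funasr` package can be located and what `FUNASR_DEVICE` is currently set to. It does not import FunASR, download models, or verify the checkout.

```python
import importlib.util
import os
import sys


def preflight() -> dict:
    """Collect basic readiness facts without importing or running FunASR.

    Hypothetical helper for illustration; not part of the FunASR repository.
    """
    return {
        "python_version": ".".join(map(str, sys.version_info[:3])),
        # find_spec only checks that the package can be located on the
        # import path; it does not load the package or any model weights.
        "funasr_installed": importlib.util.find_spec("funasr") is not None,
        # None when the variable is unset in the current environment.
        "funasr_device": os.environ.get("FUNASR_DEVICE"),
    }


if __name__ == "__main__":
    for key, value in preflight().items():
        print(f"{key}: {value}")
```

The remaining prerequisites (approved audio locations, model-download policy, organization review) are policy decisions that a script like this cannot confirm.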

Schema details

Install type: cli
Troubleshooting: No

Source repository stats

Scope: Source repo

Collection metadata

Estimated setup: 20 minutes
Difficulty: intermediate


Full copyable content

{
  "mcpServers": {
    "funasr": {
      "command": "python",
      "args": ["<path-to-FunASR>/examples/mcp_server/funasr_mcp.py"],
      "env": {
        "FUNASR_DEVICE": "cpu"
      }
    }
  }
}

About this resource

Content

FunASR MCP Server is the MCP server example included with the FunASR speech recognition toolkit. It exposes a stdio MCP tool named transcribe_audio so Claude can transcribe approved local audio files through FunASR's local ASR models.

Use it when Claude needs to turn a meeting recording, interview, voice memo, podcast clip, or other approved local audio file into text without sending the audio to a hosted transcription API by default.

Source Review

These sources were reviewed on 2026-06-06. Prefer the live repository, MCP example README, PyPI package page, main README, license, setup metadata, MCP server script, model-selection guide, deployment matrix, and migration guide for current setup and model behavior.

Features

Expose one MCP tool, transcribe_audio, over stdio.
Accept a local audio_path for WAV, MP3, FLAC, M4A, OGG, and similar audio files supported by the FunASR stack.
Return transcription text and, when available from the model output, segment timestamps and speaker labels.
Run local inference with no service API key required by the MCP example.
Select CPU, CUDA, or Apple mps execution through FUNASR_DEVICE.
Use FunASR's speech recognition stack for multilingual ASR, VAD, punctuation, and diarization workflows.
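
MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests, and a client such as Claude Code or Claude Desktop normally builds that message for you. As a rough illustration of what a call to `transcribe_audio` could look like on the wire, the sketch below assembles such a request with the `audio_path` argument named in the feature list. The helper name, request id, and file path are hypothetical, and any additional arguments the upstream script may accept are not shown because this listing does not describe them.

```python
import json


def build_transcribe_request(audio_path: str, request_id: int = 1) -> str:
    """Return a JSON-RPC 2.0 `tools/call` message for the transcribe_audio tool.

    Hypothetical helper for illustration; real MCP clients build this message.
    """
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "transcribe_audio",
            # `audio_path` is the argument named in the feature list.
            "arguments": {"audio_path": audio_path},
        },
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Hypothetical path, used only for illustration.
    print(build_transcribe_request("/approved/audio/meeting.wav"))
```

The shape of the response (text, optional timestamps, optional speaker labels) depends on the model output, so it is deliberately left out here.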

Installation

Install FunASR in a Python environment:

pip install funasr

Then configure your MCP client to launch the example server script from a reviewed FunASR checkout:

{
  "mcpServers": {
    "funasr": {
      "command": "python",
      "args": ["<path-to-FunASR>/examples/mcp_server/funasr_mcp.py"],
      "env": {
        "FUNASR_DEVICE": "cpu"
      }
    }
  }
}

Use cuda or mps only on machines where those accelerators are approved and available.
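
If you prefer to generate the client configuration rather than edit it by hand, a small script can fill in the checkout path and device. This is a sketch under stated assumptions: `build_mcp_config` is a hypothetical helper, `/opt/src/FunASR` is a placeholder checkout location, and the accepted device names are limited to the three mentioned in this listing (`cpu`, `cuda`, `mps`); the upstream script may accept other values.

```python
import json
from pathlib import Path

# Device names mentioned in this listing; the upstream script may accept others.
KNOWN_DEVICES = {"cpu", "cuda", "mps"}


def build_mcp_config(funasr_checkout: str, device: str = "cpu") -> dict:
    """Build the mcpServers entry for a given FunASR checkout directory.

    Hypothetical helper for illustration; not part of the FunASR repository.
    """
    if device not in KNOWN_DEVICES:
        raise ValueError(f"unexpected FUNASR_DEVICE value: {device!r}")
    script = Path(funasr_checkout) / "examples" / "mcp_server" / "funasr_mcp.py"
    return {
        "mcpServers": {
            "funasr": {
                "command": "python",
                "args": [script.as_posix()],
                "env": {"FUNASR_DEVICE": device},
            }
        }
    }


if __name__ == "__main__":
    # "/opt/src/FunASR" is a hypothetical checkout location.
    print(json.dumps(build_mcp_config("/opt/src/FunASR"), indent=2))
```

Rejecting unknown device names early keeps a typo from reaching the server, at the cost of needing an update if upstream adds new device options.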

Use Cases

Transcribe a meeting recording saved in an approved local folder.
Convert a voice memo into text before summarizing it.
Extract timestamped segments from an interview or podcast clip.
Compare local ASR output against a hosted transcription result.
Draft meeting notes while keeping the raw audio on the local machine.
Prototype speech-to-text workflows before deploying a dedicated FunASR API server.

Safety and Privacy

FunASR MCP Server reads local audio paths supplied through the MCP client. Keep the configured script and working directory scoped to approved recordings, and require explicit approval before transcribing files from Downloads, shared drives, customer folders, or private meeting archives.

Treat audio files, file paths, transcripts, timestamps, speaker labels, terminal logs, and MCP conversation history as sensitive. Local inference avoids a hosted transcription API by default, but package installation, model downloads, logs, and connected AI clients can still expose metadata or transcript content.
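
One way to enforce the approved-directory guidance is to check each requested path before it reaches the transcription tool. The sketch below is an assumption about how such a guard could look, not something the upstream example is known to include: `is_approved_audio_path` and the directory names are hypothetical, the extension list is taken from this listing, and the code needs Python 3.9 or newer for `Path.is_relative_to`.

```python
from pathlib import Path

# Extensions named in this listing; the FunASR stack may support more.
AUDIO_SUFFIXES = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}


def is_approved_audio_path(audio_path: str, approved_dirs: list[str]) -> bool:
    """Return True only if the path resolves inside an approved directory
    and has one of the listed audio extensions.

    Hypothetical guard for illustration; not part of the FunASR repository.
    """
    resolved = Path(audio_path).expanduser().resolve()
    if resolved.suffix.lower() not in AUDIO_SUFFIXES:
        return False
    for approved in approved_dirs:
        root = Path(approved).expanduser().resolve()
        # resolve() collapses ".." segments and follows symlinks, so a
        # request like "approved/../private/a.wav" is checked against
        # where it really points, not how it was written.
        if resolved.is_relative_to(root):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical directories, used only for illustration.
    print(is_approved_audio_path("/srv/audio/approved/a.wav", ["/srv/audio/approved"]))
```

A guard like this narrows which files can be read, but it does not replace the confirmation step for recordings that involve other people.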

Duplicate Check

No modelscope/FunASR, FunASR MCP, FunASR MCP Server, funasr_mcp.py, or matching source URL entry was found in content/mcp or README.md. Existing audio, media conversion, and local AI entries do not cover FunASR's MCP transcription example.

#speech-to-text #transcription #audio #local-ai #python


Add this badge to your README

Show that FunASR MCP Server is listed on HeyClaude. Paste this Markdown into your README — it renders the badge and links back to this page.

[![Listed on HeyClaude](https://heyclau.de/badge/mcp/funasr-mcp-server.svg)](https://heyclau.de/entry/mcp/funasr-mcp-server)

How it compares

FunASR MCP Server side by side with 3 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.

1 trust signal differs across this comparison (Submitter).

Field	FunASR MCP Server	Deepgram MCP Server for Claude	ElevenLabs MCP Server	Groq MCP Server for Claude
Description	MCP server example from FunASR that lets Claude transcribe local audio files with local speech recognition, automatic language handling, timestamps, and speaker labels when available.	Transcribe audio, synthesize speech, and run audio intelligence directly from Claude with the official Deepgram MCP server; dynamic tool discovery fetches new capabilities from Deepgram's API at runtime without requiring package upgrades.	Official ElevenLabs MCP server for generating speech, designing voices, cloning voices, transcribing audio, creating sound effects, and working with conversational audio agents through the ElevenLabs API.	Query Groq's ultra-fast inference models from Claude (vision, text-to-speech, speech-to-text, batch processing, and agentic compound-beta tools with web search and code execution) using the official Groq Model Context Protocol server.
Trust
Review status	Not reviewed	Not reviewed	Not reviewed	Not reviewed
Package trust	Package not verified	Package not verified	Package not verified	Package not verified
Source provenance	Source-backed	Source-backed	Source-backed	Source-backed
Submitter (differs)	oktofeesh1	—	oktofeesh1	—
Install risk	Review first	Review first	Review first	Review first
Notes	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓
Brand	FunASR	—	ElevenLabs MCP Server	—
Category	mcp	mcp	mcp	mcp
Source	Source-backed	Source-backed	Source-backed	Source-backed
Author	FunASR	Deepgram	ElevenLabs	Groq
Added	2026-06-06	2026-06-18	2026-06-06	2026-06-18
Platforms	Claude Code Claude Desktop	Claude Code Claude Desktop	Claude Code Claude Desktop	Claude Code Claude Desktop
Harness	Claude Code Claude Desktop	Claude Code Claude Desktop	Claude Code Claude Desktop	Claude Code Claude Desktop
Source repo	—	—	—	—
Safety notes	✓ The MCP server exposes a `transcribe_audio` tool that reads the local file path supplied by the agent. Configure clients so Claude can only request audio files from approved directories; do not expose arbitrary private folders or shared drives. First use can download FunASR model weights and dependencies from upstream model hosts; review network policy, cache location, and disk usage before use in restricted environments. Long recordings and GPU transcription can consume significant CPU, GPU, memory, and disk cache resources. Require confirmation before transcribing meetings, calls, interviews, voice notes, customer audio, regulated recordings, or files containing other people.	✓ Audio files and transcription payloads are sent to Deepgram's cloud API for processing; do not transcribe audio containing highly sensitive PII without reviewing Deepgram's data retention policies. Text-to-speech outputs are returned as audio data via the API; no files are written to disk unless you explicitly save them.	✓ ElevenLabs MCP Server can call paid ElevenLabs API endpoints; text-to-speech, voice design, voice cloning, audio isolation, transcription, sound generation, music, and agent workflows can consume account credits. Voice cloning and voice conversion can create realistic synthetic speech, so require documented consent and review before processing a person's voice or publishing generated audio. Generated speech, sound effects, music, transcripts, and conversation-agent configuration can affect public-facing content; review prompts, voice IDs, output format, language, and destination before publishing or sending. File output mode writes generated files to disk under the configured base path; restrict that path to an approved directory and avoid broad home, desktop, or shared folders in production. Use separate API keys or workspaces for test and production clients, monitor credit usage, and disable tools in clients that should not spend credits. Some operations may take longer than normal MCP tool timeouts; do not retry expensive generation calls blindly.	✓ The `compound-beta` tools include code execution and live web search; code runs in Groq's sandboxed environment but web requests are made to external URLs. Text-to-speech and speech-to-text outputs are saved to `BASE_OUTPUT_PATH` (default: ~/Desktop); ensure this path has appropriate access controls.
Privacy notes	✓ Audio recordings can contain voices, names, accents, speaker identity, background speech, locations, health details, financial details, customer data, credentials spoken aloud, or other sensitive personal information. The upstream MCP example performs local inference and does not require an API key, but MCP clients, model providers, logs, terminal output, transcripts, screenshots, and shared chats can still retain audio paths and transcription text. Generated transcripts, timestamps, and speaker labels may identify individuals or reveal confidential conversations. Model downloads and package installation can contact PyPI, ModelScope, Hugging Face, or other dependency hosts depending on the environment and model configuration.	✓ Audio content (speech, recordings) is transmitted to Deepgram's servers for transcription and synthesis; review Deepgram's privacy policy for data handling and retention. Your `DEEPGRAM_API_KEY` is passed as an environment variable; treat it as a secret.	✓ The MCP client can expose ElevenLabs API keys, voice IDs, text prompts, voice descriptions, uploaded audio samples, generated audio paths, transcripts, diarized speaker labels, and conversational-agent settings. Uploaded audio and generated outputs may contain biometric voice characteristics, names, background sounds, private conversations, or copyrighted material. File, resource, and both output modes can retain generated audio locally, in MCP resources, in logs, or in chat transcripts depending on the client. Treat voice samples and transcripts as sensitive data, and delete generated files or cached resources when they are no longer needed. Review ElevenLabs account, retention, residency, and enterprise data-residency settings before using the server with regulated or customer data.	✓ Text, images, and audio passed to Groq tools are sent to Groq's API for inference; do not pass sensitive or personally identifiable data. Your `GROQ_API_KEY` is a secret; store it only in your MCP client configuration or a protected environment file, not in shell history or command-line arguments.
Prerequisites	Python environment with FunASR installed from PyPI or a reviewed source checkout. Local checkout or copy of `examples/mcp_server/funasr_mcp.py` from the FunASR repository. Audio files in an approved location and format such as WAV, MP3, FLAC, M4A, or OGG. Optional GPU, Apple silicon, or CPU device selection through `FUNASR_DEVICE`.	A Deepgram API key (free tier at console.deepgram.com). Python with `pip` available: `pip install deepgram-mcp` to install the package. An MCP client such as Claude Code or Claude Desktop.	Python 3.11 or newer with `uvx` available. An ElevenLabs API key for the account and workspace you intend Claude to use. Review of ElevenLabs pricing, credits, voice-cloning policy, content rules, and data handling before enabling tools that generate or process audio. An approved output directory when using file-based generated audio output.	A Groq API key (free at console.groq.com). Python with `uv` installed: `pip install uv` or `brew install uv`. An MCP client such as Claude Code or Claude Desktop.
Install	`pip install funasr`	`claude mcp add deepgram -e DEEPGRAM_API_KEY=your-api-key -- deepgram-mcp`	`uvx elevenlabs-mcp`	`claude mcp add groq -- uvx groq-mcp`
Config	`Manual-only setup: pip install funasr`	`{ "mcpServers": { "deepgram": { "command": "deepgram-mcp", "env": { "DEEPGRAM_API_KEY": "your-api-key" } } } }`	`Manual-only setup: claude mcp add elevenlabs --env ELEVENLABS_API_KEY=YOUR_ELEVENLABS_API_KEY -- uvx elevenlabs-mcp`	`{ "mcpServers": { "groq": { "command": "uvx", "args": ["groq-mcp"], "env": { "GROQ_API_KEY": "your-api-key", "BASE_OUTPUT_PATH": "/path/to/output/directory" } } } }`
Citations	Source repository: github.com (2026-07-21T07:17:01+00:00); Documentation: raw.githubusercontent.com; Submitted by oktofeesh1, 2026-06-06	Source repository: github.com (2026-07-21T07:17:01+00:00); Documentation: github.com	Source repository: github.com (2026-07-21T07:17:01+00:00); Documentation: raw.githubusercontent.com; Submitted by oktofeesh1, 2026-06-06	Source repository: github.com (2026-07-21T07:17:01+00:00); Documentation: github.com
Claim	Unclaimed	Unclaimed	Unclaimed	Unclaimed


Related guides

Source-backed guides for putting this to work.

Build Claude MCP Servers

Master MCP server development from scratch.

Added 8mo ago

guides · Review first · Source-backed

Safety ✓ Privacy ✓ · by JSONbored



Related resources

Deepgram MCP Server for Claude

ElevenLabs MCP Server

Groq MCP Server for Claude

ChunkHound MCP Server
