inference.sh MCP Server
Hosted streamable-HTTP MCP server that exposes inference.sh platform tools for running apps, managing tasks, proxying external MCP connectors, and calling hundreds of hosted AI models from Claude Code, Cursor, and other MCP clients.
Open the source and read safety notes before installing.
Citation facts
Source-backed facts for citing this resource, derived directly from the registry.
- Canonical URL: https://heyclau.de/entry/mcp/inference-sh-mcp-server
- Source URLs: https://inference.sh/docs/connectors/mcp-server, https://github.com/inference-sh/grid, https://inference.sh
- Brand: inference.sh
- Brand domain: inference.sh
- Brand asset source: brandfetch
- Author: inference.sh
- Submitted by: kiannidev
- Claim status: unclaimed
- Last verified: 2026-06-22
Safety notes
- The hosted MCP server can execute inference.sh apps, platform tasks, and proxied connector tools that may create, update, or delete data in connected services such as GitHub, Linear, Slack, Notion, or databases.
- Bearer API keys grant account-scoped access to platform capabilities; rotate compromised keys immediately and avoid committing tokens to repositories or shared MCP config files.
- MCP proxy features can discover and call tools on remote MCP servers through inference.sh, which expands the runtime trust boundary beyond the primary client configuration.
- Tool calls may trigger paid inference, connector actions, storage writes, or outbound network requests according to the selected app or connector.
- Use least-privilege connector authorization and review each proxied server before enabling it in production agent workflows.
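The advice above about keeping Bearer keys out of repositories and shared MCP config files can be backed by a simple pre-commit style check. The following is a minimal sketch, not an official tool: it assumes keys begin with `inf_` (as in the config example on this page) followed by at least eight letters, digits, underscores, or hyphens, which may not match the real key format. The function names are hypothetical.

```python
import re
from pathlib import Path

# Assumption: keys start with "inf_" followed by letters, digits,
# underscores, or hyphens. The real key alphabet and length may differ.
KEY_PATTERN = re.compile(r"\binf_[A-Za-z0-9_-]{8,}")

# Placeholder used in the documentation example; not a real credential.
PLACEHOLDERS = {"inf_your_api_key"}


def find_suspected_keys(text: str) -> list[str]:
    """Return substrings that look like inference.sh API keys."""
    return [m for m in KEY_PATTERN.findall(text) if m not in PLACEHOLDERS]


def scan_files(paths: list[Path]) -> dict[str, list[str]]:
    """Map each file path to the suspected keys found in it."""
    findings: dict[str, list[str]] = {}
    for path in paths:
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = find_suspected_keys(text)
        if hits:
            findings[str(path)] = hits
    return findings
```

Running `scan_files` over the MCP config files that are about to be committed, and failing the commit when the result is non-empty, is one way to apply it. A pattern match is only a hint, so any hit still needs a manual look.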
Privacy notes
- Prompts, files, tool arguments, task metadata, connector payloads, and model outputs are processed by inference.sh and may transit connected third-party MCP services.
- OAuth-backed connectors can expose account, workspace, issue, message, or repository content to the agent through proxied tool results.
- API keys, connector tokens, and task logs should be treated as sensitive credentials and kept out of version control and public issue threads.
- Hosted execution may retain usage, billing, and operational telemetry according to inference.sh policies; review the platform privacy documentation before processing regulated data.
Prerequisites
- An inference.sh account and API key from `belt login` or the platform dashboard.
- An MCP client that supports streamable HTTP transport, such as Claude Code, Cursor, Cline, or Windsurf.
- Review of which inference.sh apps, connectors, and proxy tools the agent may call before enabling write-capable workflows.
- A billing and quota plan if the workflow will run image, video, LLM, search, or connector-backed tasks at scale.
Schema details
- Install type: cli
- Troubleshooting: No
- Scope: Source repo
- Estimated setup: 10 minutes
- Difficulty: intermediate
- Website: https://inference.sh
- Disclosure: Hosted commercial AI agent runtime with a public MCP endpoint at `https://api.inference.sh/mcp`. This entry documents the official inference.sh platform MCP server, not a community reimplementation.
Full copyable content
{
  "mcpServers": {
    "inference": {
      "type": "streamable-http",
      "url": "https://api.inference.sh/mcp",
      "headers": {
        "Authorization": "Bearer inf_your_api_key"
      }
    }
  }
}
About this resource
Content
inference.sh exposes a hosted Model Context Protocol server at
https://api.inference.sh/mcp. MCP clients such as Claude Code and Cursor can
connect over streamable HTTP with a Bearer API key and use platform tools to run
apps, manage tasks, and access inference.sh capabilities without running local
MCP processes.
The platform also supports MCP in the opposite direction: inference.sh can connect to external MCP servers and surface their tools through connector proxies. That makes it useful when an agent needs both hosted model/app execution and third-party service tools from one runtime.
Source Review
- https://inference.sh/docs/connectors/mcp-server
- https://api.inference.sh/.well-known/mcp-server-card
- https://inference.sh/blog/guides/mcp-on-inference-sh
- https://github.com/inference-sh/grid
Install
- Install the inference.sh CLI (`belt`) and authenticate:
  `belt login`
- Copy the generated API key (`inf_...`) and add the MCP server to your client settings. For Claude Code or Cursor, use streamable HTTP transport:
  {
    "mcpServers": {
      "inference": {
        "type": "streamable-http",
        "url": "https://api.inference.sh/mcp",
        "headers": {
          "Authorization": "Bearer inf_your_api_key"
        }
      }
    }
  }
- Restart the MCP client, list available tools, and confirm `initialize` and `tools/list` succeed before enabling autonomous workflows.
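To keep the key out of a hand-edited file, the client config from the Install steps can be generated at setup time from an environment variable. This is a minimal sketch under stated assumptions: `INFERENCE_API_KEY` is a hypothetical variable name chosen for illustration, and the `inf_` prefix check simply mirrors the key shape shown above rather than a documented validation rule.

```python
import json
import os


def build_mcp_config(api_key: str) -> dict:
    """Return the client config from the Install steps with the key filled in."""
    # Assumption: real keys begin with "inf_", as in the example config.
    if not api_key.startswith("inf_"):
        raise ValueError("expected an API key beginning with 'inf_'")
    return {
        "mcpServers": {
            "inference": {
                "type": "streamable-http",
                "url": "https://api.inference.sh/mcp",
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    # INFERENCE_API_KEY is a hypothetical name used only in this sketch.
    key = os.environ.get("INFERENCE_API_KEY", "")
    if key:
        print(json.dumps(build_mcp_config(key), indent=2))
    else:
        print("Set INFERENCE_API_KEY before generating the config.")
```

Writing the output to a client settings file that is excluded from version control keeps the token out of shared repositories, in line with the safety notes.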
Duplicate Check
Searched `content/mcp/`, open PRs, and the live registry for:
- `inference.sh`, `inference-sh`, `sh.inference`, `api.inference.sh`
- slug `inference-sh-mcp-server`
- docs URL `inference.sh/docs/connectors/mcp-server`
- MCP server card host `api.inference.sh`
No existing HeyClaude MCP entry documents this hosted inference.sh endpoint.
Runtime Notes
- Transport: streamable HTTP (JSON-RPC 2.0) at `POST https://api.inference.sh/mcp`
- Discovery card: `GET https://api.inference.sh/.well-known/mcp-server-card`
- Supported protocol versions include `2025-11-25`, `2025-06-18`, and `2025-03-26`
- Unauthenticated requests to the MCP endpoint return `401`, so clients must send a valid Bearer token
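The runtime notes above can be illustrated with a rough sketch of the first JSON-RPC message a client sends. It uses only the Python standard library and follows the general MCP `initialize` shape. The `clientInfo` values are placeholders, the helper names are hypothetical, and a complete session (the follow-up `notifications/initialized` message, session headers, SSE responses) is not covered here.

```python
import json
import urllib.error
import urllib.request

MCP_URL = "https://api.inference.sh/mcp"
# Versions listed in the runtime notes.
SUPPORTED_VERSIONS = ("2025-11-25", "2025-06-18", "2025-03-26")


def build_request(api_key: str, method: str, params: dict, request_id: int) -> urllib.request.Request:
    """Build a JSON-RPC 2.0 POST request for the MCP endpoint (nothing is sent)."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )


def build_initialize(api_key: str, version: str = SUPPORTED_VERSIONS[0]) -> urllib.request.Request:
    """Build the initialize request for one of the listed protocol versions."""
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported protocol version: {version}")
    params = {
        "protocolVersion": version,
        "capabilities": {},
        # Placeholder client identity for this sketch.
        "clientInfo": {"name": "example-client", "version": "0.0.1"},
    }
    return build_request(api_key, "initialize", params, request_id=1)


def check_auth(request: urllib.request.Request) -> str:
    """Send the request and classify the outcome. This performs a network call."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return f"ok ({response.status})"
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return "unauthorized: check the Bearer token"
        return f"http error {err.code}"
```

Calling `check_auth(build_initialize(key))` with a missing or invalid key should produce the `401` branch described above. Most MCP clients handle this handshake themselves, so the sketch is mainly useful for debugging connectivity outside a client.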
When To Use
- You want Claude or another MCP client to call inference.sh apps and hosted models without maintaining local MCP server processes.
- You need connector proxy access to services like GitHub, Linear, Slack, or Notion through inference.sh-managed authentication.
- You are evaluating a hosted runtime that combines app execution, task orchestration, and external MCP connectors behind one API key.
When Not To Use
- You require fully offline or air-gapped MCP execution.
- You cannot accept cloud processing of prompts, connector data, or generated media.
- You need a self-hosted open-source MCP server repository instead of a hosted platform endpoint.
How it compares
inference.sh MCP Server side by side with 3 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.
| Field | Hosted streamable-HTTP MCP server that exposes inference.sh platform tools for running apps, managing tasks, proxying external MCP connectors, and calling hundreds of hosted AI models from Claude Code, Cursor, and other MCP clients. | Built-in Streamable HTTP MCP server for Dagu that lets AI agents read workflow state, inspect DAG specs and logs, preview or apply workflow changes, and start, enqueue, retry, or stop DAG runs. | MCP transport bridge that converts between stdio, SSE, and Streamable HTTP so local MCP clients can reach remote servers, or remote clients can reach local stdio servers. | Built-in Streamable HTTP MCP server for Nuclear Music Player that lets Claude inspect available music-player domains, discover method signatures, describe data types, and control playback, queue, favorites, playlists, dashboard, and provider workflows. |
|---|---|---|---|---|
| Trust | | | | |
| Install risk | Review first | Review first | Review first | Review first |
| Notes | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ |
| Brand | — | | | |
| Category | mcp | mcp | mcp | mcp |
| Source | source-backed | source-backed | source-backed | source-backed |
| Author | inference.sh | dagucloud | Sergey Parfenyuk | Nuclear |
| Added | 2026-06-22 | 2026-06-06 | 2026-06-06 | 2026-06-06 |
| Platforms | Claude Code, Cursor, Claude Desktop | Claude Code, Claude Desktop | Claude Code, Claude Desktop | Claude Code, Claude Desktop |
| Source repo | — | — | — | — |
| Safety notes | ✓ The hosted MCP server can execute inference.sh apps, platform tasks, and proxied connector tools that may create, update, or delete data in connected services such as GitHub, Linear, Slack, Notion, or databases. Bearer API keys grant account-scoped access to platform capabilities; rotate compromised keys immediately and avoid committing tokens to repositories or shared MCP config files. MCP proxy features can discover and call tools on remote MCP servers through inference.sh, which expands the runtime trust boundary beyond the primary client configuration. Tool calls may trigger paid inference, connector actions, storage writes, or outbound network requests according to the selected app or connector. Use least-privilege connector authorization and review each proxied server before enabling it in production agent workflows. | ✓ Dagu is a workflow engine; MCP access can expose shell commands, Docker containers, Kubernetes Jobs, SSH commands, SQL queries, HTTP calls, agent harnesses, and other workflow steps. The MCP server registers `dagu_change` and `dagu_execute` as destructive-capable tools, so require preview/review before applying DAG changes or starting/stopping production runs. API keys can be scoped to the MCP surface; avoid reusing broad admin credentials for agent access. Workflow edits may change schedules, parameters, retries, secrets usage, queues, resource limits, notifications, and downstream infrastructure actions. Keep the Dagu server and MCP endpoint behind trusted network boundaries, TLS, and authentication for shared or remote deployments. | ✓ mcp-proxy can expose local stdio MCP servers as network services; keep the host bound to `127.0.0.1` unless remote access is intentional. Passing `--host=0.0.0.0`, permissive CORS, or named-server routes can make tools reachable by other systems on the network. Proxy configuration can include bearer tokens, OAuth client secrets, headers, server commands, environment variables, and working directories. The proxy can spawn arbitrary configured MCP server commands; only use trusted command strings and config files. Remote SSE or Streamable HTTP servers should be authenticated and trusted before forwarding client requests or tool outputs. | ✓ Nuclear MCP Server runs inside the local Nuclear desktop app and exposes a Streamable HTTP server on the localhost interface. The `call` tool can execute Nuclear API methods after discovery through `list_methods`, `method_details`, and `describe_type`. Available domains include Queue, Playback, Metadata, Favorites, Playlists, Dashboard, and Providers, so agents can change what is playing and modify local music-player state. Nuclear's plugin and provider system can retrieve streaming sources, metadata, playlists, and dashboard content from third-party services; use providers only where automated access is allowed. Keep the server bound to localhost, avoid exposing the MCP endpoint on a network interface, and require confirmation before letting an agent change playlists, favorites, queues, or provider settings. |
| Privacy notes | ✓ Prompts, files, tool arguments, task metadata, connector payloads, and model outputs are processed by inference.sh and may transit connected third-party MCP services. OAuth-backed connectors can expose account, workspace, issue, message, or repository content to the agent through proxied tool results. API keys, connector tokens, and task logs should be treated as sensitive credentials and kept out of version control and public issue threads. Hosted execution may retain usage, billing, and operational telemetry according to inference.sh policies; review the platform privacy documentation before processing regulated data. | ✓ DAG specs, run parameters, logs, documents, audit records, secrets references, API keys, environment variables, and workflow outputs may be exposed to the MCP client. Workflow logs can contain credentials, customer data, internal hostnames, database query results, command output, file paths, or incident context. Dagu stores state locally by default and can also run distributed workers; review where DAG files, logs, audit entries, and secrets are persisted. Any workflow state returned through the MCP client may be sent onward to the configured model provider. | ✓ MCP requests, responses, tool outputs, progress events, headers, OAuth tokens, API access tokens, and session identifiers may pass through the proxy process. Named server config files can contain command arguments, environment variables, and credentials for downstream MCP servers. Exposed network endpoints can reveal tool names, server status, and results to clients that can reach the proxy. Logs and troubleshooting output may include endpoint URLs, command names, headers, connection errors, or server names. Store config files outside shared repositories when they include tokens or private server details. | ✓ Tool calls and transcripts can include listening history, search terms, artists, albums, track titles, playlist names, favorites, provider choices, dashboard content, and local player settings. The MCP endpoint is local, but connected MCP clients, model providers, logs, screenshots, and shared chat transcripts can still retain music-library and listening-behavior data. Streaming and metadata providers may receive searches, track identifiers, IP addresses, user-agent metadata, or plugin-specific account context according to their own policies. The MCP server URL and port are local connection details; do not publish screenshots or logs that include private player state or provider credentials. |
| Claim | Unclaimed | Unclaimed | Unclaimed | Unclaimed |