
inference.sh MCP Server

Hosted streamable-HTTP MCP server that exposes inference.sh platform tools for running apps, managing tasks, proxying external MCP connectors, and calling hundreds of hosted AI models from Claude Code, Cursor, and other MCP clients.

by inference.sh · added 2026-06-22

Harness: Claude Code, Cursor, Claude Desktop

Install

Run `belt login` to create an inference.sh API key, then add the streamable HTTP MCP endpoint `https://api.inference.sh/mcp` with `Authorization: Bearer inf_<your_api_key>` in your MCP client settings.

Readiness

Trust: Review first
Source: source-backed
Safety notes: Present
Reviewed: Yes

Review first: open the source and read the safety notes before installing.

Citation facts

Source-backed facts for citing this resource, derived directly from the registry.

Canonical URL: https://heyclau.de/entry/mcp/inference-sh-mcp-server
Source URLs: https://inference.sh/docs/connectors/mcp-server, https://github.com/inference-sh/grid, https://inference.sh
Brand: inference.sh
Brand domain: inference.sh
Brand asset source: brandfetch
Author: inference.sh
Submitted by: kiannidev
Claim status: unclaimed
Last verified: 2026-06-22

Safety notes

The hosted MCP server can execute inference.sh apps, platform tasks, and proxied connector tools that may create, update, or delete data in connected services such as GitHub, Linear, Slack, Notion, or databases.
Bearer API keys grant account-scoped access to platform capabilities; rotate compromised keys immediately and avoid committing tokens to repositories or shared MCP config files.
MCP proxy features can discover and call tools on remote MCP servers through inference.sh, which expands the runtime trust boundary beyond the primary client configuration.
Tool calls may trigger paid inference, connector actions, storage writes, or outbound network requests according to the selected app or connector.
Use least-privilege connector authorization and review each proxied server before enabling it in production agent workflows.
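
One way to act on the advice about keeping tokens out of repositories and shared config files is to scan a config before it is committed. The sketch below is an illustration only: this entry documents the `inf_` prefix, but the rest of the key format is an assumption, and the function and constant names are hypothetical.

```python
import re

# Assumption: a real key is "inf_" followed by at least 8 URL-safe characters.
# Only the "inf_" prefix comes from this entry; the rest is a guess.
TOKEN_PATTERN = re.compile(r"\binf_[A-Za-z0-9_-]{8,}")

# Placeholder used in the example config; it is safe to commit.
PLACEHOLDERS = {"inf_your_api_key"}


def find_hardcoded_keys(text):
    """Return strings in `text` that look like real inference.sh API keys."""
    return [match for match in TOKEN_PATTERN.findall(text) if match not in PLACEHOLDERS]
```

Calling `find_hardcoded_keys` on the contents of an MCP config file before a commit returns an empty list when only the placeholder is present, and a list of suspicious strings otherwise.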

Privacy notes

Prompts, files, tool arguments, task metadata, connector payloads, and model outputs are processed by inference.sh and may transit connected third-party MCP services.
OAuth-backed connectors can expose account, workspace, issue, message, or repository content to the agent through proxied tool results.
API keys, connector tokens, and task logs should be treated as sensitive credentials and kept out of version control and public issue threads.
Hosted execution may retain usage, billing, and operational telemetry according to inference.sh policies; review the platform privacy documentation before processing regulated data.

Prerequisites

An inference.sh account and API key from `belt login` or the platform dashboard.
An MCP client that supports streamable HTTP transport, such as Claude Code, Cursor, Cline, or Windsurf.
Review of which inference.sh apps, connectors, and proxy tools the agent may call before enabling write-capable workflows.
A billing and quota plan if the workflow will run image, video, LLM, search, or connector-backed tasks at scale.
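
The tool review mentioned above could be sketched as an allowlist check over the entries a `tools/list` call returns. This is a minimal illustration, not part of the platform: the function name and the tool names in the usage note are hypothetical, and whether inference.sh populates the optional MCP `annotations.readOnlyHint` field is an assumption.

```python
def review_tools(tools, allowlist):
    """Split tools/list entries into allowed names and names needing review."""
    allowed, needs_review = [], []
    for tool in tools:
        name = tool["name"]
        if name in allowlist:
            allowed.append(name)
            continue
        annotations = tool.get("annotations") or {}
        # readOnlyHint is advisory in the MCP spec, so it is used as a label
        # for the reviewer, never as a reason to auto-approve a tool.
        label = "read-only" if annotations.get("readOnlyHint") else "write-capable"
        needs_review.append((name, label))
    return allowed, needs_review
```

For example, with an allowlist of `{"run_app"}` (a made-up name), every other tool lands in the review list together with a hint about whether it may write data.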

Schema details

Install type: cli
Troubleshooting: No

Collection metadata

Estimated setup: 10 minutes
Difficulty: intermediate

Tool listing metadata

Website: https://inference.sh
Disclosure: Hosted commercial AI agent runtime with a public MCP endpoint at `https://api.inference.sh/mcp`. This entry documents the official inference.sh platform MCP server, not a community reimplementation.

Full copyable content

{
  "mcpServers": {
    "inference": {
      "type": "streamable-http",
      "url": "https://api.inference.sh/mcp",
      "headers": {
        "Authorization": "Bearer inf_your_api_key"
      }
    }
  }
}

About this resource

Content

inference.sh exposes a hosted Model Context Protocol server at https://api.inference.sh/mcp. MCP clients such as Claude Code and Cursor can connect over streamable HTTP with a Bearer API key and use platform tools to run apps, manage tasks, and access inference.sh capabilities without running local MCP processes.

The platform also supports MCP in the opposite direction: inference.sh can connect to external MCP servers and surface their tools through connector proxies. That makes it useful when an agent needs both hosted model/app execution and third-party service tools from one runtime.

Source Review

Install

Install the inference.sh CLI (belt) and authenticate:

belt login

Copy the generated API key (inf_...) and add the MCP server to your client settings.
For Claude Code or Cursor, use streamable HTTP transport:

{
  "mcpServers": {
    "inference": {
      "type": "streamable-http",
      "url": "https://api.inference.sh/mcp",
      "headers": {
        "Authorization": "Bearer inf_your_api_key"
      }
    }
  }
}

Restart the MCP client, list the available tools, and confirm that the `initialize` and `tools/list` requests succeed before enabling autonomous workflows.
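
The same check can be run outside a client. The sketch below sends `initialize`, the `notifications/initialized` notification, and `tools/list` as JSON-RPC 2.0 messages using only the Python standard library. The `Accept` and `Mcp-Session-Id` headers follow the MCP streamable HTTP transport; whether this server issues a session id, and whether it answers with plain JSON or an event stream, are assumptions, so the raw response text is printed rather than parsed. The function names and the `smoke-test` client name are hypothetical.

```python
import json
import urllib.request

MCP_URL = "https://api.inference.sh/mcp"  # endpoint documented in this entry
PROTOCOL_VERSION = "2025-06-18"  # one of the versions listed under Runtime Notes


def build_request(method, params=None, request_id=1):
    """Return a JSON-RPC 2.0 request body as a dict."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return body


def build_headers(api_key, session_id=None):
    """Headers for a streamable HTTP POST; the session header is optional."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # The streamable HTTP transport expects both types to be accepted.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    return headers


def post(body, headers, url=MCP_URL):
    """POST one JSON-RPC message; return (status, session id or None, raw text)."""
    request = urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        session_id = response.headers.get("Mcp-Session-Id")
        return response.status, session_id, response.read().decode("utf-8")


def smoke_test(api_key):
    """Run initialize, the initialized notification, then tools/list."""
    init = build_request("initialize", {
        "protocolVersion": PROTOCOL_VERSION,
        "capabilities": {},
        "clientInfo": {"name": "smoke-test", "version": "0.0.1"},  # hypothetical
    })
    status, session_id, text = post(init, build_headers(api_key))
    print("initialize:", status, text[:200])

    headers = build_headers(api_key, session_id)
    # Notifications carry no id and expect no JSON-RPC response.
    post({"jsonrpc": "2.0", "method": "notifications/initialized"}, headers)

    status, _, text = post(build_request("tools/list", request_id=2), headers)
    print("tools/list:", status, text[:200])
```

Call `smoke_test` with a key read from the environment rather than a literal string. An invalid or missing key makes `urllib` raise `HTTPError`, which matches the 401 behavior described under Runtime Notes.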

Duplicate Check

Searched content/mcp/, open PRs, and the live registry for:

inference.sh, inference-sh, sh.inference, api.inference.sh
slug inference-sh-mcp-server
docs URL inference.sh/docs/connectors/mcp-server
MCP server card host api.inference.sh

No existing HeyClaude MCP entry documents this hosted inference.sh endpoint.

Runtime Notes

Transport: streamable HTTP (JSON-RPC 2.0) at POST https://api.inference.sh/mcp
Discovery card: GET https://api.inference.sh/.well-known/mcp-server-card
Supported protocol versions include 2025-11-25, 2025-06-18, and 2025-03-26
Unauthenticated requests to the MCP endpoint return 401, so clients must send a valid Bearer token
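
A small script can verify these notes from the command line. The sketch below picks the newest protocol version shared with the server, fetches the discovery card, and probes the endpoint without a token. It assumes the discovery card is served as JSON and that an unauthenticated `ping` is rejected the same way as any other request; the function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

CARD_URL = "https://api.inference.sh/.well-known/mcp-server-card"
MCP_URL = "https://api.inference.sh/mcp"
SERVER_VERSIONS = ("2025-11-25", "2025-06-18", "2025-03-26")  # from Runtime Notes


def pick_protocol_version(client_versions, server_versions=SERVER_VERSIONS):
    """Return the newest version both sides support, or None."""
    common = set(client_versions) & set(server_versions)
    # Versions are ISO dates, so string order matches chronological order.
    return max(common) if common else None


def fetch_server_card(url=CARD_URL):
    """GET the discovery card; assumes the body is JSON."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def probe_without_token(url=MCP_URL):
    """POST without an Authorization header and return the HTTP status code."""
    body = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "ping"}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # this entry says unauthenticated requests get 401
```

If `probe_without_token()` returns anything other than 401, the endpoint behavior has changed since this entry was verified and the configuration should be reviewed again.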

When To Use

You want Claude or another MCP client to call inference.sh apps and hosted models without maintaining local MCP server processes.
You need connector proxy access to services like GitHub, Linear, Slack, or Notion through inference.sh-managed authentication.
You are evaluating a hosted runtime that combines app execution, task orchestration, and external MCP connectors behind one API key.

When Not To Use

You require fully offline or air-gapped MCP execution.
You cannot accept cloud processing of prompts, connector data, or generated media.
You need a self-hosted open-source MCP server repository instead of a hosted platform endpoint.

#inference #hosted-mcp #streamable-http #connectors #ai-runtime

Add this badge to your README

Show that inference.sh MCP Server is listed on HeyClaude. Paste this Markdown into your README — it renders the badge and links back to this page.

[![Listed on HeyClaude](https://heyclau.de/badge/mcp/inference-sh-mcp-server.svg)](https://heyclau.de/entry/mcp/inference-sh-mcp-server)

How it compares

inference.sh MCP Server side by side with 3 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.

Field	inference.sh MCP Server	Dagu MCP Server	mcp-proxy Transport Bridge	Nuclear MCP Server
Description	Hosted streamable-HTTP MCP server that exposes inference.sh platform tools for running apps, managing tasks, proxying external MCP connectors, and calling hundreds of hosted AI models from Claude Code, Cursor, and other MCP clients.	Built-in Streamable HTTP MCP server for Dagu that lets AI agents read workflow state, inspect DAG specs and logs, preview or apply workflow changes, and start, enqueue, retry, or stop DAG runs.	MCP transport bridge that converts between stdio, SSE, and Streamable HTTP so local MCP clients can reach remote servers, or remote clients can reach local stdio servers.	Built-in Streamable HTTP MCP server for Nuclear Music Player that lets Claude inspect available music-player domains, discover method signatures, describe data types, and control playback, queue, favorites, playlists, dashboard, and provider workflows.
Trust
Install risk	Review first	Review first	Review first	Review first
Notes	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓	Safety ✓ Privacy ✓
Brand	inference.sh	Dagu	—	Nuclear
Category	mcp	mcp	mcp	mcp
Source	source-backed	source-backed	source-backed	source-backed
Author	inference.sh	dagucloud	Sergey Parfenyuk	Nuclear
Added	2026-06-22	2026-06-06	2026-06-06	2026-06-06
Platforms	Claude Code, Cursor, Claude Desktop	Claude Code, Claude Desktop	Claude Code, Claude Desktop	Claude Code, Claude Desktop
Source repo	—	—	—	—
Safety notes	✓ The hosted MCP server can execute inference.sh apps, platform tasks, and proxied connector tools that may create, update, or delete data in connected services such as GitHub, Linear, Slack, Notion, or databases. Bearer API keys grant account-scoped access to platform capabilities; rotate compromised keys immediately and avoid committing tokens to repositories or shared MCP config files. MCP proxy features can discover and call tools on remote MCP servers through inference.sh, which expands the runtime trust boundary beyond the primary client configuration. Tool calls may trigger paid inference, connector actions, storage writes, or outbound network requests according to the selected app or connector. Use least-privilege connector authorization and review each proxied server before enabling it in production agent workflows.	✓ Dagu is a workflow engine; MCP access can expose shell commands, Docker containers, Kubernetes Jobs, SSH commands, SQL queries, HTTP calls, agent harnesses, and other workflow steps. The MCP server registers `dagu_change` and `dagu_execute` as destructive-capable tools, so require preview/review before applying DAG changes or starting/stopping production runs. API keys can be scoped to the MCP surface; avoid reusing broad admin credentials for agent access. Workflow edits may change schedules, parameters, retries, secrets usage, queues, resource limits, notifications, and downstream infrastructure actions. Keep the Dagu server and MCP endpoint behind trusted network boundaries, TLS, and authentication for shared or remote deployments.	✓ mcp-proxy can expose local stdio MCP servers as network services; keep the host bound to `127.0.0.1` unless remote access is intentional. Passing `--host=0.0.0.0`, permissive CORS, or named-server routes can make tools reachable by other systems on the network. Proxy configuration can include bearer tokens, OAuth client secrets, headers, server commands, environment variables, and working directories. The proxy can spawn arbitrary configured MCP server commands; only use trusted command strings and config files. Remote SSE or Streamable HTTP servers should be authenticated and trusted before forwarding client requests or tool outputs.	✓ Nuclear MCP Server runs inside the local Nuclear desktop app and exposes a Streamable HTTP server on the localhost interface. The `call` tool can execute Nuclear API methods after discovery through `list_methods`, `method_details`, and `describe_type`. Available domains include Queue, Playback, Metadata, Favorites, Playlists, Dashboard, and Providers, so agents can change what is playing and modify local music-player state. Nuclear's plugin and provider system can retrieve streaming sources, metadata, playlists, and dashboard content from third-party services; use providers only where automated access is allowed. Keep the server bound to localhost, avoid exposing the MCP endpoint on a network interface, and require confirmation before letting an agent change playlists, favorites, queues, or provider settings.
Privacy notes	✓ Prompts, files, tool arguments, task metadata, connector payloads, and model outputs are processed by inference.sh and may transit connected third-party MCP services. OAuth-backed connectors can expose account, workspace, issue, message, or repository content to the agent through proxied tool results. API keys, connector tokens, and task logs should be treated as sensitive credentials and kept out of version control and public issue threads. Hosted execution may retain usage, billing, and operational telemetry according to inference.sh policies; review the platform privacy documentation before processing regulated data.	✓ DAG specs, run parameters, logs, documents, audit records, secrets references, API keys, environment variables, and workflow outputs may be exposed to the MCP client. Workflow logs can contain credentials, customer data, internal hostnames, database query results, command output, file paths, or incident context. Dagu stores state locally by default and can also run distributed workers; review where DAG files, logs, audit entries, and secrets are persisted. Any workflow state returned through the MCP client may be sent onward to the configured model provider.	✓ MCP requests, responses, tool outputs, progress events, headers, OAuth tokens, API access tokens, and session identifiers may pass through the proxy process. Named server config files can contain command arguments, environment variables, and credentials for downstream MCP servers. Exposed network endpoints can reveal tool names, server status, and results to clients that can reach the proxy. Logs and troubleshooting output may include endpoint URLs, command names, headers, connection errors, or server names. Store config files outside shared repositories when they include tokens or private server details.	✓ Tool calls and transcripts can include listening history, search terms, artists, albums, track titles, playlist names, favorites, provider choices, dashboard content, and local player settings. The MCP endpoint is local, but connected MCP clients, model providers, logs, screenshots, and shared chat transcripts can still retain music-library and listening-behavior data. Streaming and metadata providers may receive searches, track identifiers, IP addresses, user-agent metadata, or plugin-specific account context according to their own policies. The MCP server URL and port are local connection details; do not publish screenshots or logs that include private player state or provider credentials.
Prerequisites	An inference.sh account and API key from `belt login` or the platform dashboard. An MCP client that supports streamable HTTP transport, such as Claude Code, Cursor, Cline, or Windsurf. Review of which inference.sh apps, connectors, and proxy tools the agent may call before enabling write-capable workflows. A billing and quota plan if the workflow will run image, video, LLM, search, or connector-backed tasks at scale.	Dagu installed from Homebrew, GitHub Releases, npm, Docker/GHCR, Helm, or another upstream-supported installation path. A running Dagu HTTP server with the built-in MCP endpoint enabled through the normal server path. MCP client that supports Streamable HTTP server configuration. API key with MCP surface access when Dagu authentication is enabled.	Python 3.10 or newer. uv, pipx, or Docker for installation. A known MCP endpoint or local stdio MCP server to bridge. Review of required headers, OAuth client credentials, CORS origins, host, port, and named-server configuration.	Nuclear Music Player installed from the project's releases or platform packages. MCP server enabled in Nuclear under Settings > Integrations. MCP client support for Streamable HTTP or remote URL based server configuration. Review of the actual local URL shown by Nuclear because the server starts on ports 8800 through 8809.
Install	Run `belt login` to create an inference.sh API key, then add the streamable HTTP MCP endpoint `https://api.inference.sh/mcp` with `Authorization: Bearer inf_<your_api_key>` in your MCP client settings.	`brew install dagu`	`uv tool install mcp-proxy`	`claude mcp add nuclear --transport http <copy-url-from-nuclear-settings>`
Config	`{ "mcpServers": { "inference": { "type": "streamable-http", "url": "https://api.inference.sh/mcp", "headers": { "Authorization": "Bearer inf_your_api_key" } } } }`	`{ "mcpServers": { "dagu": { "url": "LOCAL_DAGU_MCP_URL", "headers": { "Authorization": "Bearer DAGU_MCP_API_KEY" } } } }`	`Manual-only setup: mcp-proxy https://example.com/sse`	`Manual-only setup: claude mcp add nuclear --transport http <copy-url-from-nuclear-settings>`
Citations	Source repository: github.com (2026-06-22T15:28:00+00:00); Documentation: inference.sh; Website: inference.sh; Submitted by kiannidev, 2026-06-22	Source repository: github.com (2026-06-22T15:28:00+00:00); Documentation: docs.dagu.sh; Submitted by oktofeesh1, 2026-06-06	Source repository: github.com (2026-06-22T15:28:00+00:00); Documentation: raw.githubusercontent.com; Submitted by oktofeesh1, 2026-06-06	Source repository: github.com (2026-06-22T15:28:00+00:00); Documentation: raw.githubusercontent.com; Submitted by oktofeesh1, 2026-06-06
Claim	Unclaimed	Unclaimed	Unclaimed	Unclaimed

Related resources

Dagu MCP Server

mcp-proxy Transport Bridge

Nuclear MCP Server

Unla MCP Gateway
