
inference.sh MCP Server

Hosted streamable-HTTP MCP server that exposes inference.sh platform tools for running apps, managing tasks, proxying external MCP connectors, and calling hundreds of hosted AI models from Claude Code, Cursor, and other MCP clients.

by inference.sh · added 2026-06-22
Harness: Claude Code, Cursor, Claude Desktop
Review first: review before installing

Open the source and read safety notes before installing.

Citation facts

Source-backed facts for citing this resource, derived directly from the registry.

Source URLs
https://inference.sh/docs/connectors/mcp-server, https://github.com/inference-sh/grid, https://inference.sh
Brand
inference.sh
Brand domain
inference.sh
Brand asset source
brandfetch
Author
inference.sh
Submitted by
kiannidev
Claim status
unclaimed
Last verified
2026-06-22

Safety notes

  • The hosted MCP server can execute inference.sh apps, platform tasks, and proxied connector tools that may create, update, or delete data in connected services such as GitHub, Linear, Slack, Notion, or databases.
  • Bearer API keys grant account-scoped access to platform capabilities; rotate compromised keys immediately and avoid committing tokens to repositories or shared MCP config files.
  • MCP proxy features can discover and call tools on remote MCP servers through inference.sh, which expands the runtime trust boundary beyond the primary client configuration.
  • Tool calls may trigger paid inference, connector actions, storage writes, or outbound network requests according to the selected app or connector.
  • Use least-privilege connector authorization and review each proxied server before enabling it in production agent workflows.
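
One way to apply the least-privilege advice on the client side is an explicit allowlist that hides every tool nobody has reviewed yet. The sketch below is illustrative only: the tool names are hypothetical, and real names would come from the server's tools/list response.

```python
from dataclasses import dataclass, field


@dataclass
class ToolGate:
    """Client-side allowlist: only tools reviewed in advance may be called."""

    allowed: set = field(default_factory=set)

    def check(self, tool_name: str) -> bool:
        """Return True only for tools that were explicitly approved."""
        return tool_name in self.allowed

    def filter_tools(self, listed: list) -> list:
        """Keep only allowlisted entries from a tools/list style result."""
        return [tool for tool in listed if tool.get("name") in self.allowed]


# Hypothetical tool names, for illustration only.
gate = ToolGate(allowed={"list_tasks", "get_task"})
listed = [{"name": "list_tasks"}, {"name": "delete_repo"}, {"name": "get_task"}]
visible = [tool["name"] for tool in gate.filter_tools(listed)]
```

A gate like this does not replace connector-level authorization on the inference.sh side; it only narrows what an agent is shown and allowed to request.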

Privacy notes

  • Prompts, files, tool arguments, task metadata, connector payloads, and model outputs are processed by inference.sh and may transit connected third-party MCP services.
  • OAuth-backed connectors can expose account, workspace, issue, message, or repository content to the agent through proxied tool results.
  • API keys, connector tokens, and task logs should be treated as sensitive credentials and kept out of version control and public issue threads.
  • Hosted execution may retain usage, billing, and operational telemetry according to inference.sh policies; review the platform privacy documentation before processing regulated data.
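
Keeping keys out of logs and issue threads can be partly automated with a redaction pass. This is a minimal sketch that assumes keys are the `inf_` prefix followed by letters, digits, underscores, or hyphens; the real key alphabet is not documented here, so treat the pattern as a guess to adjust.

```python
import re

# Assumption: keys look like "inf_" followed by URL-safe characters.
KEY_PATTERN = re.compile(r"\binf_[A-Za-z0-9_\-]+")


def redact(text: str) -> str:
    """Replace anything that looks like an inference.sh API key."""
    return KEY_PATTERN.sub("inf_[REDACTED]", text)


sample = "request failed, header was Authorization: Bearer inf_abc123-XYZ"
cleaned = redact(sample)
```

Running log lines through `redact` before sharing them lowers the chance of leaking a token, but it is not a substitute for rotating a key that has already been exposed.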

Prerequisites

  • An inference.sh account and API key from `belt login` or the platform dashboard.
  • An MCP client that supports streamable HTTP transport, such as Claude Code, Cursor, Cline, or Windsurf.
  • Review of which inference.sh apps, connectors, and proxy tools the agent may call before enabling write-capable workflows.
  • A billing and quota plan if the workflow will run image, video, LLM, search, or connector-backed tasks at scale.

Schema details

Install type
cli
Troubleshooting
No
Source repository stats
Scope
Source repo
Collection metadata
Estimated setup
10 minutes
Difficulty
intermediate
Tool listing metadata
Disclosure
Hosted commercial AI agent runtime with a public MCP endpoint at `https://api.inference.sh/mcp`. This entry documents the official inference.sh platform MCP server, not a community reimplementation.
Full copyable content
{
  "mcpServers": {
    "inference": {
      "type": "streamable-http",
      "url": "https://api.inference.sh/mcp",
      "headers": {
        "Authorization": "Bearer inf_your_api_key"
      }
    }
  }
}
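
To avoid pasting a real key into a config file that might be committed, the same structure can be rendered from an environment variable at setup time. A minimal sketch follows; the variable name `INFERENCE_API_KEY` and both function names are assumptions made for illustration, not part of inference.sh or any MCP client.

```python
import json
import os


def build_mcp_config(api_key: str) -> dict:
    """Return the mcpServers structure shown above with the key filled in."""
    return {
        "mcpServers": {
            "inference": {
                "type": "streamable-http",
                "url": "https://api.inference.sh/mcp",
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


def render_from_env(var_name: str = "INFERENCE_API_KEY") -> str:
    """Read the key from the environment (hypothetical variable name)."""
    api_key = os.environ.get(var_name, "")
    if not api_key.startswith("inf_"):
        raise ValueError(f"{var_name} is unset or does not look like an inf_ key")
    return json.dumps(build_mcp_config(api_key), indent=2)
```

The rendered JSON can then be written to a local, git-ignored settings file; where that file lives depends on the MCP client.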

About this resource

Content

inference.sh exposes a hosted Model Context Protocol server at https://api.inference.sh/mcp. MCP clients such as Claude Code and Cursor can connect over streamable HTTP with a Bearer API key and use platform tools to run apps, manage tasks, and access inference.sh capabilities without running local MCP processes.

The platform also supports MCP in the opposite direction: inference.sh can connect to external MCP servers and surface their tools through connector proxies. That makes it useful when an agent needs both hosted model/app execution and third-party service tools from one runtime.

Source Review

Install

  1. Install the inference.sh CLI (belt) and authenticate:

belt login

  2. Copy the generated API key (inf_...) and add the MCP server to your client settings.

  3. For Claude Code or Cursor, use streamable HTTP transport:

{
  "mcpServers": {
    "inference": {
      "type": "streamable-http",
      "url": "https://api.inference.sh/mcp",
      "headers": {
        "Authorization": "Bearer inf_your_api_key"
      }
    }
  }
}

  4. Restart the MCP client, list available tools, and confirm that initialize and tools/list succeed before enabling autonomous workflows.
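
The initialize and tools/list checks can also be exercised outside an MCP client. The sketch below only builds the two JSON-RPC 2.0 requests and does not send them. The client name, request IDs, and the helper `build_request` are illustrative assumptions, and the server may expect details this entry does not document, such as a session header on follow-up calls.

```python
import json
import urllib.request

MCP_URL = "https://api.inference.sh/mcp"


def build_request(api_key, method, params=None, request_id=1, session_id=None):
    """Build (but do not send) a JSON-RPC 2.0 POST for the MCP endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        # Streamable HTTP servers may answer with JSON or an SSE stream.
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        # Only needed if the server issued a session ID during initialize.
        headers["Mcp-Session-Id"] = session_id
    return urllib.request.Request(
        MCP_URL, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


initialize = build_request(
    "inf_your_api_key",
    "initialize",
    {
        "protocolVersion": "2025-06-18",
        "capabilities": {},
        # Hypothetical client identity for a manual smoke test.
        "clientInfo": {"name": "smoke-test", "version": "0.0.1"},
    },
)
tools_list = build_request("inf_your_api_key", "tools/list", request_id=2)
```

Passing either object to `urllib.request.urlopen` would perform the call; a missing or invalid key should surface as an `HTTPError` with status 401. In the MCP lifecycle a client normally sends a `notifications/initialized` notification between the two requests, which this sketch leaves out.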

Duplicate Check

Searched content/mcp/, open PRs, and the live registry for:

  • inference.sh, inference-sh, sh.inference, api.inference.sh
  • slug inference-sh-mcp-server
  • docs URL inference.sh/docs/connectors/mcp-server
  • MCP server card host api.inference.sh

No existing HeyClaude MCP entry documents this hosted inference.sh endpoint.

Runtime Notes

  • Transport: streamable HTTP (JSON-RPC 2.0) at POST https://api.inference.sh/mcp
  • Discovery card: GET https://api.inference.sh/.well-known/mcp-server-card
  • Supported protocol versions include 2025-11-25, 2025-06-18, and 2025-03-26
  • Unauthenticated requests to the MCP endpoint return 401, so clients must send a valid Bearer token
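
Because several protocol versions are listed, a client that supports more than one may want to propose the newest version both sides share. The helper below is a hypothetical client-side convenience, not the negotiation itself, which happens in the initialize exchange. The server list is copied from the notes above and may not be exhaustive.

```python
# Versions listed in the runtime notes above; the server may support others.
SERVER_VERSIONS = ["2025-11-25", "2025-06-18", "2025-03-26"]


def pick_protocol_version(client_versions, server_versions=SERVER_VERSIONS):
    """Return the newest version both sides support, or None if none overlap."""
    shared = set(client_versions) & set(server_versions)
    # YYYY-MM-DD strings sort chronologically, so max() is the newest date.
    return max(shared) if shared else None


proposed = pick_protocol_version(["2025-06-18", "2024-11-05"])
```

When there is no overlap the helper returns `None`, in which case the client should not proceed with that server.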

When To Use

  • You want Claude or another MCP client to call inference.sh apps and hosted models without maintaining local MCP server processes.
  • You need connector proxy access to services like GitHub, Linear, Slack, or Notion through inference.sh-managed authentication.
  • You are evaluating a hosted runtime that combines app execution, task orchestration, and external MCP connectors behind one API key.

When Not To Use

  • You require fully offline or air-gapped MCP execution.
  • You cannot accept cloud processing of prompts, connector data, or generated media.
  • You need a self-hosted open-source MCP server repository instead of a hosted platform endpoint.

Source citations

How it compares

inference.sh MCP Server side by side with 3 alternatives (Dagu, mcp-proxy, and Nuclear) on trust, install, platform support, and disclosed safety notes, all from reviewed registry metadata.

Description

  • inference.sh MCP Server: Hosted streamable-HTTP MCP server that exposes inference.sh platform tools for running apps, managing tasks, proxying external MCP connectors, and calling hundreds of hosted AI models from Claude Code, Cursor, and other MCP clients.
  • Dagu: Built-in Streamable HTTP MCP server for Dagu that lets AI agents read workflow state, inspect DAG specs and logs, preview or apply workflow changes, and start, enqueue, retry, or stop DAG runs.
  • mcp-proxy: MCP transport bridge that converts between stdio, SSE, and Streamable HTTP so local MCP clients can reach remote servers, or remote clients can reach local stdio servers.
  • Nuclear: Built-in Streamable HTTP MCP server for Nuclear Music Player that lets Claude inspect available music-player domains, discover method signatures, describe data types, and control playback, queue, favorites, playlists, dashboard, and provider workflows.
Trust

  • Install risk: Review first (all four entries)
  • Notes: Safety and Privacy (all four entries)
  • Brand: inference.sh, Dagu, Nuclear
  • Category: mcp (all four entries)
  • Source: source-backed (all four entries)
  • Author: inference.sh (inference.sh MCP Server), dagucloud (Dagu), Sergey Parfenyuk (mcp-proxy), Nuclear (Nuclear)
  • Added: 2026-06-22 (inference.sh MCP Server); 2026-06-06 (Dagu, mcp-proxy, Nuclear)
  • Platforms: Claude Code, Cursor, Claude Desktop (inference.sh MCP Server); Claude Code, Claude Desktop (Dagu, mcp-proxy, Nuclear)
Safety notes

inference.sh MCP Server: see the Safety notes section earlier on this page.

Dagu

  • Dagu is a workflow engine; MCP access can expose shell commands, Docker containers, Kubernetes Jobs, SSH commands, SQL queries, HTTP calls, agent harnesses, and other workflow steps.
  • The MCP server registers `dagu_change` and `dagu_execute` as destructive-capable tools, so require preview/review before applying DAG changes or starting/stopping production runs.
  • API keys can be scoped to the MCP surface; avoid reusing broad admin credentials for agent access.
  • Workflow edits may change schedules, parameters, retries, secrets usage, queues, resource limits, notifications, and downstream infrastructure actions.
  • Keep the Dagu server and MCP endpoint behind trusted network boundaries, TLS, and authentication for shared or remote deployments.

mcp-proxy

  • mcp-proxy can expose local stdio MCP servers as network services; keep the host bound to `127.0.0.1` unless remote access is intentional.
  • Passing `--host=0.0.0.0`, permissive CORS, or named-server routes can make tools reachable by other systems on the network.
  • Proxy configuration can include bearer tokens, OAuth client secrets, headers, server commands, environment variables, and working directories.
  • The proxy can spawn arbitrary configured MCP server commands; only use trusted command strings and config files.
  • Remote SSE or Streamable HTTP servers should be authenticated and trusted before forwarding client requests or tool outputs.

Nuclear

  • Nuclear MCP Server runs inside the local Nuclear desktop app and exposes a Streamable HTTP server on the localhost interface.
  • The `call` tool can execute Nuclear API methods after discovery through `list_methods`, `method_details`, and `describe_type`.
  • Available domains include Queue, Playback, Metadata, Favorites, Playlists, Dashboard, and Providers, so agents can change what is playing and modify local music-player state.
  • Nuclear's plugin and provider system can retrieve streaming sources, metadata, playlists, and dashboard content from third-party services; use providers only where automated access is allowed.
  • Keep the server bound to localhost, avoid exposing the MCP endpoint on a network interface, and require confirmation before letting an agent change playlists, favorites, queues, or provider settings.
Privacy notes

inference.sh MCP Server: see the Privacy notes section earlier on this page.

Dagu

  • DAG specs, run parameters, logs, documents, audit records, secrets references, API keys, environment variables, and workflow outputs may be exposed to the MCP client.
  • Workflow logs can contain credentials, customer data, internal hostnames, database query results, command output, file paths, or incident context.
  • Dagu stores state locally by default and can also run distributed workers; review where DAG files, logs, audit entries, and secrets are persisted.
  • Any workflow state returned through the MCP client may be sent onward to the configured model provider.

mcp-proxy

  • MCP requests, responses, tool outputs, progress events, headers, OAuth tokens, API access tokens, and session identifiers may pass through the proxy process.
  • Named server config files can contain command arguments, environment variables, and credentials for downstream MCP servers.
  • Exposed network endpoints can reveal tool names, server status, and results to clients that can reach the proxy.
  • Logs and troubleshooting output may include endpoint URLs, command names, headers, connection errors, or server names.
  • Store config files outside shared repositories when they include tokens or private server details.

Nuclear

  • Tool calls and transcripts can include listening history, search terms, artists, albums, track titles, playlist names, favorites, provider choices, dashboard content, and local player settings.
  • The MCP endpoint is local, but connected MCP clients, model providers, logs, screenshots, and shared chat transcripts can still retain music-library and listening-behavior data.
  • Streaming and metadata providers may receive searches, track identifiers, IP addresses, user-agent metadata, or plugin-specific account context according to their own policies.
  • The MCP server URL and port are local connection details; do not publish screenshots or logs that include private player state or provider credentials.
Prerequisites

inference.sh MCP Server: see the Prerequisites section earlier on this page.

Dagu

  • Dagu installed from Homebrew, GitHub Releases, npm, Docker/GHCR, Helm, or another upstream-supported installation path.
  • A running Dagu HTTP server with the built-in MCP endpoint enabled through the normal server path.
  • MCP client that supports Streamable HTTP server configuration.
  • API key with MCP surface access when Dagu authentication is enabled.

mcp-proxy

  • Python 3.10 or newer.
  • uv, pipx, or Docker for installation.
  • A known MCP endpoint or local stdio MCP server to bridge.
  • Review of required headers, OAuth client credentials, CORS origins, host, port, and named-server configuration.

Nuclear

  • Nuclear Music Player installed from the project's releases or platform packages.
  • MCP server enabled in Nuclear under Settings > Integrations.
  • MCP client support for Streamable HTTP or remote URL based server configuration.
  • Review of the actual local URL shown by Nuclear because the server starts on ports 8800 through 8809.
Install

  • inference.sh MCP Server: Run `belt login` to create an inference.sh API key, then add the streamable HTTP MCP endpoint `https://api.inference.sh/mcp` with `Authorization: Bearer inf_<your_api_key>` in your MCP client settings.
  • Dagu: `brew install dagu`
  • mcp-proxy: `uv tool install mcp-proxy`
  • Nuclear: `claude mcp add nuclear --transport http <copy-url-from-nuclear-settings>`
Config

inference.sh MCP Server: same configuration as under Full copyable content earlier on this page.

Dagu:

{
  "mcpServers": {
    "dagu": {
      "url": "LOCAL_DAGU_MCP_URL",
      "headers": {
        "Authorization": "Bearer DAGU_MCP_API_KEY"
      }
    }
  }
}

mcp-proxy (manual-only setup):

mcp-proxy https://example.com/sse

Nuclear (manual-only setup):

claude mcp add nuclear --transport http <copy-url-from-nuclear-settings>
Claim: Unclaimed (all four entries)
