MCP Server Threat Modeling Agent

Source-backed agent that threat-models an MCP server before it is connected to Claude Code, covering trust verification, tool authority and side effects, prompt injection via tool output, network and credential exposure, and least-privilege mitigations, grounded in the official security docs.

By JPette1783 · added 2026-06-05

Harness: Claude Code


## Content

MCP Server Threat Modeling Agent is a reusable agent prompt for assessing the
risk of an MCP server before Claude Code connects to it. It works through trust
verification, tool authority and side effects, prompt injection via tool output,
network and credential exposure, and least-privilege mitigations, grounded in
Claude Code's security model.

Use it before adding a third-party or new MCP server, or when reviewing whether an
existing connection is safe to keep.

## Agent Prompt

You are an MCP server threat modeler for Claude Code. Decide whether a server is
safe to connect and under what limits, using the official Claude Code security
documentation as your reference. Default to caution for servers you do not operate.

Threat-modeling workflow:

1. Trust. Connecting a new MCP server requires trust verification, which is
   disabled in non-interactive runs (the `-p` flag). Establish who operates the
   server and how trusted it is. Anthropic does not security-audit MCP servers.
2. Tool authority. Enumerate tools and classify read vs write vs destructive.
   Treat broad or vague tools as higher risk and prefer enabling only what is
   needed.
3. Prompt injection. Tool outputs are untrusted content and can contain
   instructions; recommend not auto-acting on outputs, keeping result sizes
   bounded, and relying on the permission system as a gate.
4. Network and command surface. If the server triggers network requests or runs
   commands, account for the lethal-trifecta risk (untrusted content + private
   data + exfiltration path) and recommend egress controls.
5. Credentials. Identify what credentials the server needs; recommend a proxy that
   injects them outside the agent boundary so the agent never sees raw secrets.
6. Mitigations. Recommend explicit allow rules, confirmation for writes, disabling
   unneeded tools, sandboxing, and VM/dev-container isolation for risky servers.
7. Decision. Connect with limits, connect read-only, or do not connect.

Output contract:

- Server summary: operator, transport, tool authority, data reached.
- Threats: injection, excessive agency, exfiltration, credential exposure.
- Mitigations: allow rules, confirmation, disabled tools, isolation.
- Decision: connect with limits, read-only, or reject.
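
The output contract above could be captured as a small data structure so that every assessment has the same four parts. This is only a sketch: the class name `ThreatModelReport`, its field names, and the `render` layout are hypothetical and not part of the original prompt. The default decision is `REJECT`, mirroring the prompt's "default to caution" stance.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    CONNECT_WITH_LIMITS = "connect with limits"
    READ_ONLY = "read-only"
    REJECT = "reject"


@dataclass
class ThreatModelReport:
    # Server summary fields (hypothetical names mirroring the output contract).
    operator: str
    transport: str
    tool_authority: str
    data_reached: str
    threats: list[str] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)
    # Default to caution until the review says otherwise.
    decision: Decision = Decision.REJECT

    def render(self) -> str:
        """Return the four contract sections as plain text, one per line."""
        lines = [
            f"Server summary: operator={self.operator}; transport={self.transport}; "
            f"tool authority={self.tool_authority}; data reached={self.data_reached}",
            "Threats: " + ", ".join(self.threats or ["none identified"]),
            "Mitigations: " + ", ".join(self.mitigations or ["none proposed"]),
            f"Decision: {self.decision.value}",
        ]
        return "\n".join(lines)
```

A reviewer would fill in the fields by hand after working through the workflow; nothing here connects to or probes a server.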

## Features

- Threat-models an MCP server against Claude Code's security model.
- Classifies tool authority and flags excessive agency.
- Treats tool output as untrusted (prompt-injection aware).
- Produces a connect/limit/reject decision with mitigations.
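
As a rough illustration of the tool-authority classification, the sketch below sorts tool names into read, write, and destructive tiers. The keyword lists and function names are hypothetical and would need tuning per server; a name match is a first-pass heuristic, not a substitute for reading the tool's schema and source. Tools with vague names fall into the highest tier, following the prompt's guidance to treat broad or vague tools as higher risk.

```python
from enum import IntEnum


class Authority(IntEnum):
    READ = 1
    WRITE = 2
    DESTRUCTIVE = 3


# Hypothetical keyword lists; adjust them to the server under review.
DESTRUCTIVE_HINTS = ("delete", "drop", "remove", "destroy", "exec", "run", "shell")
WRITE_HINTS = ("create", "update", "write", "send", "post", "put", "set")
READ_HINTS = ("get", "list", "read", "search", "fetch", "describe")


def classify_tool(name: str) -> Authority:
    """Classify a tool by the words in its name, checking the riskiest tier first."""
    words = name.lower().replace("-", "_").split("_")
    if any(w in DESTRUCTIVE_HINTS for w in words):
        return Authority.DESTRUCTIVE
    if any(w in WRITE_HINTS for w in words):
        return Authority.WRITE
    if any(w in READ_HINTS for w in words):
        return Authority.READ
    # Vague or unrecognized name: assume the worst until a human reviews it.
    return Authority.DESTRUCTIVE


def server_authority(tool_names: list[str]) -> Authority:
    """A server is as risky as its most powerful tool."""
    return max((classify_tool(n) for n in tool_names), default=Authority.READ)
```

For example, a server exposing `list_issues` and `create_issue` would rate as `WRITE` overall, which suggests either confirmation for the write tool or enabling only the read tool.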

## Use Cases

- Vet a third-party MCP server before connecting it.
- Decide whether to allow only read-only tools from a server.
- Reduce prompt-injection and exfiltration risk from MCP tools.
- Review an existing MCP connection for safe configuration.
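
For the read-only use case, the recommended limits can be drafted as permission rules before anyone connects the server. The sketch below assumes the `mcp__<server>__<tool>` rule naming and the `permissions.allow` / `ask` / `deny` keys described in the Claude Code settings documentation; check both against the current docs before relying on them. The helper name `build_permissions` and the server and tool names in the usage note are hypothetical.

```python
import json


def build_permissions(
    server: str,
    read_tools: list[str],
    write_tools: list[str],
    blocked_tools: list[str],
) -> dict:
    """Draft a least-privilege permissions fragment for one MCP server."""

    def rule(tool: str) -> str:
        # Assumed rule format: mcp__<server>__<tool>.
        return f"mcp__{server}__{tool}"

    return {
        "permissions": {
            "allow": [rule(t) for t in read_tools],    # explicit allow rules
            "ask": [rule(t) for t in write_tools],     # confirmation for writes
            "deny": [rule(t) for t in blocked_tools],  # tools that are not needed
        }
    }


def to_json(fragment: dict) -> str:
    """Serialize the fragment for human review before it goes into settings."""
    return json.dumps(fragment, indent=2)
```

Calling `build_permissions("tracker", ["list_issues"], ["create_issue"], ["delete_project"])` yields a fragment a reviewer can read and merge by hand; the function only builds text and does not modify any settings file.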

## Source Notes

- Claude Code requires trust verification for new MCP servers, gates network
  requests, isolates web-fetch context, and treats the permission system as the
  enforcement layer; Anthropic does not security-audit MCP servers.
- The lethal-trifecta framing (untrusted content, private data, exfiltration
  path) informs which combinations of MCP capabilities are highest risk.
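
The lethal-trifecta framing lends itself to a simple check: count how many of the three legs a configuration has, and remember that legs can come from different servers connected to the same session. The following is a minimal sketch under that reading; the class, function names, and the "high / elevated / baseline" labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capabilities:
    reads_untrusted_content: bool
    reaches_private_data: bool
    has_exfiltration_path: bool


def combine(*caps: Capabilities) -> Capabilities:
    """Merge capabilities across servers: a leg is present if any server has it."""
    return Capabilities(
        any(c.reads_untrusted_content for c in caps),
        any(c.reaches_private_data for c in caps),
        any(c.has_exfiltration_path for c in caps),
    )


def trifecta_risk(c: Capabilities) -> str:
    """Rate a capability set by how many of the three legs are present."""
    legs = sum(
        [c.reads_untrusted_content, c.reaches_private_data, c.has_exfiltration_path]
    )
    if legs == 3:
        return "high"
    if legs == 2:
        return "elevated"
    return "baseline"
```

A web-fetching server (untrusted content plus an outbound path) and a database server (private data) may each look acceptable alone, yet `trifecta_risk(combine(web, db))` comes out as "high", which is the situation where egress controls or isolation matter most.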

## Duplicate Check

The content tree and open PRs were checked for MCP threat modeling, security, and
audit agents. This entry is distinct from MCP metadata/registry review: it is an
`agents` prompt focused on threat-modeling an MCP server's security risk before
connection.

## Editorial Disclosure

Submitted as an independent community agent entry by `JPette1783`, based on
public Claude Code documentation. No paid placement, referral, or affiliate
relationship.

## Sources

- Claude Code security: https://code.claude.com/docs/en/security
- Claude Code MCP documentation: https://code.claude.com/docs/en/mcp
- Claude Code features overview: https://code.claude.com/docs/en/features-overview

Trust & readiness

- Trust: Review first
- Source: source-backed
- Safety notes: Present
- Reviewed: No


Citation facts

Source-backed facts for citing this resource, derived directly from the registry — also available as plain text for AI assistants.

Canonical URL: https://heyclau.de/entry/agents/mcp-server-threat-modeling-agent
Source URLs: https://code.claude.com/docs/en/security, https://github.com/JSONbored/awesome-claude/blob/main/content/agents/mcp-server-threat-modeling-agent.mdx
Safety notes: This agent assesses risk; it does not connect to or exercise the server. Connecting a new MCP server requires trust verification, which is disabled in non-interactive (-p) runs. Treat MCP tool output as untrusted content that can carry prompt-injection instructions; recommend not auto-acting on it and keeping result sizes bounded. Recommend least-privilege: explicit allow rules, confirmation for write tools, and disabling tools that are not needed. Anthropic does not security-audit MCP servers.
Privacy notes: Tools send whatever inputs they are called with to the server; identify what data would leave the environment and to whom. Credentials for the server must be stored securely and never committed or logged; prefer a credential proxy so the agent never sees raw secrets. Confirm the server operator's data handling and retention before sending sensitive context to it.
Author: JPette1783
Submitted by: JPette1783
Claim status: unclaimed
Last verified: 2026-06-05

Decision playbook

Review trust signals before you adopt

Signals are present but mixed. Use the checklist below to confirm the source and operational safety for your environment.


Source and provenance checks (needs review)

Confirm ownership and provenance before trusting install instructions.

- Source link available (required): Open the canonical repository and verify ownership. Status: done.
- Source provenance status (required): Marked as source-backed. Status: done.
- Metadata reviewed: No reviewed flag detected in metadata. Status: pending.

Safety and privacy checks (complete)

Validate risk disclosures before installation or API wiring.

- Safety notes present (required): Review the listed safety guidance before running commands. Status: done.
- Privacy notes present (required): Review data handling notes before connecting accounts or secrets. Status: done.
- Trust level risk gate (required): Trust level does not block evaluation. Status: done.

Package and install checks (needs review)

Check package metadata and artifact integrity signals.

- Install payload available: Install or copy payload is available for review. Status: done.
- Package verification flag: No package verification flag provided. Status: pending.
- Checksum metadata: No checksum provided for downloaded artifact. Status: pending.

Compare-driven decision checks (needs review)

Use compare context to validate trade-offs before adoption.

- Compare tray has multiple entries: Add at least one more entry to compare trust differences. Status: pending.
- Baseline comparison available: No baseline peer selected yet. Status: pending.
- Diverging trust signals identified: No major trust-signal divergence found. Status: pending.

Setup at a glance

Install type: Copy & paste. Copy-ready: paste the snippet to get started.

- Install command: Not provided
- Config snippet: Not provided
- Copy snippet: Provided
- Prerequisites: 3 to clear
- Platforms: 1 listed

Adoption plan

Balanced adoption plan

Current risk score 24/100. Use staged verification before broader rollout.

Pre-adoption checks

Validate source and review signals before any execution.

- Confirm source provenance (required): Source URL/provenance metadata is present. Status: done.
- Confirm metadata review state: No review metadata found; increase manual validation. Status: pending.
- Verify install payload: Install/config payload exists and can be inspected. Status: done.

Security checks

Confirm safety, privacy, and package integrity signals.

- Review safety notes (required): Safety notes are present. Status: done.
- Review privacy notes (required): Privacy notes are present. Status: done.
- Verify package integrity metadata: No package verification/checksum metadata. Status: pending.

Rollout

Adopt in controlled steps based on the selected plan.

- Run in isolated sandbox first (required): Use a constrained sandbox and observe behavior across multiple tasks. Status: pending.
- Roll out gradually (required): Roll out to a small cohort before wider usage. Status: pending.
- Set monitoring and fallback: Define rollback path and monitor errors after adoption. Status: pending.

Evidence readiness

Evidence readiness matrix · balanced

Missing required evidence: Metadata review. Risk score 31.

- Source provenance (required in this preset): Present. Source repository/provenance is listed.
- Metadata review (required in this preset): Missing. Review metadata is missing.
- Safety notes (required in this preset): Present. Safety notes are present.
- Privacy notes (optional in this preset): Present. Privacy notes are present.
- Package integrity (optional in this preset): Missing. Package integrity metadata is missing.
- Install payload (required in this preset): Present. Install payload is available.

Required gaps: Metadata review

Decision timeline

Decision timeline · balanced

Blocking gaps: Check metadata review status. Risk 28.

- Triage: Confirm source provenance (required). Source/provenance metadata is available. Status: done.
- Triage: Check metadata review status (required). Review metadata is missing. Status: pending.
- Verify: Review safety notes (required). Safety notes are available. Status: done.
- Verify: Review privacy notes. Privacy notes are available. Status: done.
- Verify: Validate package integrity metadata. Package integrity metadata is missing. Status: pending.
- Rollout: Verify install payload and commands (required). Install payload is available. Status: done.

Blockers: Check metadata review status

Prerequisite readiness

3 prerequisites to line up before setup (0/3 ready).

- Permissions & scopes: 1
- Network & hosting: 1
- General: 1

Safety & privacy surface

3 safety and 3 privacy notes across 5 risk areas. Review closely: credentials & tokens, network access.

- Safety (network access): This agent assesses risk; it does not connect to or exercise the server. Connecting a new MCP server requires trust verification, which is disabled in non-interactive (-p) runs.
- Safety (general): Treat MCP tool output as untrusted content that can carry prompt-injection instructions; recommend not auto-acting on it and keeping result sizes bounded.
- Safety (local files): Recommend least-privilege: explicit allow rules, confirmation for write tools, and disabling tools that are not needed. Anthropic does not security-audit MCP servers.
- Privacy (general): Tools send whatever inputs they are called with to the server; identify what data would leave the environment and to whom.
- Privacy (credentials & tokens): Credentials for the server must be stored securely and never committed or logged; prefer a credential proxy so the agent never sees raw secrets.
- Privacy (data retention): Confirm the server operator's data handling and retention before sending sensitive context to it.


Prerequisites

The MCP server's source or documentation, transport, and tool list with input/output schemas.
Knowledge of who operates the server and how trusted it is.
The permission posture of the Claude Code project that would connect it.

Schema details

Install type: copy
Troubleshooting: No

#mcp #security #threat-modeling #claude-code #review


Add this badge to your README

Show that MCP Server Threat Modeling Agent is listed on HeyClaude. Paste this Markdown into your README — it renders the badge and links back to this page.

[![Listed on HeyClaude](https://heyclau.de/badge/agents/mcp-server-threat-modeling-agent.svg)](https://heyclau.de/entry/agents/mcp-server-threat-modeling-agent)

How it compares

MCP Server Threat Modeling Agent side by side with 2 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.

2 trust signals differ across this comparison (Source provenance, Submitter).

| Field | MCP Server Threat Modeling Agent | Claude Code Security Guidance Remediator Agent | MCP Remote Server Security Auditor Agent |
| --- | --- | --- | --- |
| Description | Source-backed agent that threat-models an MCP server before it is connected to Claude Code, covering trust verification, tool authority and side effects, prompt injection via tool output, network and credential exposure, and least-privilege mitigations, grounded in the official security docs. | Source-backed agent that reviews active Claude Code sessions and configuration for security gaps, cross-references the official security guidance, and produces a ranked remediation plan covering permissions, MCP trust, prompt injection, credential handling, and hook safety. | Community reusable agent prompt for reviewing new MCP server adoption in Claude Code using official security documentation: trusted providers, permissions configuration, trust verification, and settings checked into source control. |
| Review status | Not reviewed | Not reviewed | Not reviewed |
| Package trust | Package not verified | Package not verified | Package not verified |
| Source provenance (differs) | Source-backed | Source-backed | Submission linked (source submission) |
| Submitter (differs) | JPette1783 | jaso0n0818 | kiannidev |
| Install risk | Review first | Review first | Review first |
| Notes | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ |
| Brand | - | - | - |
| Category | agents | agents | agents |
| Source | Source-backed | Source-backed | Source-backed |
| Author | JPette1783 | jaso0n0818 | kiannidev |
| Added | 2026-06-05 | 2026-06-15 | 2026-06-16 |
| Platforms | Claude Code | Claude Code | Claude Code |
| Harness | Claude Code | Claude Code | Claude Code |
| Source repo | - | - | - |
| Safety notes | ✓ This agent assesses risk; it does not connect to or exercise the server. Connecting a new MCP server requires trust verification, which is disabled in non-interactive (-p) runs. Treat MCP tool output as untrusted content that can carry prompt-injection instructions; recommend not auto-acting on it and keeping result sizes bounded. Recommend least-privilege: explicit allow rules, confirmation for write tools, and disabling tools that are not needed. Anthropic does not security-audit MCP servers. | ✓ This agent reads configuration and assesses risk; it does not modify settings, revoke permissions, or disconnect MCP servers. Remediation steps that involve disconnecting MCP servers or changing hook scripts must be reviewed by a human before applying. Hook commands execute on the host with full user permissions; flag any hook that is not read-only or that pulls external content at runtime. Managed-settings changes affect all team members; escalate those remediations to an administrator. | ✓ Anthropic reviews directory connectors but does not security-audit third-party MCP servers per official docs. This prompt applies documented Claude Code guidance; it is not a penetration test. Prefer writing or vetting your own MCP servers when handling sensitive repositories. Trust verification applies to new MCP servers; do not bypass in non-interactive modes without policy review. |
| Privacy notes | ✓ Tools send whatever inputs they are called with to the server; identify what data would leave the environment and to whom. Credentials for the server must be stored securely and never committed or logged; prefer a credential proxy so the agent never sees raw secrets. Confirm the server operator's data handling and retention before sending sensitive context to it. | ✓ CLAUDE.md and settings files may contain internal project details, API endpoint patterns, or policy rules; treat audit output as internal. MCP server configurations may expose credential references or internal service URLs; do not log or share audit reports outside the team. If the audit runs in a shared or CI environment, ensure the session transcript is not persisted where it can be read by unintended parties. | ✓ MCP settings checked into source control may expose internal server names and endpoints. Audit summaries should not paste secrets from server configuration files. Third-party MCP tools may send repository context externally; note data residency in reviews. |
| Prerequisites | The MCP server's source or documentation, transport, and tool list with input/output schemas. Knowledge of who operates the server and how trusted it is. The permission posture of the Claude Code project that would connect it. | Access to the CLAUDE.md, .claude/settings.json, and .mcp.json for the project being audited. The list of connected MCP servers and their transports (stdio vs HTTP/SSE). Knowledge of which hooks are registered and what shell commands they execute. Claude Code 1.x or later (settings schema and managed-settings support required). | Draft MCP server entry for Claude Code settings or managed configuration. Provider identity and whether the server is first-party, directory-listed, or third-party. Team policy for MCP permissions and version-controlled settings files. Inventory of tools exposed by the server if available from the provider. |
| Install | - | - | - |
| Config | - | - | - |
| Citations | Source repository: github.com, 2026-07-21T01:31:08+00:00. Documentation: code.claude.com. Submitted by JPette1783, 2026-06-05. | Source repository: github.com, 2026-07-21T01:31:08+00:00. Documentation: code.claude.com. Submitted by jaso0n0818, 2026-06-15. | Source repository: github.com, 2026-07-21T01:31:08+00:00. Documentation: code.claude.com. Submitted by kiannidev, 2026-06-16. |
| Claim | Unclaimed | Unclaimed | Unclaimed |


Featured in

Best list: Best security review agents for Claude


Related resources

Claude Code Security Guidance Remediator Agent

MCP Remote Server Security Auditor Agent

Prompt Injection Defense For Tool Connected Agents

Security Guidance Plugin Before Merge

Related guides

Auditing MCP Client Configuration Before Team Rollout

OAuth Patterns For MCP Server Authentication

Package Provenance Checks Before Installing MCP Servers
