Context Engineering Agent Skills
MIT-licensed Agent Skills collection for context engineering, harness engineering, multi-agent architectures, filesystem context, memory systems, tool design, evaluation, hosted agents, and production agent operating loops for Claude Code, Cursor, Codex, and Open Plugins-compatible agent tools.
Open the source and read safety notes before installing.
Safety notes
- These skills alter how agents select context, delegate work, persist state, design tools, evaluate outputs, and operate autonomous loops; use them as engineering guidance, not as automatic authority to change a production agent system.
- Filesystem-context and memory-system patterns can cause agents to write durable plans, scratchpads, logs, summaries, preferences, or shared handoff files. Keep cleanup, ownership, and review rules explicit.
- Harness-engineering, hosted-agent, and evaluation workflows can launch long-running loops, background agents, benchmark suites, paid model calls, or remote sandbox work. Require budgets, kill switches, rollback rules, and approval gates.
- Tool-design guidance can change MCP schemas, tool descriptions, return formats, and error contracts. Test routing and compatibility before deploying changes to users.
- Benchmark results are source evidence for this repository's claims, but they are workload-specific. Re-run or adapt benchmarks before relying on the reported routing numbers in a different agent stack.
Privacy notes
- Context-engineering work often touches prompts, system instructions, tool definitions, retrieved documents, message history, tool outputs, logs, scratch files, memory stores, benchmark prompts, and model responses.
- Do not persist secrets, customer data, private source code, incident data, unpublished strategy, or regulated records into scratchpads, skill examples, benchmark fixtures, or shared agent workspaces.
- If benchmark runners or hosted-agent examples call external models or remote sandboxes, review what prompts, traces, files, and logs are sent outside the local workspace.
- Agent memory and filesystem-context patterns should include deletion, redaction, retention, and access-control rules before being used with private projects.
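One way to act on the redaction rule above is to scrub text before it reaches a scratchpad or shared handoff file. The sketch below is a minimal illustration, not part of the upstream skills: the `SECRET_PATTERNS` list and the `redact` helper are hypothetical names, and the two patterns only catch obvious `key = value` credentials and email addresses. A real project would need its own reviewed pattern list.

```python
import re

# Hypothetical patterns for illustration only; they are not exhaustive and
# are not taken from the upstream repository.
SECRET_PATTERNS = [
    # key/value style credentials such as "api_key = abc123" or "token: xyz"
    re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*\S+"),
    # email addresses
    re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
]


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every match of the known patterns before the text is persisted."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    note = "deploy failed, api_key = abc123, ping bob@example.com"
    print(redact(note))
```

Pattern-based redaction only reduces accidental leaks; it does not replace the retention, deletion, and access-control rules listed above.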
Prerequisites
- Claude Code plugin support, Cursor Open Plugins support, or an agent host that can load Agent Skills or custom instruction files.
- A project where context-window behavior, multi-agent structure, tool design, memory, evaluation, or harness reliability is a real design concern.
- A version-controlled workspace for any scripts, examples, benchmark artifacts, or generated skill changes.
- Human review before applying benchmark-derived skill changes, modifying persistent agent memory, changing tool schemas, or deploying autonomous harness loops.
Schema details
- Install type: package
- Reading time: 7 min
- Difficulty score: 84
- Troubleshooting: Yes
- Breaking changes: No
- Scope: Source repo
- Skill type: capability-pack
- Skill level: expert
- Verification: validated
- Verified at: 2026-06-18
| Platform | Support | Install path |
|---|---|---|
| claude-code | Native | .claude/skills/<skill-name>/SKILL.md |
| codex | Native | .agents/skills/<skill-name>/SKILL.md |
| windsurf | Native | .windsurf/skills/<skill-name>/SKILL.md |
| gemini | Native | .gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md |
| cursor | Adapter | .cursor/rules/<skill-name>.mdc |
| cli | Manual | AGENTS.md or tool-specific context file |
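The install paths in the table can be captured as a small lookup, which is handy when a script needs to place one skill in the right location for a given host. This is a sketch under stated assumptions: `INSTALL_PATHS` and `install_path` are hypothetical helper names, the templates are copied from the table above, and only the first listed path is used for gemini. The cli row has no fixed path, so it is treated as an error here.

```python
from pathlib import PurePosixPath

# Templates copied from the platform table; "{skill}" stands in for <skill-name>.
# gemini also accepts .agents/skills/<skill-name>/SKILL.md per the table.
INSTALL_PATHS = {
    "claude-code": ".claude/skills/{skill}/SKILL.md",
    "codex": ".agents/skills/{skill}/SKILL.md",
    "windsurf": ".windsurf/skills/{skill}/SKILL.md",
    "gemini": ".gemini/skills/{skill}/SKILL.md",
    "cursor": ".cursor/rules/{skill}.mdc",
}


def install_path(platform: str, skill: str) -> PurePosixPath:
    """Return the project-relative path where a skill file is expected."""
    try:
        template = INSTALL_PATHS[platform]
    except KeyError:
        # "cli" and unknown hosts are manual installs with no fixed path.
        raise ValueError(f"no fixed install path for platform {platform!r}") from None
    return PurePosixPath(template.format(skill=skill))


if __name__ == "__main__":
    print(install_path("claude-code", "tool-design"))
    print(install_path("cursor", "tool-design"))
```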
Full copyable content
/plugin marketplace add muratcankoylan/Agent-Skills-for-Context-Engineering
/plugin install context-engineering@context-engineering-marketplace
About this resource
Context Engineering Agent Skills
Context Engineering Agent Skills is a 15-skill collection for designing, debugging, and operating production-grade AI agent systems. It focuses on what enters the model context, how agents route work to skills and tools, how multi-agent systems isolate state, and how durable files, memory, evaluation, and harnesses keep long-running agent work coherent.
Use this listing for the skills collection itself. Use separate entries when reviewing a specific MCP server, hosted-agent runtime, evaluation framework, or agent platform.
Knowledge Freshness
As verified on 2026-06-18, the repository reported plugin version 2.3.0,
a latest GitHub release of v2.3.0 published on 2026-05-22, MIT licensing, and
GitHub activity in late May 2026, with repository updates still visible
on 2026-06-18.
The pack includes benchmark artifacts and measured routing results, but those numbers are tied to the repository's fixtures, models, prompts, and runner. Treat them as useful source evidence, not universal performance guarantees.
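Revalidating routing numbers on a different stack comes down to recomputing top-k accuracy over your own fixture. The sketch below assumes a hypothetical record shape of `(expected_skill, ranked_predictions)`; it is not the repository's runner or its data format, and the sample records are made up for illustration.

```python
from typing import Sequence


def top_k_accuracy(records: Sequence[tuple[str, Sequence[str]]], k: int) -> float:
    """Fraction of records whose expected skill appears in the first k predictions.

    Each record is (expected_skill, ranked_predictions), best guess first.
    This record shape is an assumption made for the example.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not records:
        raise ValueError("records must not be empty")
    hits = sum(1 for expected, ranked in records if expected in ranked[:k])
    return hits / len(records)


if __name__ == "__main__":
    # Made-up routing outcomes, not results from the published benchmark.
    sample = [
        ("tool-design", ["tool-design", "evaluation", "memory-systems"]),
        ("memory-systems", ["filesystem-context", "memory-systems", "evaluation"]),
        ("evaluation", ["tool-design", "hosted-agents", "memory-systems"]),
        ("hosted-agents", ["hosted-agents", "evaluation", "tool-design"]),
    ]
    print("top-1:", top_k_accuracy(sample, 1))
    print("top-3:", top_k_accuracy(sample, 3))
```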
Retrieval Sources
This listing is grounded in:
- The upstream README and root collection SKILL.md.
- The Open Plugins manifest for the context-engineering plugin.
- Representative context-fundamentals, filesystem-context, and tool-design skill files.
- The published 2026-05-19 router benchmark report.
- Current GitHub repository metadata for license, stars, release, and activity.
Core Workflow
Claude Code users can install the collection as a plugin:
/plugin marketplace add muratcankoylan/Agent-Skills-for-Context-Engineering
/plugin install context-engineering@context-engineering-marketplace
For a smaller install, copy only the individual skill folder needed for the
current task into the target agent host's skills or custom-instructions
directory. The README lists skills such as context-fundamentals,
context-degradation, context-compression, filesystem-context,
multi-agent-patterns, memory-systems, tool-design, evaluation,
advanced-evaluation, harness-engineering, hosted-agents,
project-development, and bdi-mental-states.
Capability Scope
| Area | Coverage |
|---|---|
| Context fundamentals | Context-window anatomy, attention budgets, U-shaped attention, progressive disclosure, and signal-density reasoning |
| Context failure and compression | Lost-in-middle behavior, context poisoning, distraction, compaction, summarization, masking, and context optimization |
| Filesystem context | Scratchpads, tool-output offloading, plan persistence, sub-agent handoff files, terminal logs, and dynamic skill loading |
| Multi-agent systems | Orchestrator, peer-to-peer, hierarchical, hosted-agent, and sandboxed background-agent patterns |
| Memory systems | Short-term memory, long-term memory, knowledge graphs, retrieval semantics, and persistent workspace state |
| Tool and MCP design | Tool descriptions, schemas, return formats, error recovery, namespacing, consolidation, and MCP tool naming |
| Evaluation and harnesses | Deterministic checks, LLM-as-judge design, rubrics, novelty gates, durable logs, rollback rules, and human approval boundaries |
Use Cases
- Diagnose context-window failures in a long-running coding or research agent.
- Design a filesystem-backed scratchpad, handoff, or memory layer for agent workflows.
- Reduce an agent tool catalog to clearer MCP or native tool contracts.
- Decide when sub-agents are useful for context isolation instead of only role simulation.
- Build deterministic evaluation and regression gates for an agent system.
- Design autonomous harnesses with locked metrics, budgets, logs, rollback rules, and approval checkpoints.
Production Rules
- Load the full skill only when the task directly matches its trigger; broad context stuffing defeats the purpose of a context-engineering pack.
- Keep context, memory, and filesystem artifacts scoped to the task and readable by humans.
- Review and test tool-schema or MCP changes before deploying them to agent users.
- Add budgets and hard stops before benchmark runners, hosted-agent loops, model calls, or autonomous research loops.
- Treat benchmark results as repo-specific evidence and revalidate on your own model, prompt set, tool catalog, and agent runtime before making product claims.
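The budget and hard-stop rule above can be enforced with a small guard that is consulted before every model call or loop iteration. This is a minimal sketch, assuming a hypothetical `Budget` class and a caller that can estimate the cost of the next step; it is not taken from the harness-engineering skill itself.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when the next step would cross a configured limit."""


@dataclass
class Budget:
    max_steps: int
    max_cost_usd: float
    steps: int = 0
    cost_usd: float = 0.0

    def spend(self, cost_usd: float) -> None:
        """Reserve one step and its estimated cost, or refuse before it runs."""
        if self.steps + 1 > self.max_steps:
            raise BudgetExceeded(f"step limit of {self.max_steps} reached")
        if self.cost_usd + cost_usd > self.max_cost_usd:
            raise BudgetExceeded(f"cost limit of ${self.max_cost_usd:.2f} reached")
        self.steps += 1
        self.cost_usd += cost_usd


if __name__ == "__main__":
    budget = Budget(max_steps=3, max_cost_usd=1.0)
    try:
        while True:
            budget.spend(0.25)  # estimated cost of the next call (assumed)
            print(f"step {budget.steps} ran, total ${budget.cost_usd:.2f}")
    except BudgetExceeded as stop:
        print("hard stop:", stop)
```

Checking the limit before the step runs, rather than after, means an over-budget call is never issued; the counters stay unchanged when the guard refuses.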
Source Review
Verified on 2026-06-18:
- GitHub metadata reported muratcankoylan/Agent-Skills-for-Context-Engineering as an MIT-licensed repository with more than 16,000 stars, default branch main, latest release v2.3.0, and recent repository activity.
- The README described a 15-skill collection across context fundamentals, context degradation, compression, optimization, latent briefing, multi-agent patterns, memory systems, tool design, filesystem context, hosted agents, evaluation, advanced evaluation, harness engineering, project development, and BDI mental states.
- .plugin/plugin.json declared plugin name context-engineering, version 2.3.0, author Muratcan Koylan, and a description covering context engineering, harness engineering, multi-agent coordination, memory systems, tool design, evaluation, autonomous harnesses, and measured router benchmark results.
- The root SKILL.md described the collection as platform-agnostic and oriented around building, optimizing, evaluating, and debugging agent systems with effective context management.
- skills/filesystem-context/SKILL.md documented scratchpads, tool-output offloading, plan persistence, sub-agent communication through files, dynamic skill loading, and terminal/log persistence.
- skills/tool-design/SKILL.md documented tool descriptions, schema design, response formats, actionable errors, MCP tool namespacing, and tool catalog consolidation.
- The 2026-05-19 router benchmark report documented 600 runs across four models with 600/600 usable records, 0 format failures, and reported per-model top-1 and top-3 routing accuracy for the repository's fixture.
How it compares
Context Engineering Agent Skills side by side with 3 alternatives on trust, install, platform support, and disclosed safety notes — all from reviewed registry metadata.
| Field | MIT-licensed Agent Skills collection for context engineering, harness engineering, multi-agent architectures, filesystem context, memory systems, tool design, evaluation, hosted agents, and production agent operating loops for Claude Code, Cursor, Codex, and Open Plugins-compatible agent tools. | MIT-licensed Agent Skill for persistent file-based planning across Claude Code, Codex, Cursor, Gemini CLI, OpenCode, Hermes Agent, OpenClaw, Kiro, and other SKILL.md-compatible coding agents, with task_plan.md, findings.md, progress.md, hooks, session recovery, attestation, and opt-in long-running run modes. | Multi-harness agentic plugin marketplace with 84 plugins, 192 agents, 156 skills, 102 commands, and 16 orchestrators for Claude Code, Codex CLI, Cursor, OpenCode, Gemini CLI, and GitHub Copilot from one Markdown source tree. | Addy Osmani's production-grade Agent Skills pack for AI coding agents, with lifecycle slash commands, engineering workflow skills, review personas, quality gates, and cross-agent setup guidance for Claude Code, Cursor, Gemini CLI, Antigravity CLI, OpenCode, GitHub Copilot, and other agents. |
|---|---|---|---|---|
| Trust | ||||
| Install risk | Review first | Review first | Review first | Review first |
| Notes | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ | Safety ✓ Privacy ✓ |
| Brand | ||||
| Category | skills | skills | skills | skills |
| Source | source-backed | source-backed | source-backed | source-backed |
| Author | Muratcan Koylan | Ahmad Othman Ammar Adi | Seth Hobson | Addy Osmani |
| Added | 2026-06-18 | 2026-06-18 | 2026-06-18 | 2026-06-18 |
| Platforms | Claude Code, Codex, Windsurf, Gemini, Cursor, CLI | Claude Code, Codex, Windsurf, Gemini, Cursor, CLI, Continue | Claude Code, Codex, Windsurf, Gemini, Cursor, CLI | Claude Code, Codex, Windsurf, Gemini, Cursor, CLI |
| Source repo | — | — | — | — |
| Safety notes | ✓ These skills alter how agents select context, delegate work, persist state, design tools, evaluate outputs, and operate autonomous loops; use them as engineering guidance, not as automatic authority to change a production agent system. Filesystem-context and memory-system patterns can cause agents to write durable plans, scratchpads, logs, summaries, preferences, or shared handoff files. Keep cleanup, ownership, and review rules explicit. Harness-engineering, hosted-agent, and evaluation workflows can launch long-running loops, background agents, benchmark suites, paid model calls, or remote sandbox work. Require budgets, kill switches, rollback rules, and approval gates. Tool-design guidance can change MCP schemas, tool descriptions, return formats, and error contracts. Test routing and compatibility before deploying changes to users. Benchmark results are source evidence for this repository's claims, but they are workload-specific. Re-run or adapt benchmarks before relying on the reported routing numbers in a different agent stack. | ✓ Planning with Files writes persistent planning state into the active project. Review whether `task_plan.md`, `findings.md`, `progress.md`, `.planning/`, and handoff files should be committed, ignored, or scrubbed before sharing. The skill uses hooks and helper scripts to re-inject plan context, remind the agent to update progress, run session catchup, and check completion. Inspect the installed hook scripts before enabling them in shared repositories or global agent config. Planning files become model context. Do not paste untrusted web content, issue comments, logs, or external instructions into planning files without summarizing and neutralizing prompt-injection text. The upstream eval notes describe a prior prompt-injection amplification risk from web fetch/search content being written into planning files and re-read by hooks; current SKILL.md removes WebFetch/WebSearch from allowed tools and documents the security boundary. Codex users should merge hook entries into existing hook configs rather than overwriting them, and avoid enabling duplicate workspace plus global hooks that would run twice. Autonomous and gated modes are opt-in for long-running runs. Understand whether the host can hard-block, follow up, or only notify before relying on a completion gate. | ✓ This marketplace installs executable agent workflow instructions, slash commands, skills, and agent profiles. Review each selected plugin before giving it write, shell, network, MCP, cloud, browser, or deployment access. Some plugins target security scans, infrastructure, Kubernetes, cloud architecture, CI/CD, incident response, dependency management, and multi-agent orchestration; keep destructive, expensive, or production-impacting commands behind human approval. The marketplace includes external git-subdir plugin entries as well as local plugins. Verify external source repositories, manifests, licenses, update cadence, and dependency behavior separately. Cross-harness adapters intentionally transform source artifacts for Codex, Cursor, OpenCode, Gemini, and Copilot; inspect generated artifacts when relying on tool allowlists, model mappings, command conversion, or skill size limits. Do not assume every plugin is appropriate for every repository, compliance environment, model provider, or agent sandbox. | ✓ The slash commands are designed to guide real coding, testing, reviewing, committing, and shipping work; keep edits, commits, pushes, CI changes, and deploys behind the host's normal approval controls. `/build auto` is explicitly intended to generate a plan and implement multiple tasks in one approved pass. Use it on bounded specs, review the generated plan first, and stop on test failures or risky changes. The skills encode durable engineering workflows, not guaranteed-current framework APIs. Follow the source-driven-development guidance and verify current documentation before applying generated code. Security, CI/CD, observability, migration, and launch skills can touch production-sensitive systems. Require dry-run plans, rollback notes, and environment scoping before approving operational commands. Review personas and quality gates are useful second opinions, but they do not replace maintainer review, domain-specific tests, threat modeling, or release sign-off. |
| Privacy notes | ✓ Context-engineering work often touches prompts, system instructions, tool definitions, retrieved documents, message history, tool outputs, logs, scratch files, memory stores, benchmark prompts, and model responses. Do not persist secrets, customer data, private source code, incident data, unpublished strategy, or regulated records into scratchpads, skill examples, benchmark fixtures, or shared agent workspaces. If benchmark runners or hosted-agent examples call external models or remote sandboxes, review what prompts, traces, files, and logs are sent outside the local workspace. Agent memory and filesystem-context patterns should include deletion, redaction, retention, and access-control rules before being used with private projects. | ✓ The planning files can contain task goals, source paths, branch names, PR URLs, test output, error logs, research findings, product decisions, customer context, security findings, and operational handoff details. Session-catchup workflows inspect local agent session stores such as Claude Code project history, Codex sessions, or other host-specific stores to recover context after `/clear` or compaction. Hook-injected planning content is sent to the active model provider as part of the agent context. Keep secrets, access tokens, private incident data, customer records, and unreleased roadmap details out of planning files unless the provider and retention policy are approved. Attestation stores hashes of plan content for tamper detection, but it does not encrypt the planning files or make their content safe to publish. | ✓ Installed plugins can expose project files, source code, issues, pull requests, logs, architecture notes, prompts, tool outputs, credentials accidentally present in context, cloud resource names, deployment details, and incident data to the configured agent runtime. MCP, memory, browser, cloud, and external plugin integrations may send prompts, file snippets, traces, or task context to additional local or remote services depending on configuration. Generated Codex, Cursor, OpenCode, Gemini, and Copilot artifacts may persist agent instructions, skills, commands, and project assumptions on disk. Keep secrets, customer data, regulated records, private infrastructure details, and unreleased business or incident material out of public plugin configs, examples, issues, PRs, screenshots, and generated docs. | ✓ Using the pack with an AI agent can expose repository code, product requirements, architecture notes, tests, CI logs, deployment settings, incidents, security findings, and launch plans to the configured model provider. Do not paste secrets, customer data, private incident records, production credentials, unpublished roadmap details, or proprietary compliance material into public prompts, issues, screenshots, or PR bodies. Agent personas and review workflows may ask for browser traces, performance data, logs, build output, dependency lists, and environment details; redact tokens and private URLs before sharing artifacts. |
| Prerequisites | | | | |
| Install | | | | |
| Config | — | | — | — |
| Citations | ||||
| Claim | Unclaimed | Unclaimed | Unclaimed | Unclaimed |