ARIS Auto-Research-In-Sleep
ARIS is a Markdown-only skill workflow pack for autonomous ML research agents, with idea discovery, experiment planning, auto-review loops, paper writing, rebuttal, resubmission, slides, posters, Research Wiki, and cross-model reviewer workflows for Claude Code, Codex, OpenClaw, Cursor, and other agent hosts.
Open the source and read safety notes before installing.
Safety notes
- ARIS skills can guide agents through code changes, experiment planning, experiment execution, paper drafting, rebuttal drafting, and cross-model review loops; treat those workflows as high-impact research automation rather than passive documentation.
- The `research-pipeline` skill supports auto-proceed modes and reviewer loops. Keep expensive runs, repository mutations, cloud/GPU jobs, and paper-submission decisions behind explicit human approval.
- Cross-model review through Codex MCP, Claude-review, Gemini-review, or similar reviewer adapters is a quality-control signal, not scientific proof or peer review.
- Generated claims, citations, tables, plots, ablations, rebuttals, and paper text need source checks, experiment audits, citation audits, and human scientific review before being relied on or submitted.
- Review all copied skills, scripts, MCP server configuration, and reviewer routing before installing them into a sensitive repository or giving them shell, file, web, cloud, or GPU access.
Privacy notes
- Research automation can expose unpublished hypotheses, paper drafts, peer-review text, datasets, logs, source code, experiment traces, model outputs, reviewer comments, account names, and GPU or cloud configuration to the selected model providers and MCP tools.
- Cross-model review loops may send the same research artifact to multiple providers or local/remote reviewer services depending on configuration.
- Research Wiki, traces, generated reports, paper artifacts, and run logs can persist confidential results or private review material on disk.
- Do not share confidential reviews, unreleased findings, private datasets, credentials, proprietary code, or submission-sensitive artifacts with external services unless the research and account policies allow it.
Prerequisites
- A research project, ML paper idea, baseline repository, dataset, review packet, or experiment plan that is appropriate for agent-assisted research automation.
- A compatible agent host that can consume Markdown skills, such as Claude Code, Codex, Cursor, OpenClaw, Antigravity, Trae, GitHub Copilot CLI, or a manual prompt workflow.
- Model-provider credentials, MCP reviewer configuration, or local model routing only when using cross-model review or external reviewer loops.
- Compute budget, GPU quota, experiment sandboxing, version control, and artifact directories before allowing autonomous experiment execution.
- Human review checkpoints for expensive runs, paper claims, rebuttal text, submission decisions, and any use of unpublished or confidential research material.
Schema details
- Install type: package
- Reading time: 9 min
- Difficulty score: 90
- Troubleshooting: Yes
- Breaking changes: No
- Scope: Source repo
- Skill type: capability-pack
- Skill level: expert
- Verification: validated
- Verified at: 2026-06-18
| Platform | Support | Install path |
|---|---|---|
| claude-code | Native | .claude/skills/<skill-name>/SKILL.md |
| codex | Native | .agents/skills/<skill-name>/SKILL.md |
| windsurf | Native | .windsurf/skills/<skill-name>/SKILL.md |
| gemini | Native | .gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md |
| cursor | Adapter | .cursor/rules/<skill-name>.mdc |
| cli | Manual | AGENTS.md or tool-specific context file |
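The table above can be turned into a small lookup when scripting installs. The following is a minimal sketch, not part of ARIS: the `INSTALL_PATHS` name and the `install_path` helper are hypothetical, and for Gemini it assumes the first of the two listed paths.

```python
# Install-path templates copied from the platform table above.
# Gemini lists two paths in the table; this sketch assumes the first one.
INSTALL_PATHS = {
    "claude-code": ".claude/skills/{name}/SKILL.md",
    "codex": ".agents/skills/{name}/SKILL.md",
    "windsurf": ".windsurf/skills/{name}/SKILL.md",
    "gemini": ".gemini/skills/{name}/SKILL.md",
    "cursor": ".cursor/rules/{name}.mdc",
}


def install_path(platform: str, skill_name: str) -> str:
    """Return the relative install path for a skill on a given platform."""
    try:
        template = INSTALL_PATHS[platform]
    except KeyError:
        # "cli" and unknown hosts have no per-skill path in the table.
        raise ValueError(f"no per-skill install path known for {platform!r}") from None
    return template.format(name=skill_name)


print(install_path("claude-code", "research-pipeline"))
# prints: .claude/skills/research-pipeline/SKILL.md
```

The `cli` row is left out on purpose: the table points it at `AGENTS.md` or a tool-specific context file rather than a per-skill path.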
Full copyable content
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep.git
# Copy selected skills into your agent skill directory.
# Codex users can review the skills/skills-codex mirror first.
About this resource
ARIS Auto-Research-In-Sleep
wanshuiyin/Auto-claude-code-research-in-sleep publishes ARIS, a
Markdown-first research-agent workflow system. The repository describes ARIS as
a methodology rather than a platform: its core behavior is stored in
SKILL.md files that can be copied into Claude Code, Codex, Cursor, OpenClaw,
Antigravity, GitHub Copilot CLI, Trae, or other Markdown-compatible agent
workflows.
Use this listing for the ARIS skill workflow pack itself. Use narrower entries when reviewing a single ARIS skill, a separate MCP reviewer server, or a platform-specific adaptation.
Knowledge Freshness
The repository is active, with current release and GitHub metadata verified on
2026-06-18. The README and AGENT_GUIDE.md document Claude Code usage, Codex
skill mirrors, cross-model reviewer routes, OpenClaw adaptation notes, and a
large catalog of research skills.
Research tooling, model behavior, MCP reviewer setup, OpenClaw support, GPU environments, paper venues, citation sources, and ML baselines can change quickly. Treat ARIS as a structured research automation harness, then verify commands, dependencies, venue rules, citations, datasets, reviewer configuration, and experiment evidence in the target project before relying on generated results.
Retrieval Sources
This listing is grounded in:
- The upstream repository README.
- The upstream `AGENT_GUIDE.md`.
- Representative ARIS `SKILL.md` files for research pipeline, idea discovery, experiment bridge, auto-review loop, paper writing, and rebuttal workflows.
- The shared assurance contract reference.
- The OpenClaw adaptation guide.
- The MIT license file.
- Current GitHub repository metadata.
Core Workflow
Clone the repository:
git clone https://github.com/wanshuiyin/Auto-claude-code-research-in-sleep.git
Review the relevant skill files, then copy only the skills you want into your
agent host. Codex users should review the skills/skills-codex/ mirror before
copying, while OpenClaw users should read the OpenClaw adaptation guide and map
skills to file-first task phases.
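The "copy only the skills you want" step can be scripted. This is a minimal sketch under stated assumptions: the `copy_skills` helper is hypothetical, it assumes the cloned repository keeps each skill at `skills/<name>/SKILL.md`, and the target directory is whatever your agent host expects.

```python
import shutil
from pathlib import Path


def copy_skills(repo_dir, target_dir, names):
    """Copy the named skill folders from repo_dir/skills into target_dir.

    Returns the names that were copied. Raises FileNotFoundError if a
    requested skill has no SKILL.md, so a typo fails loudly.
    """
    repo_skills = Path(repo_dir) / "skills"
    target = Path(target_dir)
    copied = []
    for name in names:
        src = repo_skills / name
        if not (src / "SKILL.md").is_file():
            raise FileNotFoundError(
                f"no SKILL.md found for skill {name!r} in {repo_skills}"
            )
        # dirs_exist_ok=True lets a re-run refresh an earlier copy.
        shutil.copytree(src, target / name, dirs_exist_ok=True)
        copied.append(name)
    return copied


# Example call (paths are assumptions; review each skill before copying):
# copy_skills("Auto-claude-code-research-in-sleep", ".claude/skills",
#             ["research-pipeline", "rebuttal"])
```

Passing an explicit list of names, rather than copying the whole `skills/` tree, keeps the installed set limited to the skills that were actually reviewed.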
Common ARIS entry points include:
/research-pipeline "factorized gap in discrete diffusion LMs"
/idea-discovery "improve method X"
/experiment-bridge "turn this idea into an experiment plan"
/auto-review-loop "review paper/ and experiments/"
/paper-writing "results/ + notes/"
/rebuttal "paper/ + reviews" -- venue: ICML
Capability Scope
The upstream AGENT_GUIDE.md organizes ARIS around research lifecycle
workflows:
| Workflow | Scope |
|---|---|
| `/idea-discovery` | Survey a research direction, find gaps, and propose candidate ideas |
| `/experiment-bridge` | Turn a selected idea into experiment plans, runbooks, and implementation steps |
| `/auto-review-loop` | Run adversarial or cross-model reviews against code, experiments, and papers |
| `/paper-writing` | Convert evidence, results, and claims into paper drafts |
| `/rebuttal` | Draft response material from paper files and reviewer comments |
| `/resubmit-pipeline` | Prepare revision and resubmission workflows |
| `/paper-talk` | Produce talk-oriented paper presentation material |
| `/paper-slides` | Generate slide artifacts from a paper |
| `/paper-poster-html` | Generate poster-oriented HTML artifacts |
| `/research-wiki` | Maintain a persistent research knowledge record |
The guide also documents assurance and audit skills such as
/experiment-audit, /result-to-claim, /paper-claim-audit,
/citation-audit, and /kill-argument. Those checks are important because
research-agent output can look polished while still being weak, stale, or
unsupported.
Platform Fit
ARIS is built around portable Markdown skills rather than a single hosted runtime. The repository documents several usage patterns:
- Claude Code skills under `skills/<name>/SKILL.md`.
- Codex-specific mirrors under `skills/skills-codex/`.
- Codex plus Claude-review or Gemini-review overlays under the corresponding reviewer skill folders.
- Cursor, Trae, Antigravity, GitHub Copilot CLI, and manual Markdown skill usage.
- OpenClaw adaptation through a file-first phase map that writes outputs such as `lit_scan.md`, `idea_report.md`, `experiment_plan.md`, `runbook.md`, and `review_loop.md`.
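Because the OpenClaw pattern is file-first, progress can be checked by looking at which output files exist. The sketch below is illustrative only: the `missing_artifacts` helper is hypothetical, and it assumes all five files named above land in a single run directory, which the listing does not confirm.

```python
from pathlib import Path

# Output file names listed in the OpenClaw usage pattern above.
PHASE_ARTIFACTS = [
    "lit_scan.md",
    "idea_report.md",
    "experiment_plan.md",
    "runbook.md",
    "review_loop.md",
]


def missing_artifacts(run_dir):
    """Return the expected artifacts not yet present in run_dir, in list order."""
    root = Path(run_dir)
    return [name for name in PHASE_ARTIFACTS if not (root / name).is_file()]
```

An empty return value only means the files exist. It says nothing about whether their contents are sound, so the audit skills and human review still apply.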
This makes ARIS useful for teams comparing "Claude Code research pipeline", "Codex research skills", "OpenClaw research workflow", and "cross-model review loop" approaches without committing to one agent runtime immediately.
Production Rules
- Start with a narrow research objective and define success criteria before invoking an autonomous pipeline.
- Keep repository changes, shell execution, GPU/cloud jobs, and long-running loops behind checkpoints until the workflow is trusted.
- Use the assurance and audit skills before turning results into paper claims.
- Verify citations, datasets, baselines, venue rules, and license constraints against primary sources.
- Keep unpublished reviews, internal results, private datasets, and proprietary code out of prompts sent to providers that are not approved for that data.
- Treat cross-model reviewer output as adversarial feedback, not a replacement for human ML review.
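The checkpoint rule above can be pictured as a small approval gate wrapped around risky steps. This is a sketch under assumptions and not ARIS code: the category labels, `needs_approval`, and `run_step` are all hypothetical, and the `research-pipeline` skill's own auto-proceed and human-checkpoint parameters are not modeled here.

```python
# Action categories the rules above say to keep behind checkpoints.
# The category labels themselves are hypothetical.
GATED_ACTIONS = {"repo_change", "shell", "gpu_job", "cloud_job", "long_loop"}


def needs_approval(action, trusted=frozenset()):
    """True if the action category is gated and not yet marked trusted."""
    return action in GATED_ACTIONS and action not in trusted


def run_step(action, fn, approve, trusted=frozenset()):
    """Run fn() only if the action is ungated or approve(action) returns True.

    Returns a (ran, result) tuple; result is None when the step was skipped.
    """
    if needs_approval(action, trusted) and not approve(action):
        return (False, None)
    return (True, fn())


# A denied GPU job is skipped; an ungated step runs without asking.
print(run_step("gpu_job", lambda: "trained", approve=lambda a: False))
# prints: (False, None)
print(run_step("read_notes", lambda: "ok", approve=lambda a: False))
# prints: (True, 'ok')
```

In practice `approve` would prompt a human. The `trusted` set mirrors the "until the workflow is trusted" clause by letting a category bypass the prompt once it has earned that status.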
Source Review
- The repository README describes ARIS as "lightweight Markdown-only skills" for autonomous ML research, cross-model review loops, idea discovery, and experiment automation.
- The README states that ARIS works with Claude Code, Codex, Cursor, Trae, Antigravity, GitHub Copilot CLI, OpenClaw, and standalone/manual workflows.
- `AGENT_GUIDE.md` describes ARIS as a research harness whose source of truth is `skills/<name>/SKILL.md` plus shared reference contracts.
- The guide documents 79 skills and maps common workflows across idea discovery, experiment bridging, review loops, paper writing, rebuttal, resubmission, talks, slides, posters, audits, and Research Wiki usage.
- `skills/research-pipeline/SKILL.md` chains idea discovery, experiment bridge, auto-review loop, and paper writing, with parameters for auto-proceed, human checkpoints, code review, base repositories, venues, HTML rendering, and resumable execution.
- `docs/OPENCLAW_ADAPTATION.md` maps ARIS workflows into OpenClaw-friendly, file-first phases with explicit output artifacts.
- The repository license is MIT.
Duplicate Check
No existing HeyClaude content entry or open PR was found for ARIS,
Auto-Research-In-Sleep, or wanshuiyin/Auto-claude-code-research-in-sleep at
the time this listing was created.
Disclosure
This is a community project listing based on public source material. It is not an endorsement that autonomous research results, generated experiments, paper claims, or rebuttal drafts are correct, novel, reproducible, or venue-ready without independent review.