DataHub MCP Server for Claude
Connect Claude to DataHub — search assets, trace data lineage, explore schemas, retrieve real query history, and enrich metadata — with the official DataHub MCP server from Acryl Data, supporting the full DataHub data catalog platform.
Open the source and read safety notes before installing.
Safety notes
- Metadata mutation tools (add/remove tags, terms, owners) are disabled by default — set `TOOLS_IS_MUTATION_ENABLED=true` to enable write operations.
- Mutations affect production data catalog metadata; confirm before enabling in shared environments.
Privacy notes
- Dataset schemas, table previews, lineage graphs, real SQL query history, and data owner information from your DataHub instance are surfaced in Claude's context.
- Your `DATAHUB_GMS_TOKEN` grants catalog access — treat it as a secret.
Prerequisites
- A running DataHub instance (self-hosted or DataHub Cloud) with GMS endpoint accessible.
- A DataHub personal access token: Settings → Access Tokens → + Generate Access Token.
- Python with `uv` or `uvx` available.
- An MCP client such as Claude Code or Claude Desktop.
Schema details
- Install type: cli
- Troubleshooting: No
- Scope: not specified
- Source repo: not specified
- Estimated setup: 10 minutes
- Difficulty: intermediate
- Website: https://datahubproject.io
Full copyable content
{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "https://your-datahub.example.com/api/gms",
        "DATAHUB_GMS_TOKEN": "<your-token>"
      }
    }
  }
}
About this resource
Overview
The DataHub MCP Server is the official Model Context Protocol server from Acryl Data for
DataHub — the open source data catalog and metadata platform.
It gives Claude structured access to your DataHub instance: search assets, trace lineage,
explore schemas, retrieve real SQL queries run against datasets, and enrich metadata through
optional mutation tools. Distributed as mcp-server-datahub on PyPI. Apache-2.0 licensed.
Key capabilities
- Asset search — keyword, boolean, and structured search with filters (type, tag, owner, domain).
- Lineage tracing — upstream and downstream data lineage across the full pipeline.
- Lineage path finding — compute lineage paths between two specific assets.
- Query history — retrieve real SQL queries that have been run against a dataset.
- Schema exploration — list and inspect dataset schema fields and column types.
- Batch metadata retrieval — get metadata for multiple assets in one call.
- Document search — keyword and regex search over DataHub knowledge documents.
- Metadata mutation (opt-in) — add/remove tags, glossary terms, owners; opt in via env var.
Tools (key selection)
| Tool | Purpose |
|---|---|
| `search` | Keyword/boolean search with filters across all asset types |
| `get_lineage` | Upstream/downstream lineage for any asset |
| `get_lineage_paths_between` | Compute paths between two specific assets |
| `get_dataset_queries` | Real SQL queries run against a dataset |
| `list_schema_fields` | Explore dataset schema structure |
| `get_entities` | Batch metadata retrieval for multiple assets |
| `search_documents` / `grep_documents` | Keyword and regex search over docs |
| `save_document` | Store notes/docs in DataHub knowledge base |
| `add_tags` / `remove_tags` | Tag management (mutation opt-in required) |
| `add_terms` / `remove_terms` | Glossary term management (mutation opt-in required) |
Configuration options
| Env var | Default | Purpose |
|---|---|---|
| `DATAHUB_GMS_URL` | (none) | Required. GMS endpoint URL |
| `DATAHUB_GMS_TOKEN` | (none) | Required. Personal access token |
| `TOOLS_IS_MUTATION_ENABLED` | false | Enable write/mutation tools |
| `TOOLS_IS_USER_ENABLED` | false | Enable user management tools |
| `SEMANTIC_SEARCH_ENABLED` | false | Enable vector semantic search |
| `TOOL_RESPONSE_TOKEN_LIMIT` | 80000 | Cap response token size |
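As a rough illustration of how the options above combine, the Python sketch below assembles the `env` block for an MCP client config. The helper name `build_datahub_env` and its parameters are hypothetical, and passing flags as the lowercase string `true` is an assumption based on the `TOOLS_IS_MUTATION_ENABLED=true` example rather than a documented parsing rule. It covers only a subset of the listed variables.

```python
def build_datahub_env(gms_url, token, *, mutations=False,
                      semantic_search=False, token_limit=None):
    """Return an env mapping for the datahub server entry (illustrative helper)."""
    # Both connection settings are listed as required in the table above.
    if not gms_url or not token:
        raise ValueError("DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN are both required")

    env = {"DATAHUB_GMS_URL": gms_url, "DATAHUB_GMS_TOKEN": token}

    # Optional settings are emitted only when they differ from the listed
    # defaults, so a plain call yields a read-only configuration.
    if mutations:
        env["TOOLS_IS_MUTATION_ENABLED"] = "true"  # assumed string form
    if semantic_search:
        env["SEMANTIC_SEARCH_ENABLED"] = "true"  # assumed string form
    if token_limit is not None:
        env["TOOL_RESPONSE_TOKEN_LIMIT"] = str(int(token_limit))
    return env
```

Calling `build_datahub_env(url, token)` with no keyword arguments leaves every optional variable unset, which keeps the opt-in nature of the mutation tools explicit at the call site.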
How it compares
| Server | Asset search | Lineage | Query history | Mutations | Notes |
|---|---|---|---|---|---|
| DataHub MCP | Yes | Yes | Yes | Opt-in | Full catalog; read-only by default |
| OpenMetadata MCP | Yes | Yes | Limited | Yes | Alternative open catalog |
| dbt MCP | Limited | Limited | No | No | dbt projects only |
| Atlan MCP | Yes | Yes | Limited | Limited | Commercial catalog |
Among the servers compared here, only DataHub MCP is listed with full query history support: actual SQL that ran against your data, not just schema metadata.
Installation
Claude Code
claude mcp add datahub \
  -e DATAHUB_GMS_URL=https://your-datahub.example.com/api/gms \
  -e DATAHUB_GMS_TOKEN="<your-token>" \
  -- uvx mcp-server-datahub
Claude Desktop
{
  "mcpServers": {
    "datahub": {
      "command": "uvx",
      "args": ["mcp-server-datahub"],
      "env": {
        "DATAHUB_GMS_URL": "https://your-datahub.example.com/api/gms",
        "DATAHUB_GMS_TOKEN": "<your-token>"
      }
    }
  }
}
For DataHub Cloud, the GMS URL is typically https://your-org.acryl.io/api/gms.
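If you would rather not paste the token into the JSON by hand, the Claude Desktop entry above could be generated from environment variables. The following is a minimal sketch: `render_claude_desktop_config` is a hypothetical helper, and where the resulting JSON should be saved depends on your client and operating system, so that step is left out.

```python
import json
import os


def render_claude_desktop_config(environ=os.environ):
    """Return the mcpServers JSON shown above, filled from environment values."""
    gms_url = environ.get("DATAHUB_GMS_URL")
    token = environ.get("DATAHUB_GMS_TOKEN")
    if not gms_url or not token:
        raise KeyError("set DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN first")

    # Same structure as the hand-written config: uvx launches the PyPI package.
    config = {
        "mcpServers": {
            "datahub": {
                "command": "uvx",
                "args": ["mcp-server-datahub"],
                "env": {
                    "DATAHUB_GMS_URL": gms_url,
                    "DATAHUB_GMS_TOKEN": token,
                },
            }
        }
    }
    return json.dumps(config, indent=2)
```

Running `print(render_claude_desktop_config())` in a shell where both variables are exported prints a config you can merge into your client's settings. The rendered text still contains the token, so handle the output as carefully as the token itself.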
Requirements
- DataHub instance (self-hosted via Docker Compose or DataHub Cloud).
- Personal access token.
- Python with `uvx` (from `uv`).
- An MCP client (Claude Code or Claude Desktop).
Security
- Mutation tools are off by default; only set `TOOLS_IS_MUTATION_ENABLED=true` if you need to write tags, terms, or owners.
- The token grants catalog access; protect it like any API credential.
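One way to protect the token in practice is to mask it before an MCP config is printed, logged, or shared in a bug report. The sketch below assumes the `mcpServers` layout used in this page's config examples; `redact_config` and the `***` placeholder are illustrative choices, not part of the DataHub server.

```python
import copy

# Env var names whose values should never appear in logs or screenshots.
SECRET_KEYS = {"DATAHUB_GMS_TOKEN"}


def redact_config(config):
    """Return a deep copy of an MCP config with secret env values masked."""
    redacted = copy.deepcopy(config)  # leave the caller's config untouched
    for server in redacted.get("mcpServers", {}).values():
        env = server.get("env", {})
        for key in env:
            if key in SECRET_KEYS:
                env[key] = "***"
    return redacted
```

The function works on a copy, so the live config keeps its real token while the redacted version is safe to paste elsewhere.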
Source Verification Notes
Verified on 2026-06-18:
- Official repository `acryldata/mcp-server-datahub` (Apache-2.0), on PyPI as `mcp-server-datahub` (v0.6.0), documents the `DATAHUB_GMS_URL` / `DATAHUB_GMS_TOKEN` configuration, all search/lineage/query/metadata tools, the optional mutation and semantic search env vars, and the `uvx` install.
- DataHub feature guide at `docs.datahub.com/docs/features/feature-guides/mcp` describes the integration setup.
- Claude Code MCP documentation at `code.claude.com/docs/en/mcp` describes the stdio connector pattern used above.