Overview
The Metadata Agent is responsible for keeping your data catalog current, understandable, and well-documented. It connects to OpenMetadata (OMD) via MCP to read and enrich metadata across your entire data estate: generating business-friendly descriptions, managing tags, understanding schema definitions, and tracing data lineage.
What You Can Ask
- “Describe the customers table” — Generates a business-focused description based on schema and sample data
- “What columns are in the orders dataset?” — Returns full schema with data types and existing documentation
- “Add a description to the revenue column” — Updates column-level metadata in OpenMetadata
- “Tag the email column as PII” — Applies sensitivity classifications to columns
- “Show me the lineage for the sales_summary table” — Traces upstream sources and downstream dependencies
- “What tables are related to inventory?” — Semantic search across your catalog to find relevant assets
How It Works
OpenMetadata Integration
The Metadata Agent connects to OpenMetadata through the Model Context Protocol (MCP), providing structured, validated access to your full data catalog.
Schema Exploration
Browse databases, schemas, and tables. View column names, data types, constraints, and existing documentation — all through the OpenMetadata catalog.
Description Generation
The agent uses an LLM to analyze schema structure, column patterns, and sample data, producing business-focused descriptions that explain what the data means, not just what it contains.
Metadata Enrichment
Update table and column descriptions, apply PII sensitivity tags, and add documentation — all written back to OpenMetadata via JSON Patch operations.
Lineage Tracking
Trace data relationships and dependencies across your estate — see which tables feed into which, understand upstream sources, and follow transformations through the pipeline.
Description Generation
When you ask the Metadata Agent to describe a data asset, it goes beyond reading existing documentation. It uses an LLM to generate business-focused descriptions by:
- Retrieving metadata: fetches the full schema from OpenMetadata (columns, types, constraints)
- Assessing context: determines whether the schema alone provides enough context, or if sample data is needed
- Fetching sample data (when needed): queries your Redshift warehouse for a representative sample to understand actual data patterns
- Generating descriptions: produces 2-3 sentence descriptions focused on the distinctive business characteristics of the data (what it represents, how it’s used, and what makes it unique)
- Saving to catalog: writes the generated description back to OpenMetadata and Neo4j so it’s available across the platform
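
The five steps above can be sketched as a small pipeline. This is a minimal illustration, not the agent's actual implementation: the `Column` type, the `describe_table` function, the injected callables, and especially the `needs_sample_data` heuristic (sample when more than half the columns are undocumented) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Column:
    name: str
    data_type: str
    description: str = ""


def needs_sample_data(columns: list[Column]) -> bool:
    """Hypothetical heuristic: sample when most columns lack documentation."""
    undocumented = sum(1 for c in columns if not c.description.strip())
    return undocumented > len(columns) / 2


def describe_table(
    table: str,
    fetch_schema: Callable[[str], list[Column]],
    fetch_sample: Callable[[str], list[dict]],
    generate: Callable[[list[Column], Optional[list[dict]]], str],
    save: Callable[[str, str], None],
) -> str:
    columns = fetch_schema(table)            # 1. retrieve metadata
    sample = None
    if needs_sample_data(columns):           # 2. assess context
        sample = fetch_sample(table)         # 3. fetch sample data (only when needed)
    description = generate(columns, sample)  # 4. generate the description
    save(table, description)                 # 5. save to the catalog
    return description
```

Passing the catalog, warehouse, and LLM calls in as functions keeps the sketch independent of any particular client library and makes each step easy to stub out in tests.
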
Schema Operations
The Metadata Agent can explore your full OpenMetadata catalog hierarchy:

| Level | Operations |
|---|---|
| Database | List databases, view database details, browse schemas within a database |
| Schema | List schemas, view schema details, browse tables within a schema |
| Table | List tables, get full table metadata, view column definitions and types |
| Column | View data types, constraints, existing descriptions, sensitivity tags |
| Lineage | Trace upstream sources and downstream consumers for any table |
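
To make the lineage row concrete, the sketch below treats lineage as a directed graph of (source, target) edges and walks it breadth-first in either direction. The edge list and every table name other than `sales_summary` are invented for illustration; the real lineage comes from the OpenMetadata catalog, not from a hard-coded list.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (source table, target table).
EDGES = [
    ("raw_orders", "orders_clean"),
    ("orders_clean", "sales_summary"),
    ("raw_customers", "sales_summary"),
    ("sales_summary", "revenue_dashboard"),
]


def build_graph(edges):
    """Index edges in both directions for upstream and downstream lookups."""
    upstream, downstream = defaultdict(set), defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return upstream, downstream


def reachable(start, neighbors):
    """Breadth-first walk returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


upstream, downstream = build_graph(EDGES)
print(sorted(reachable("sales_summary", upstream)))
print(sorted(reachable("sales_summary", downstream)))
```
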
Metadata Enrichment
The agent can update metadata directly in OpenMetadata using structured patch operations:
Table Descriptions
Add or update table-level descriptions that explain the business purpose and context of each data asset.
Column Descriptions
Document individual columns with business-friendly explanations — what each field represents and how it should be interpreted.
PII Tags
Apply sensitivity classifications to columns containing personal information — emails, SSNs, phone numbers — using OpenMetadata’s PII tagging system.
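
As an illustration of what such patch operations can look like, the sketch below builds a JSON Patch document (RFC 6902) for a table description, a column description, and a PII tag. The paths, the `tagFQN` value shape, the `PII.Sensitive` tag name, and the sample column list are assumptions about the entity layout, not a confirmed contract; the helper names are hypothetical. Columns are addressed by array position, which is how JSON Pointer paths refer to list elements.

```python
import json


def column_index(columns: list[dict], name: str) -> int:
    """Find a column's array position, since JSON Pointer paths use indexes."""
    for i, col in enumerate(columns):
        if col["name"] == name:
            return i
    raise KeyError(name)


def set_description(path: str, text: str) -> dict:
    # Per RFC 6902, "add" on an existing object member replaces its value,
    # so the same operation covers both adding and updating a description.
    return {"op": "add", "path": path, "value": text}


def add_tag(col_idx: int, tag_fqn: str) -> dict:
    # "-" appends to the end of an array; this assumes the tags array exists.
    return {
        "op": "add",
        "path": f"/columns/{col_idx}/tags/-",
        "value": {"tagFQN": tag_fqn},  # assumed tag label shape
    }


# Hypothetical table layout used only to resolve column positions.
columns = [{"name": "customer_id"}, {"name": "email"}, {"name": "revenue"}]

patch = [
    set_description("/description", "One row per customer account."),
    set_description(
        f"/columns/{column_index(columns, 'revenue')}/description",
        "Lifetime revenue attributed to the customer.",
    ),
    add_tag(column_index(columns, "email"), "PII.Sensitive"),
]
print(json.dumps(patch, indent=2))
```

A document like this is conventionally sent with the `application/json-patch+json` media type defined by RFC 6902.
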
Search & Discovery
Semantic search across your entire catalog using vector embeddings — find tables by what they contain, not just what they’re named.
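
The idea behind embedding-based search can be shown with a toy example: each asset is represented by a vector, and assets are ranked by cosine similarity to the query vector. The three-dimensional vectors and table names below are made up for illustration; a real system would obtain embeddings from a model and would likely use a vector index rather than a linear scan.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec: list[float], catalog: dict[str, list[float]], top_k: int = 3):
    """Return the top_k (asset name, similarity) pairs, best match first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in catalog.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical embeddings; a table can rank highly for "inventory"
# even though the word never appears in its name.
CATALOG = {
    "stock_levels": [0.9, 0.1, 0.0],
    "warehouse_shipments": [0.7, 0.3, 0.1],
    "customer_emails": [0.0, 0.2, 0.9],
}
inventory_query = [1.0, 0.2, 0.0]

for name, score in search(inventory_query, CATALOG):
    print(f"{name}: {score:.2f}")
```
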
Data Asset Discovery
Finding the right data asset is the first step in any metadata operation. The Metadata Agent uses confidence-based discovery to surface the most relevant assets:

| Confidence | Threshold | Behavior |
|---|---|---|
| Strong Match | > 60% similarity | Proceeds automatically with the best match |
| Possible Match | 40–60% similarity | Presents options and asks you to confirm |
| Uncertain | < 40% similarity | Asks you to clarify or refine your request |
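
The thresholds in the table translate directly into a small decision function. This is a sketch of the routing logic only; the function name and return labels are hypothetical. Following the table, a score of exactly 60% falls into the "Possible Match" band and exactly 40% does too.

```python
def discovery_action(similarity: float) -> str:
    """Map a similarity score in [0, 1] to the behavior in the table above."""
    if similarity > 0.60:
        return "proceed"   # Strong Match: use the best match automatically
    if similarity >= 0.40:
        return "confirm"   # Possible Match: present options and ask to confirm
    return "clarify"       # Uncertain: ask the user to refine the request


for score in (0.75, 0.60, 0.40, 0.39):
    print(score, discovery_action(score))
```
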
Dual Catalog Architecture
The Metadata Agent works across two complementary systems:
- OpenMetadata stores the detailed catalog metadata: schemas, column definitions, descriptions, PII tags, lineage, and quality metrics
- Neo4j stores the relationship graph: how data assets connect to workspaces, organizations, and each other, enabling GraphRAG-powered discovery
The Metadata Agent works alongside the Governance Agent for policy compliance and the Quality Agent for data quality checks. Together they keep your data estate documented, governed, and healthy.

