Metadata Agent - Brighthive

Overview

The Metadata Agent is responsible for keeping your data catalog current, understandable, and well-documented. It connects to OpenMetadata (OMD) via MCP to read and enrich metadata across your entire data estate — generating business-friendly descriptions, managing tags, understanding schema definitions, and tracing data lineage.

What You Can Ask

“Describe the customers table” — Generates a business-focused description based on schema and sample data
“What columns are in the orders dataset?” — Returns full schema with data types and existing documentation
“Add a description to the revenue column” — Updates column-level metadata in OpenMetadata
“Tag the email column as PII” — Applies sensitivity classifications to columns
“Show me the lineage for the sales_summary table” — Traces upstream sources and downstream dependencies
“What tables are related to inventory?” — Semantic search across your catalog to find relevant assets

How It Works

OpenMetadata Integration

The Metadata Agent connects to OpenMetadata through the Model Context Protocol (MCP), providing structured, validated access to your full data catalog.

Schema Exploration

Browse databases, schemas, and tables. View column names, data types, constraints, and existing documentation — all through the OpenMetadata catalog.

Description Generation

An LLM analyzes schema structure, column patterns, and sample data to produce business-focused descriptions that explain what the data means, not just what it contains.

Metadata Enrichment

Update table and column descriptions, apply PII sensitivity tags, and add documentation — all written back to OpenMetadata via JSON Patch operations.

Lineage Tracking

Trace data relationships and dependencies across your estate — see which tables feed into which, understand upstream sources, and follow transformations through the pipeline.

Description Generation

When you ask the Metadata Agent to describe a data asset, it goes beyond reading existing documentation. It uses an LLM to generate business-focused descriptions by:

1. Retrieving metadata: fetches the full schema from OpenMetadata (columns, types, constraints)
2. Assessing context: determines whether the schema alone provides enough context, or if sample data is needed
3. Fetching sample data (when needed): queries your Redshift warehouse for a representative sample to understand actual data patterns
4. Generating descriptions: produces 2-3 sentence descriptions focused on the distinctive business characteristics of the data (what it represents, how it’s used, and what makes it unique)
5. Saving to catalog: writes the generated description back to OpenMetadata and Neo4j so it’s available across the platform

Descriptions are written for business users — they explain what the data means in context, not how it was collected or stored.
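
The flow above can be sketched as a small orchestration function. This is an illustrative sketch only, not the agent's actual implementation: `describe_table`, `needs_sample_data`, the 50% threshold, and the injected callables are all hypothetical names and choices made for the example. The real agent talks to OpenMetadata over MCP and to Redshift, which are stubbed out here.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Column:
    name: str
    data_type: str
    description: str = ""


@dataclass
class TableMetadata:
    name: str
    columns: list[Column] = field(default_factory=list)


def needs_sample_data(table: TableMetadata) -> bool:
    """Assumed heuristic: request sample rows when most columns are undocumented."""
    if not table.columns:
        return True
    undocumented = sum(1 for c in table.columns if not c.description.strip())
    return undocumented / len(table.columns) > 0.5


def describe_table(
    table_name: str,
    fetch_metadata: Callable[[str], TableMetadata],
    fetch_sample: Callable[[str], list[dict]],
    generate: Callable[[TableMetadata, Optional[list[dict]]], str],
    save: Callable[[str, str], None],
) -> str:
    table = fetch_metadata(table_name)                 # step 1: retrieve metadata
    use_sample = needs_sample_data(table)              # step 2: assess context
    sample = fetch_sample(table_name) if use_sample else None  # step 3: sample if needed
    description = generate(table, sample)              # step 4: generate description
    save(table_name, description)                      # step 5: save to catalog
    return description


# Usage with in-memory stand-ins for the catalog, warehouse, and LLM.
saved: dict[str, str] = {}
customers = TableMetadata("customers", [Column("id", "INT"), Column("email", "VARCHAR")])
result = describe_table(
    "customers",
    fetch_metadata=lambda name: customers,
    fetch_sample=lambda name: [{"id": 1, "email": "a@example.com"}],
    generate=lambda t, s: f"{t.name}: {len(t.columns)} columns, sample used: {s is not None}",
    save=saved.__setitem__,
)
```

Passing the data sources in as callables keeps the sketch testable without a live catalog or warehouse. The sample query is skipped entirely when the schema is already well documented.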

Schema Operations

The Metadata Agent can explore your full OpenMetadata catalog hierarchy:

Level	Operations
Database	List databases, view database details, browse schemas within a database
Schema	List schemas, view schema details, browse tables within a schema
Table	List tables, get full table metadata, view column definitions and types
Column	View data types, constraints, existing descriptions, sensitivity tags
Lineage	Trace upstream sources and downstream consumers for any table
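
To make the lineage row concrete, here is a minimal sketch of upstream and downstream tracing as a breadth-first walk over a directed graph. The edge list and every table name other than `sales_summary` are invented for illustration. In practice the edges would come from the OpenMetadata lineage data rather than a hard-coded list.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Index (source, target) edges in both directions."""
    upstream = defaultdict(set)
    downstream = defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return upstream, downstream


def trace(start, neighbors):
    """Return every node reachable from `start` by following `neighbors`."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical lineage edges: each pair reads "source feeds target".
EDGES = [
    ("raw_orders", "orders"),
    ("raw_customers", "customers"),
    ("orders", "sales_summary"),
    ("customers", "sales_summary"),
    ("sales_summary", "revenue_dashboard"),
]

upstream, downstream = build_graph(EDGES)
sources = trace("sales_summary", upstream)
consumers = trace("sales_summary", downstream)
```

Here `sources` holds the four tables that feed `sales_summary` directly or indirectly, and `consumers` holds only `revenue_dashboard`.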

Metadata Enrichment

The agent can update metadata directly in OpenMetadata using structured patch operations:

Table Descriptions

Add or update table-level descriptions that explain the business purpose and context of each data asset.

Column Descriptions

Document individual columns with business-friendly explanations — what each field represents and how it should be interpreted.

PII Tags

Apply sensitivity classifications to columns containing personal information — emails, SSNs, phone numbers — using OpenMetadata’s PII tagging system.
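
As a rough illustration of what these updates look like as JSON Patch (RFC 6902) documents, the sketch below builds a patch that sets a table description, sets a column description, and appends a PII tag. The paths, the column index, the tag name `PII.Sensitive`, and the helper functions are assumptions made for the example. The exact payload the agent sends is not specified here, and no request is actually issued.

```python
import json


def set_description(path_prefix: str, text: str) -> dict:
    # In RFC 6902, "add" on an object member creates it or replaces its value.
    return {"op": "add", "path": f"{path_prefix}/description", "value": text}


def add_tag(column_index: int, tag_fqn: str) -> dict:
    # A trailing "-" in a JSON Pointer path appends to the end of an array.
    return {
        "op": "add",
        "path": f"/columns/{column_index}/tags/-",
        "value": {"tagFQN": tag_fqn},
    }


# Assumed layout: the email column sits at index 2 of the table's column list.
patch = [
    set_description("", "One row per registered customer."),
    set_description("/columns/2", "Primary contact email for the customer."),
    add_tag(2, "PII.Sensitive"),
]

body = json.dumps(patch, indent=2)
print(body)
```

Each operation targets a single field, so a patch touches only what it names and leaves the rest of the entity unchanged.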

Search & Discovery

Run semantic search across your entire catalog using vector embeddings to find tables by what they contain, not just what they’re named.

Data Asset Discovery

Finding the right data asset is the first step in any metadata operation. The Metadata Agent uses confidence-based discovery to surface the most relevant assets:

Confidence	Threshold	Behavior
Strong Match	> 60% similarity	Proceeds automatically with the best match
Possible Match	40–60% similarity	Presents options and asks you to confirm
Uncertain	< 40% similarity	Asks you to clarify or refine your request

Discovery searches across both vector embeddings (semantic meaning) and OpenMetadata catalog (structured metadata) to find assets that match your intent — even if you don’t know the exact table name.
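
The threshold table can be read as a simple decision rule. The sketch below is one possible encoding, with similarity expressed as a fraction between 0 and 1. The function names, the action labels, and the choice to offer every non-uncertain candidate for confirmation are assumptions, not documented behavior.

```python
from enum import Enum


class Confidence(Enum):
    STRONG = "strong_match"
    POSSIBLE = "possible_match"
    UNCERTAIN = "uncertain"


def classify(similarity: float) -> Confidence:
    """Map a similarity score in [0, 1] to the bands in the table above."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    if similarity > 0.60:
        return Confidence.STRONG
    if similarity >= 0.40:
        return Confidence.POSSIBLE
    return Confidence.UNCERTAIN


def next_action(candidates):
    """Decide what to do given (asset_name, similarity) pairs."""
    if not candidates:
        return ("clarify", [])
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_name, best_score = ranked[0]
    level = classify(best_score)
    if level is Confidence.STRONG:
        return ("proceed", [best_name])          # proceed with the best match
    if level is Confidence.POSSIBLE:
        options = [n for n, s in ranked if classify(s) is not Confidence.UNCERTAIN]
        return ("confirm", options)              # ask the user to pick
    return ("clarify", [])                       # ask the user to refine


action = next_action([("orders", 0.55), ("order_items", 0.45), ("inventory", 0.20)])
```

In this example the best score is 0.55, so `action` asks the user to confirm between `orders` and `order_items`. The `inventory` candidate is dropped because it falls below 40%.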

Dual Catalog Architecture

The Metadata Agent works across two complementary systems:

OpenMetadata stores the detailed catalog metadata — schemas, column definitions, descriptions, PII tags, lineage, and quality metrics
Neo4j stores the relationship graph — how data assets connect to workspaces, organizations, and each other, enabling GraphRAG-powered discovery

When the agent updates a description, it writes to both systems — keeping the catalog and the knowledge graph in sync.
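
A minimal sketch of the dual write, using in-memory stand-ins for both systems. The `DescriptionStore` protocol, the `InMemoryStore` class, the asset identifier, and the per-store success report are hypothetical. How the agent actually handles a failure in one of the two systems is not described here.

```python
from __future__ import annotations

from typing import Protocol


class DescriptionStore(Protocol):
    name: str

    def write_description(self, asset_id: str, text: str) -> None: ...


class InMemoryStore:
    """Stand-in for a real OpenMetadata or Neo4j client."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.data: dict[str, str] = {}

    def write_description(self, asset_id: str, text: str) -> None:
        self.data[asset_id] = text


def sync_description(asset_id: str, text: str, stores: list[DescriptionStore]) -> dict[str, bool]:
    """Write the same description to every store and report which writes succeeded."""
    results: dict[str, bool] = {}
    for store in stores:
        try:
            store.write_description(asset_id, text)
            results[store.name] = True
        except Exception:
            # Assumed policy: record the failure so the caller can retry or alert.
            results[store.name] = False
    return results


catalog = InMemoryStore("openmetadata")
graph = InMemoryStore("neo4j")
status = sync_description("sales_summary", "Daily sales totals by region.", [catalog, graph])
```

Returning a per-store status makes it visible when only one of the two writes lands, which is the case where the catalog and the graph would drift apart.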

The Metadata Agent works alongside the Governance Agent for policy compliance and the Quality Agent for data quality checks. Together they keep your data estate documented, governed, and healthy.
