# Metadata Agent

> The Metadata Agent uses OpenMetadata to generate descriptions, understand schemas, enrich your data catalog, and track lineage — keeping your data assets documented and discoverable.

## Overview

The Metadata Agent is responsible for keeping your data catalog current, understandable, and well-documented. It connects to **OpenMetadata (OMD)** via the Model Context Protocol (MCP) to read and enrich metadata across your entire data estate: generating business-friendly descriptions, managing tags, understanding schema definitions, and tracing data lineage.

## What You Can Ask

* *"Describe the customers table"* — Generates a business-focused description based on schema and sample data
* *"What columns are in the orders dataset?"* — Returns full schema with data types and existing documentation
* *"Add a description to the revenue column"* — Updates column-level metadata in OpenMetadata
* *"Tag the email column as PII"* — Applies sensitivity classifications to columns
* *"Show me the lineage for the sales\_summary table"* — Traces upstream sources and downstream dependencies
* *"What tables are related to inventory?"* — Semantic search across your catalog to find relevant assets

## How It Works

```mermaid
graph TD
    A[User Request] --> B[Metadata Agent]
    B --> C{What's needed?}
    C -->|Discovery| D["Vector Search + OpenMetadata"]
    C -->|Schema Details| E["OpenMetadata: get_table"]
    C -->|Description Generation| F["LLM Analyzes Schema + Sample Data"]
    C -->|Metadata Update| G["OpenMetadata: patch_entity"]
    C -->|Lineage| H["OpenMetadata: get_entity_lineage"]
    D --> I[Results to User]
    E --> I
    F --> G
    G --> I
    H --> I
```
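
The routing in the diagram can be read as a lookup from request type to an ordered list of steps. The sketch below is only an illustration of that reading, not the agent's actual code: `get_table`, `patch_entity`, and `get_entity_lineage` come from the diagram, while the `Intent` enum and the labels `vector_search`, `catalog_search`, and `llm_describe` are hypothetical names introduced here.

```python
from enum import Enum


class Intent(Enum):
    DISCOVERY = "discovery"
    SCHEMA_DETAILS = "schema_details"
    DESCRIPTION_GENERATION = "description_generation"
    METADATA_UPDATE = "metadata_update"
    LINEAGE = "lineage"


# Ordered steps per intent, mirroring the edges in the diagram.
# "vector_search", "catalog_search", and "llm_describe" are hypothetical labels.
ROUTES = {
    Intent.DISCOVERY: ["vector_search", "catalog_search"],
    Intent.SCHEMA_DETAILS: ["get_table"],
    Intent.DESCRIPTION_GENERATION: ["llm_describe", "patch_entity"],
    Intent.METADATA_UPDATE: ["patch_entity"],
    Intent.LINEAGE: ["get_entity_lineage"],
}


def plan(intent: Intent) -> list[str]:
    """Return the ordered steps that would run for a given intent."""
    return list(ROUTES[intent])
```

Note that description generation is the only route with two steps: the generated text flows into the same patch step used for manual metadata updates.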

## OpenMetadata Integration

The Metadata Agent connects to OpenMetadata through the **Model Context Protocol (MCP)**, providing structured, validated access to your full data catalog.

<CardGroup cols={2}>
  <Card title="Schema Exploration" icon="table-columns">
    Browse databases, schemas, and tables. View column names, data types, constraints, and existing documentation — all through the OpenMetadata catalog.
  </Card>

  <Card title="Description Generation" icon="pen-to-square">
    LLM-powered description generation that analyzes schema structure, column patterns, and sample data to produce business-focused descriptions that explain what the data *means*, not just what it contains.
  </Card>

  <Card title="Metadata Enrichment" icon="tags">
    Update table and column descriptions, apply PII sensitivity tags, and add documentation — all written back to OpenMetadata via JSON Patch operations.
  </Card>

  <Card title="Lineage Tracking" icon="share-nodes">
    Trace data relationships and dependencies across your estate — see which tables feed into which, understand upstream sources, and follow transformations through the pipeline.
  </Card>
</CardGroup>

## Description Generation

When you ask the Metadata Agent to describe a data asset, it goes beyond reading existing documentation. It uses an LLM to generate business-focused descriptions by:

1. **Retrieving metadata** — Fetches the full schema from OpenMetadata (columns, types, constraints)
2. **Assessing context** — Determines whether the schema alone provides enough context, or if sample data is needed
3. **Fetching sample data** (when needed) — Queries your Redshift warehouse for a representative sample to understand actual data patterns
4. **Generating descriptions** — Produces 2-3 sentence descriptions focused on the distinctive business characteristics of the data — what it represents, how it's used, and what makes it unique
5. **Saving to catalog** — Writes the generated description back to OpenMetadata and Neo4j so it's available across the platform

Descriptions are written for business users — they explain what the data *means* in context, not how it was collected or stored.
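
The five steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the agent's implementation: every function and class name is hypothetical, the four collaborators (schema fetch, sample fetch, LLM call, catalog write) are passed in as plain callables, and the "more than half of the columns are undocumented" rule in `needs_sample_data` is an invented stand-in for whatever context assessment the agent really performs.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Column:
    name: str
    data_type: str
    description: str = ""


@dataclass
class TableSchema:
    name: str
    columns: list[Column] = field(default_factory=list)


def needs_sample_data(schema: TableSchema) -> bool:
    """Hypothetical heuristic: sample when most columns lack documentation."""
    if not schema.columns:
        return True
    undocumented = sum(1 for c in schema.columns if not c.description)
    return undocumented / len(schema.columns) > 0.5


def describe_table(
    table_name: str,
    fetch_schema: Callable[[str], TableSchema],
    fetch_sample: Callable[[str], list[dict]],
    generate: Callable[[TableSchema, Optional[list[dict]]], str],
    save: Callable[[str, str], None],
) -> str:
    schema = fetch_schema(table_name)        # 1. retrieve metadata
    sample = None
    if needs_sample_data(schema):            # 2. assess context
        sample = fetch_sample(table_name)    # 3. fetch sample data (when needed)
    description = generate(schema, sample)   # 4. generate the description
    save(table_name, description)            # 5. save to the catalog
    return description
```

Passing the collaborators in as arguments keeps the sketch runnable without a warehouse or an LLM: each one can be replaced with a stub when experimenting.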

## Schema Operations

The Metadata Agent can explore your full OpenMetadata catalog hierarchy:

| Level        | Operations                                                              |
| ------------ | ----------------------------------------------------------------------- |
| **Database** | List databases, view database details, browse schemas within a database |
| **Schema**   | List schemas, view schema details, browse tables within a schema        |
| **Table**    | List tables, get full table metadata, view column definitions and types |
| **Column**   | View data types, constraints, existing descriptions, sensitivity tags   |
| **Lineage**  | Trace upstream sources and downstream consumers for any table           |
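
To make the lineage row concrete, tracing "upstream sources" and "downstream consumers" amounts to walking a directed graph of table-to-table edges in one direction or the other. The sketch below assumes lineage has already been reduced to simple `(source, target)` pairs; that shape, the `trace` function, and every table name except `sales_summary` are hypothetical and do not reflect the response format of OpenMetadata's lineage API.

```python
from collections import defaultdict, deque


def trace(edges: list[tuple[str, str]], start: str, direction: str) -> list[str]:
    """Breadth-first walk over (source, target) lineage edges.

    direction="upstream" follows edges backwards from `start`;
    direction="downstream" follows them forwards.
    """
    if direction not in ("upstream", "downstream"):
        raise ValueError("direction must be 'upstream' or 'downstream'")

    neighbors = defaultdict(list)
    for source, target in edges:
        if direction == "downstream":
            neighbors[source].append(target)
        else:
            neighbors[target].append(source)

    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in seen:  # guards against cycles and duplicate paths
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


# Hypothetical lineage edges for illustration.
edges = [
    ("raw_orders", "orders"),
    ("orders", "sales_summary"),
    ("customers", "sales_summary"),
    ("sales_summary", "revenue_dashboard"),
]

upstream = trace(edges, "sales_summary", "upstream")
downstream = trace(edges, "sales_summary", "downstream")
```

With these edges, `upstream` lists the direct parents first (`orders`, `customers`) and then `raw_orders`, while `downstream` contains only `revenue_dashboard`.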

## Metadata Enrichment

The agent can update metadata directly in OpenMetadata using structured patch operations:

<CardGroup cols={2}>
  <Card title="Table Descriptions" icon="align-left">
    Add or update table-level descriptions that explain the business purpose and context of each data asset.
  </Card>

  <Card title="Column Descriptions" icon="list">
    Document individual columns with business-friendly explanations — what each field represents and how it should be interpreted.
  </Card>

  <Card title="PII Tags" icon="shield-halved">
    Apply sensitivity classifications to columns containing personal information — emails, SSNs, phone numbers — using OpenMetadata's PII tagging system.
  </Card>

  <Card title="Search & Discovery" icon="magnifying-glass">
    Semantic search across your entire catalog using vector embeddings — find tables by what they contain, not just what they're named.
  </Card>
</CardGroup>
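
For readers unfamiliar with patch-style updates, the sketch below builds a JSON Patch document (RFC 6902) of the kind an enrichment request might produce. It only constructs the payload and sends nothing. The helper names are hypothetical, and the paths (`/description`, `/columns/{i}/description`, `/columns/{i}/tags/-`) and the `PII.Sensitive` tag name are assumptions about how a table entity is laid out in OpenMetadata; check them against your OpenMetadata version before relying on them.

```python
import json


def set_table_description(text: str) -> dict:
    # In RFC 6902, "add" on an object member creates it or replaces its value.
    return {"op": "add", "path": "/description", "value": text}


def set_column_description(column_index: int, text: str) -> dict:
    # Columns are addressed by position in the (assumed) "columns" array.
    return {
        "op": "add",
        "path": f"/columns/{column_index}/description",
        "value": text,
    }


def add_column_tag(column_index: int, tag_fqn: str) -> dict:
    # A path ending in "/-" appends to the target array.
    # A real tag label may need more fields than tagFQN alone.
    return {
        "op": "add",
        "path": f"/columns/{column_index}/tags/-",
        "value": {"tagFQN": tag_fqn},
    }


# Example: document the third column and tag it as PII (index and tag assumed).
patch = [
    set_column_description(2, "Customer's primary contact email address."),
    add_column_tag(2, "PII.Sensitive"),
]
body = json.dumps(patch, indent=2)
```

A JSON Patch body is conventionally sent with the `application/json-patch+json` media type. Because operations apply in order and the whole patch fails if any single operation fails, grouping a description and its tag in one patch keeps the column from ending up half-updated.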

## Data Asset Discovery

Finding the right data asset is the first step in any metadata operation. The Metadata Agent uses **confidence-based discovery** to surface the most relevant assets:

| Confidence         | Threshold         | Behavior                                   |
| ------------------ | ----------------- | ------------------------------------------ |
| **Strong Match**   | > 60% similarity  | Proceeds automatically with the best match |
| **Possible Match** | 40–60% similarity | Presents options and asks you to confirm   |
| **Uncertain**      | \< 40% similarity | Asks you to clarify or refine your request |

Discovery searches across both **vector embeddings** (semantic meaning) and the **OpenMetadata catalog** (structured metadata) to find assets that match your intent, even if you don't know the exact table name.
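
The thresholds in the table translate directly into a small decision function. This is a sketch rather than the agent's code: the function names and return labels are hypothetical, similarity is assumed to be a score between 0 and 1, and the boundary values (exactly 40% and exactly 60%) are assumed to fall in the "Possible Match" band, following the table's `> 60%` and `< 40%` wording.

```python
def classify_match(similarity: float) -> str:
    """Map a similarity score in [0, 1] to a confidence band."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be between 0 and 1")
    if similarity > 0.60:
        return "strong"      # proceed automatically with the best match
    if similarity >= 0.40:
        return "possible"    # present options and ask for confirmation
    return "uncertain"       # ask the user to clarify or refine


def next_action(candidates: list[tuple[str, float]]) -> str:
    """Decide what to do given (asset_name, similarity) candidates."""
    if not candidates:
        return "ask_to_clarify"
    _, best_score = max(candidates, key=lambda item: item[1])
    band = classify_match(best_score)
    return {
        "strong": "proceed",
        "possible": "confirm_with_user",
        "uncertain": "ask_to_clarify",
    }[band]
```

For example, `next_action([("customers", 0.82), ("customer_events", 0.55)])` proceeds with `customers`, whereas a best score of 0.55 would trigger a confirmation prompt.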

## Dual Catalog Architecture

The Metadata Agent works across two complementary systems:

```mermaid theme={null}
graph LR
    A[Metadata Agent] --> B["OpenMetadata (Catalog)"]
    A --> C["Neo4j (Graph)"]
    B --> D["Table schemas, column definitions, descriptions, tags, lineage"]
    C --> E["Workspace relationships, access control, data asset graph, GraphRAG"]
```

* **OpenMetadata** stores the detailed catalog metadata — schemas, column definitions, descriptions, PII tags, lineage, and quality metrics
* **Neo4j** stores the relationship graph — how data assets connect to workspaces, organizations, and each other, enabling GraphRAG-powered discovery

When the agent updates a description, it writes to both systems — keeping the catalog and the knowledge graph in sync.
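
One way to picture the dual write is a loop over both systems that reports any system where the write failed, so a partial update is visible rather than silent. The sketch below is an assumption-laden illustration: the `write_description` function, the `Writer` type, and the failure-reporting behavior are hypothetical, and this page does not describe how the agent actually handles a failed write.

```python
from typing import Callable

# A writer takes (asset_name, description) and raises on failure.
Writer = Callable[[str, str], None]


def write_description(
    asset: str, text: str, writers: dict[str, Writer]
) -> dict[str, str]:
    """Write to every system; return {system: error message} for failures."""
    failures = {}
    for system, write in writers.items():
        try:
            write(asset, text)
        except Exception as exc:  # report a partial write instead of hiding it
            failures[system] = str(exc)
    return failures
```

An empty result means both stores accepted the description; a non-empty one names the system that is now out of sync and needs a retry.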

<Callout type="info">
  The Metadata Agent works alongside the [Governance Agent](/brightagent/brightagent_workflows/governance) for policy compliance and the [Quality Agent](/brightagent/brightagent_workflows/quality) for data quality checks. Together they keep your data estate documented, governed, and healthy.
</Callout>
