# Integrations

> BrightAgent connects to your entire data stack — warehouse, catalog, transformations, and collaboration tools — through a secure, unified integration layer.

## How BrightAgent Connects

BrightAgent integrates with the services that power your data stack through the **Datapiary** library, providing a uniform interface across all service types. Every interaction is authenticated, scoped to your workspace, and tracked in Neo4j for full lineage and auditability.

```mermaid
graph TD
    A[BrightAgent] --> B[Datapiary Integration Layer]
    B --> C[Warehouse Services]
    B --> D[Catalog Services]
    B --> E[Transformation Services]
    B --> F[Ingestion Services]
    B --> G[Collaboration Services]
    B --> H[External Tools via MCP]
    C --> C1[Redshift Serverless]
    C --> C2[Snowflake]
    D --> D1[Neo4j]
    D --> D2[Glue Data Catalog]
    D --> D3[OpenMetadata]
    E --> E1[DBT Cloud]
    F --> F1[S3 Data Lake]
    F --> F2[Airbyte]
    G --> G1[Stream.io]
    H --> H1[Jira]
    H --> H2[Notion]
    H --> H3[Google Drive]
```
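
One way to picture the uniform interface is a shared protocol that every service adapter satisfies. The sketch below is illustrative only: `ServiceAdapter`, `InMemoryWarehouse`, `AuditLog`, and `run` are hypothetical names, not Datapiary's actual API, and the in-memory audit list merely stands in for the interaction records the platform tracks in Neo4j.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


class ServiceAdapter(Protocol):
    """Hypothetical shape shared by every service type (not Datapiary's real API)."""

    category: str

    def execute(self, workspace_id: str, request: dict[str, Any]) -> dict[str, Any]:
        ...


@dataclass
class InMemoryWarehouse:
    """Stand-in warehouse adapter that echoes the SQL it receives."""

    category: str = "warehouse"

    def execute(self, workspace_id: str, request: dict[str, Any]) -> dict[str, Any]:
        return {"workspace": workspace_id, "sql": request["sql"], "rows": []}


@dataclass
class AuditLog:
    """Stand-in for the interaction records the platform tracks in Neo4j."""

    entries: list[dict[str, Any]] = field(default_factory=list)


def run(
    adapter: ServiceAdapter,
    workspace_id: str,
    request: dict[str, Any],
    audit: AuditLog,
) -> dict[str, Any]:
    """Call any adapter the same way and record the interaction."""
    result = adapter.execute(workspace_id, request)
    audit.entries.append({"category": adapter.category, "workspace": workspace_id})
    return result


audit = AuditLog()
result = run(InMemoryWarehouse(), "ws-demo", {"sql": "SELECT 1"}, audit)
```

Because `run` depends only on the protocol, a catalog or ingestion adapter could be swapped in without changing the calling code, which is the property a uniform interface is meant to provide.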

## Core Integrations

<CardGroup cols={2}>
  <Card title="Neo4j" icon="share-nodes">
    **Metadata & Knowledge Graph** — Single source of truth for all metadata, lineage, relationships, and data asset information. Every agent queries Neo4j for context via GraphRAG. It stores user, workspace, and organization relationships; data asset schemas and locations; transformation lineage; and access control metadata.
  </Card>

  <Card title="Redshift Serverless" icon="warehouse">
    **Data Warehouse** — Auto-scaling analytical warehouse deployed across 3 availability zones with schema-per-organization isolation. The Analyst Agent generates and executes SQL here. Queries access organization data via cross-account IAM and Redshift Spectrum — reading S3 in place without copying.
  </Card>

  <Card title="Snowflake" icon="snowflake">
    **Data Warehouse (via Datapiary)** — Available for organizations that need Snowflake alongside Redshift. Data syncs from organization S3 to Snowflake via cross-account IAM. DBT Cloud transformations can run against either warehouse.
  </Card>

  <Card title="AWS Glue" icon="book-open">
    **Schema Discovery** — Glue crawlers automatically detect schemas when data lands in S3 — inferring column names, data types, partitions, and formats. Metadata is synced to Neo4j and made available to all agents immediately.
  </Card>

  <Card title="DBT Cloud" icon="arrows-rotate">
    **Data Transformation (via Datapiary)** — The Engineering Agent generates dbt models that run on DBT Cloud. All generated code goes through GitHub PRs for human review. Neo4j tracks transformation lineage — which models depend on which sources.
  </Card>

  <Card title="Amazon S3" icon="hard-drive">
    **Data Lake Storage** — Each organization gets dedicated S3 buckets (raw, staged, shared) in their own AWS account. File uploads trigger automatic schema discovery via Glue and metadata registration in Neo4j.
  </Card>

  <Card title="Airbyte" icon="plug">
    **Data Ingestion (Optional)** — Self-hosted Airbyte instance with 300+ connectors for ingesting data from external sources such as Shopify, HubSpot, Salesforce, and PostgreSQL. Runs within the organization's dedicated AWS account.
  </Card>

  <Card title="OpenMetadata" icon="table">
    **Metadata Catalog** — Unified metadata catalog integration for comprehensive data asset discovery, documentation, and lineage tracking. Connected via MCP for direct agent access.
  </Card>
</CardGroup>
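
Transformation lineage of the kind mentioned in the DBT Cloud card (which models depend on which sources) amounts to a walk over a dependency graph. A minimal sketch, using an in-memory dictionary as a stand-in for the graph stored in Neo4j; the model and source names are made up for illustration:

```python
# Hypothetical lineage edges: each model maps to the nodes it reads from.
DEPENDS_ON = {
    "orders_daily": ["stg_orders", "stg_customers"],
    "stg_orders": ["raw.orders"],
    "stg_customers": ["raw.customers"],
}


def upstream_sources(node: str, graph: dict[str, list[str]]) -> set[str]:
    """Return the root sources (nodes with no recorded parents) that `node` depends on."""
    sources: set[str] = set()
    seen: set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles and repeated visits
        seen.add(current)
        parents = graph.get(current, [])
        if not parents and current != node:
            sources.add(current)
        stack.extend(parents)
    return sources


roots = upstream_sources("orders_daily", DEPENDS_ON)
```

In a graph database the same question would typically be answered with a variable-length path query rather than application code; the traversal here only shows the idea.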

## MCP Integrations (Model Context Protocol)

BrightAgent uses MCP for validated access to external tools and services. MCP ensures that every tool call is well-formed, authorized, and auditable before execution.

<CardGroup cols={2}>
  <Card title="Jira" icon="ticket">
    Create tickets, update statuses, and manage sprints directly from BrightAgent or Slack. The Slack Router Agent routes Jira-related requests to the Jira MCP server.
  </Card>

  <Card title="Notion" icon="book">
    Search pages, query databases, and retrieve documentation from Notion workspaces. Integrated as an MCP server for structured access.
  </Card>

  <Card title="Google Drive" icon="google-drive">
    Search and retrieve documents from Google Drive. Available through the Slack Router Agent for quick access from Slack conversations.
  </Card>

  <Card title="OpenMetadata" icon="database">
    Direct MCP connection to OpenMetadata for metadata discovery, data quality information, and catalog operations beyond what's stored in Neo4j.
  </Card>
</CardGroup>
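
To make "well-formed, authorized, and auditable" concrete, the sketch below checks a tool call before serializing it as an MCP `tools/call` JSON-RPC 2.0 request. The tool name `jira_create_issue`, its required arguments, and the `build_tool_call` helper are assumptions for illustration, not the actual tools exposed by BrightAgent's MCP servers.

```python
import json
from typing import Any

# Hypothetical tool catalog: tool name -> required argument names.
TOOL_REQUIRED_ARGS = {"jira_create_issue": {"project", "summary"}}


def build_tool_call(
    tool: str,
    arguments: dict[str, Any],
    allowed_tools: set[str],
    audit: list[dict[str, Any]],
    request_id: int = 1,
) -> str:
    """Validate a tool call, record it, and return an MCP `tools/call` request."""
    if tool not in TOOL_REQUIRED_ARGS:
        raise ValueError(f"unknown tool: {tool}")
    if tool not in allowed_tools:
        raise PermissionError(f"tool not authorized for this workspace: {tool}")
    missing = TOOL_REQUIRED_ARGS[tool] - set(arguments)
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    # Record the call before it is sent so every execution leaves a trace.
    audit.append({"tool": tool, "arguments": sorted(arguments)})
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool, "arguments": arguments},
        }
    )


audit_trail: list[dict[str, Any]] = []
payload = build_tool_call(
    "jira_create_issue",
    {"project": "DATA", "summary": "Investigate failed sync"},
    allowed_tools={"jira_create_issue"},
    audit=audit_trail,
)
```

A real MCP server advertises a JSON Schema (`inputSchema`) for each tool, so a production check would validate arguments against that schema rather than a hand-written set of required names.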

## Observability & Tracing

<CardGroup cols={2}>
  <Card title="LangSmith" icon="route">
    Full distributed tracing for every agent interaction — from initial user query through intent classification, tool calls, and response synthesis. Traces include latency breakdowns per agent, token usage by model, and error attribution.
  </Card>

  <Card title="OpenTelemetry" icon="chart-mixed">
    Evaluation metrics, agent invocation counts, latency percentiles (p50/p95/p99), and error rates are recorded via OpenTelemetry for operational dashboards and alerting.
  </Card>
</CardGroup>
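
For readers unfamiliar with latency percentiles, the sketch below shows the arithmetic behind p50/p95/p99 using Python's standard library. It is not how BrightAgent computes them: metrics backends usually derive percentiles from histogram data, and the sample values here are made up.

```python
from statistics import quantiles


def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Compute p50/p95/p99 from raw latency samples using linear interpolation."""
    # n=100 yields 99 cut points; cut point k (1-based) is the k-th percentile.
    cuts = quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


# Made-up samples: 0 ms through 100 ms in 1 ms steps.
stats = latency_percentiles([float(ms) for ms in range(101)])
```

With these evenly spaced samples, `stats` is `{"p50": 50.0, "p95": 95.0, "p99": 99.0}`: 99% of requests completed in 99 ms or less.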

## Integration Architecture

BrightAgent doesn't connect to data services directly from the AI layer. Instead, all access flows through the platform's secure infrastructure:

```mermaid
graph LR
    A[BrightAgent] -->|"Authenticated Request"| B[Platform API]
    B -->|"JWT Validation"| C[Cognito Auth]
    B -->|"Metadata Lookup"| D[Neo4j]
    B -->|"Cross-Account IAM"| E[Customer Infrastructure]
    E --> F["Redshift (Workspace Account)"]
    E --> G["S3 (Organization Account)"]
    E --> H["Glue Catalog (Organization Account)"]
```

This architecture means:

* **All access is authenticated** via Cognito-issued JWTs — every request is verified before reaching any backend service
* **All queries respect workspace boundaries** — agents can only access data the user's workspace is authorized for
* **All interactions are logged** in Neo4j for lineage and audit — you can trace exactly what data was accessed and why
* **No credentials are shared** — cross-account access uses IAM role assumption (AWS STS), not stored passwords or API keys
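
The guarantees above can be sketched as two small steps: check the caller's verified token claims against the requested workspace, then build the parameters for an STS `AssumeRole` call instead of using stored credentials. This is a sketch under stated assumptions, not the platform's implementation: the `custom:workspace_id` claim name, the role name, and both helper functions are hypothetical, and signature verification against Cognito's published keys is assumed to have happened already.

```python
import time
from typing import Any, Optional


def authorize(
    claims: dict[str, Any],
    requested_workspace: str,
    now: Optional[float] = None,
) -> None:
    """Reject expired tokens and requests outside the caller's workspace.

    `claims` must come from a JWT whose signature has already been verified;
    the `custom:workspace_id` claim name is an assumption for this sketch.
    """
    current = time.time() if now is None else now
    if claims["exp"] <= current:
        raise PermissionError("token expired")
    if claims.get("custom:workspace_id") != requested_workspace:
        raise PermissionError("workspace boundary violation")


def assume_role_params(account_id: str, role_name: str, user_id: str) -> dict[str, Any]:
    """Build arguments for an STS AssumeRole call, e.g. boto3 `sts.assume_role(**params)`."""
    return {
        "RoleArn": f"arn:aws:iam::{account_id}:role/{role_name}",
        "RoleSessionName": f"brightagent-{user_id}",
        "DurationSeconds": 900,  # shortest session duration STS allows
    }


claims = {"sub": "user-123", "exp": 2_000_000_000, "custom:workspace_id": "ws-demo"}
authorize(claims, "ws-demo", now=1_700_000_000)
params = assume_role_params("123456789012", "HypotheticalReadRole", claims["sub"])
```

Naming the session after the user keeps the cross-account call attributable in CloudTrail, which complements the audit trail described above.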

## Service Categories

The Datapiary library organizes integrations into service types, providing a consistent interface regardless of the underlying technology:

| Category           | Services                               | What Agents Use Them For                                             |
| ------------------ | -------------------------------------- | -------------------------------------------------------------------- |
| **Warehouse**      | Redshift Serverless, Snowflake         | Executing SQL queries, running analysis, aggregating data            |
| **Catalog**        | Neo4j, Glue Data Catalog, OpenMetadata | Discovering data assets, understanding schemas, tracking lineage     |
| **Transformation** | DBT Cloud                              | Generating and running data transformation models                    |
| **Ingestion**      | S3 direct upload, Airbyte              | Bringing data into the platform from files and external sources      |
| **Notebook**       | Jupyter (E2B sandbox)                  | Generating and executing analysis notebooks in isolated environments |
| **Collaboration**  | Stream.io                              | Real-time team chat and collaboration within the platform            |
| **External Tools** | Jira, Notion, Google Drive, MS Teams   | Task management, documentation, and file access via MCP              |
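
The table can also be read as a simple lookup from service to category. The snippet below encodes it as plain data; `SERVICE_CATEGORIES` and `category_of` are illustrative names, not part of the Datapiary library.

```python
# Service categories as listed in the table above.
SERVICE_CATEGORIES = {
    "Warehouse": ["Redshift Serverless", "Snowflake"],
    "Catalog": ["Neo4j", "Glue Data Catalog", "OpenMetadata"],
    "Transformation": ["DBT Cloud"],
    "Ingestion": ["S3 direct upload", "Airbyte"],
    "Notebook": ["Jupyter (E2B sandbox)"],
    "Collaboration": ["Stream.io"],
    "External Tools": ["Jira", "Notion", "Google Drive", "MS Teams"],
}


def category_of(service: str) -> str:
    """Look up the category a service belongs to (case-insensitive)."""
    wanted = service.casefold()
    for category, services in SERVICE_CATEGORIES.items():
        if any(wanted == name.casefold() for name in services):
            return category
    raise KeyError(f"unknown service: {service}")


warehouse_category = category_of("snowflake")
```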

<Callout type="info">
  Learn about the [platform infrastructure](/platform/backend) that powers these integrations, or see the [security model](/platform/security) for how data isolation and access control work.
</Callout>
