# Integrations

> BrightAgent connects to your entire data stack (warehouse, catalog, transformations, and collaboration tools) through a secure, unified integration layer.

## How BrightAgent Connects

BrightAgent integrates with the services that power your data stack through the **Datapiary** library, providing a uniform interface across all service types. Every interaction is authenticated, scoped to your workspace, and tracked in Neo4j for full lineage and auditability.

```mermaid theme={null}
graph TD
    A[BrightAgent] --> B[Datapiary Integration Layer]
    B --> C[Warehouse Services]
    B --> D[Catalog Services]
    B --> E[Transformation Services]
    B --> F[Ingestion Services]
    B --> G[Collaboration Services]
    B --> H[External Tools via MCP]
    C --> C1[Redshift Serverless]
    C --> C2[Snowflake]
    D --> D1[Neo4j]
    D --> D2[Glue Data Catalog]
    D --> D3[OpenMetadata]
    E --> E1[DBT Cloud]
    F --> F1[S3 Data Lake]
    F --> F2[Airbyte]
    G --> G1[Stream.io]
    H --> H1[Jira]
    H --> H2[Notion]
    H --> H3[Google Drive]
```

## Core Integrations

**Neo4j (Metadata & Knowledge Graph):** Single source of truth for all metadata, lineage, relationships, and data asset information. Every agent queries Neo4j for context via GraphRAG. It stores user, workspace, and organization relationships; data asset schemas and locations; transformation lineage; and access control metadata.

**Redshift Serverless (Data Warehouse):** Auto-scaling analytical warehouse deployed across 3 availability zones with schema-per-organization isolation. The Analyst Agent generates and executes SQL here. Queries access organization data via cross-account IAM and Redshift Spectrum, reading S3 in place without copying.

**Snowflake (Data Warehouse, via Datapiary):** Available for organizations that need Snowflake alongside Redshift. Data syncs from organization S3 to Snowflake via cross-account IAM.
DBT Cloud transformations can run against either warehouse.

**Glue Data Catalog (Schema Discovery):** Glue crawlers automatically detect schemas when data lands in S3, inferring column names, data types, partitions, and formats. Metadata is synced to Neo4j and made available to all agents immediately.

**DBT Cloud (Data Transformation, via Datapiary):** The Engineering Agent generates dbt models that run on DBT Cloud. All generated code goes through GitHub PRs for human review. Neo4j tracks transformation lineage, recording which models depend on which sources.

**S3 (Data Lake Storage):** Each organization gets dedicated S3 buckets (raw, staged, shared) in their own AWS account. File uploads trigger automatic schema discovery via Glue and metadata registration in Neo4j.

**Airbyte (Data Ingestion, optional):** Self-hosted Airbyte instance with 300+ connectors for ingesting data from external sources such as Shopify, HubSpot, Salesforce, and PostgreSQL. It runs within the organization's dedicated AWS account.

**OpenMetadata (Metadata Catalog):** Unified metadata catalog integration for comprehensive data asset discovery, documentation, and lineage tracking. Connected via MCP for direct agent access.

## MCP Integrations (Model Context Protocol)

BrightAgent uses MCP for validated access to external tools and services. MCP ensures that every tool call is well-formed, authorized, and auditable before execution.

**Jira:** Create tickets, update statuses, and manage sprints directly from BrightAgent or Slack. The Slack Router Agent routes Jira-related requests to the Jira MCP server.

**Notion:** Search pages, query databases, and retrieve documentation from Notion workspaces. Integrated as an MCP server for structured access.

**Google Drive:** Search and retrieve documents from Google Drive. Available through the Slack Router Agent for quick access from Slack conversations.

**OpenMetadata:** Direct MCP connection to OpenMetadata for metadata discovery, data quality information, and catalog operations beyond what's stored in Neo4j.

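To make "well-formed, authorized, and auditable before execution" concrete, here is a minimal Python sketch of a gateway that checks a tool call in that order and only then runs it. It is an illustration under stated assumptions, not BrightAgent's or MCP's actual API: `ToolSpec`, `ToolGateway`, the `grants` mapping, the workspace id `ws-analytics`, and the tool name `jira.create_ticket` are all hypothetical, and the Jira handler is a stand-in that makes no network call.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class ToolSpec:
    """Hypothetical description of one tool exposed to agents."""
    name: str
    required_args: dict[str, type]  # argument name -> expected type
    handler: Callable[..., Any]


@dataclass
class ToolGateway:
    """Hypothetical gateway: validate, authorize, audit, then execute."""
    tools: dict[str, ToolSpec]
    grants: dict[str, set[str]]  # workspace id -> tool names it may call
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def call(self, workspace_id: str, tool_name: str, args: dict[str, Any]) -> Any:
        spec = self.tools.get(tool_name)
        if spec is None:
            raise ValueError(f"unknown tool: {tool_name}")
        # Well-formed: every required argument is present with the expected type.
        for arg, expected in spec.required_args.items():
            if not isinstance(args.get(arg), expected):
                raise ValueError(f"{tool_name}: argument {arg!r} must be {expected.__name__}")
        # Authorized: the workspace must have been granted this tool.
        if tool_name not in self.grants.get(workspace_id, set()):
            raise PermissionError(f"workspace {workspace_id} may not call {tool_name}")
        # Auditable: record the call before running it.
        self.audit_log.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "workspace": workspace_id,
            "tool": tool_name,
            "args": dict(args),
        })
        return spec.handler(**args)


def create_ticket(summary: str, project: str) -> dict[str, str]:
    """Stand-in for a Jira tool; returns a fake ticket instead of calling Jira."""
    return {"project": project, "summary": summary, "status": "created"}


gateway = ToolGateway(
    tools={
        "jira.create_ticket": ToolSpec(
            "jira.create_ticket", {"summary": str, "project": str}, create_ticket
        )
    },
    grants={"ws-analytics": {"jira.create_ticket"}},
)
ticket = gateway.call(
    "ws-analytics", "jira.create_ticket",
    {"summary": "Fix null order ids", "project": "DATA"},
)
```

In this sketch a rejected call never reaches the handler and leaves no audit entry; a real system might also want to log rejected attempts.
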
## Observability & Tracing

Full distributed tracing for every agent interaction, from initial user query through intent classification, tool calls, and response synthesis. Traces include latency breakdowns per agent, token usage by model, and error attribution.

Evaluation metrics, agent invocation counts, latency percentiles (p50/p95/p99), and error rates are recorded via OpenTelemetry for operational dashboards and alerting.

## Integration Architecture

BrightAgent doesn't connect to data services directly from the AI layer. Instead, all access flows through the platform's secure infrastructure:

```mermaid theme={null}
graph LR
    A[BrightAgent] -->|"Authenticated Request"| B[Platform API]
    B -->|"JWT Validation"| C[Cognito Auth]
    B -->|"Metadata Lookup"| D[Neo4j]
    B -->|"Cross-Account IAM"| E[Customer Infrastructure]
    E --> F["Redshift (Workspace Account)"]
    E --> G["S3 (Organization Account)"]
    E --> H["Glue Catalog (Organization Account)"]
```

This architecture means:

* **All access is authenticated** via Cognito JWT tokens: every request is verified before reaching any backend service
* **All queries respect workspace boundaries**: agents can only access data the user's workspace is authorized for
* **All interactions are logged** in Neo4j for lineage and audit: you can trace exactly what data was accessed and why
* **No credentials are shared**: cross-account access uses IAM role assumption (AWS STS), not stored passwords or API keys

## Service Categories

The Datapiary library organizes integrations into service types, providing a consistent interface regardless of the underlying technology:

| Category           | Services                               | What Agents Use Them For                                             |
| ------------------ | -------------------------------------- | -------------------------------------------------------------------- |
| **Warehouse**      | Redshift Serverless, Snowflake         | Executing SQL queries, running analysis, aggregating data            |
| **Catalog**        | Neo4j, Glue Data Catalog, OpenMetadata | Discovering data assets, understanding schemas, tracking lineage     |
| **Transformation** | DBT Cloud                              | Generating and running data transformation models                    |
| **Ingestion**      | S3 direct upload, Airbyte              | Bringing data into the platform from files and external sources      |
| **Notebook**       | Jupyter (E2B sandbox)                  | Generating and executing analysis notebooks in isolated environments |
| **Collaboration**  | Stream.io                              | Real-time team chat and collaboration within the platform            |
| **External Tools** | Jira, Notion, Google Drive, MS Teams   | Task management, documentation, and file access via MCP              |

Learn about the [platform infrastructure](/platform/backend) that powers these integrations, or see the [security model](/platform/security) for how data isolation and access control work.
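
As a rough illustration of what "a consistent interface regardless of the underlying technology" can look like for the Warehouse category, the Python sketch below defines one protocol and two stand-in backends. Everything here is an assumption made for illustration: the `Warehouse` protocol, the `run_sql` method, and the `Fake*` classes are hypothetical and do not reflect Datapiary's real API, and no query is actually sent to Redshift or Snowflake.

```python
from typing import Protocol


class Warehouse(Protocol):
    """Hypothetical uniform interface for the Warehouse service category."""
    name: str

    def run_sql(self, sql: str) -> list[dict[str, object]]: ...


class FakeRedshift:
    """Stand-in backend; echoes the query instead of executing it."""
    name = "Redshift Serverless"

    def run_sql(self, sql: str) -> list[dict[str, object]]:
        return [{"engine": self.name, "sql": sql}]


class FakeSnowflake:
    """Second stand-in backend exposing the same method signature."""
    name = "Snowflake"

    def run_sql(self, sql: str) -> list[dict[str, object]]:
        return [{"engine": self.name, "sql": sql}]


def count_rows(warehouse: Warehouse, table: str) -> list[dict[str, object]]:
    # Caller code depends only on the protocol, not on a concrete backend.
    return warehouse.run_sql(f"SELECT COUNT(*) FROM {table}")


results = [count_rows(w, "orders") for w in (FakeRedshift(), FakeSnowflake())]
```

The calling code is identical for both backends, which is the property the table's "consistent interface" claim describes; how Datapiary achieves it in practice is not specified here.
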