
System Overview

BrightAgent is a multi-agent AI system built on LangGraph that handles end-to-end data operations through natural language. BrightAgent orchestrates specialized agents, each focused on a specific domain of the data lifecycle. Agents access data through the platform’s secure infrastructure — never directly.
[Figure: BrightAgent architecture diagram]

BrightAgent (Orchestrator)

The BrightAgent is the central coordinator, and every user query flows through it. The BrightAgent:
  1. Analyzes intent — Determines what the user is asking for and which capabilities are needed
  2. Routes to agents — Selects one or more specialized agents based on the task
  3. Orchestrates workflows — Coordinates multi-step execution where agents hand off results to each other
  4. Aggregates results — Combines outputs from all agents into a coherent response
  5. Maintains conversation context — Tracks state across multi-turn conversations so agents remember what came before
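The five steps above can be sketched in plain Python. This is a minimal illustration, not BrightAgent's implementation: the `ROUTES` table, `analyze_intent`, `orchestrate`, and the stub agents are all hypothetical names, and keyword matching stands in for whatever intent classifier the real system uses.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical keyword-to-agent routing table (an assumption for illustration).
ROUTES: dict[str, str] = {
    "find": "retrieval",
    "query": "analyst",
    "chart": "visualization",
}

@dataclass
class Conversation:
    # Step 5: history kept across turns.
    history: list[str] = field(default_factory=list)

def analyze_intent(query: str) -> list[str]:
    """Steps 1-2: select every agent whose keyword appears in the query."""
    words = query.lower().split()
    return [agent for keyword, agent in ROUTES.items() if keyword in words]

def orchestrate(query: str,
                agents: dict[str, Callable[[str], str]],
                convo: Conversation) -> str:
    convo.history.append(query)
    selected = analyze_intent(query)
    # Step 3: run the selected agents in routing order.
    outputs = [agents[name](query) for name in selected]
    # Step 4: combine the outputs into one response.
    return " | ".join(outputs)

# Stub agents that return canned strings.
agents = {
    "retrieval": lambda q: "retrieval: 2 datasets found",
    "analyst": lambda q: "analyst: query executed",
    "visualization": lambda q: "visualization: chart ready",
}
convo = Conversation()
answer = orchestrate("find sales data and chart it", agents, convo)
print(answer)
```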

Specialized Agents

Retrieval Agent

Implements GraphRAG (Graph Retrieval Augmented Generation) for intelligent data discovery. Queries Neo4j to find relevant data assets, metadata, and relationships — then surfaces the best matches for the user’s query.

Analyst Agent

Generates and executes SQL queries against Redshift. Performs statistical analysis, creates Jupyter notebooks, and produces insights grounded in actual data — not guesses from training data.

Engineering Agent

Generates dbt transformation models with proper SQL, configurations, and tests. Submits everything as a GitHub PR for human review — nothing gets deployed without approval.

Visualization Agent

Creates interactive Plotly charts and visualizations. Automatically selects chart types based on data characteristics — bar, line, scatter, pie, heatmap — or follows specific user instructions.
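One way to picture automatic chart selection is a small rule table keyed on column types. The rules below are assumptions made for illustration (the source does not document the actual heuristics), and `pick_chart` is a hypothetical helper, not part of Plotly or BrightAgent.

```python
from typing import Optional

def pick_chart(x_kind: str, y_kind: str,
               n_categories: int = 0,
               requested: Optional[str] = None) -> str:
    """Choose a chart type from column kinds.

    x_kind / y_kind are one of 'categorical', 'numeric', 'temporal'.
    An explicit user request always wins over the heuristics.
    """
    if requested:
        return requested
    if x_kind == "temporal" and y_kind == "numeric":
        return "line"      # trend over time
    if x_kind == "numeric" and y_kind == "numeric":
        return "scatter"   # relationship between two measures
    if x_kind == "categorical" and y_kind == "categorical":
        return "heatmap"   # counts across two dimensions
    if x_kind == "categorical" and y_kind == "numeric":
        # Assumed cutoff: a pie chart only for a handful of slices.
        return "pie" if 0 < n_categories <= 5 else "bar"
    return "bar"           # fallback when nothing else matches

print(pick_chart("temporal", "numeric"))
print(pick_chart("categorical", "numeric", n_categories=12))
```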

Governance Agent

Manages data quality policies, compliance rules, and metadata governance. Tracks and maintains lineage across the entire data estate via Neo4j.

Quality Agent

Runs data quality checks — completeness, accuracy, consistency, and freshness — and surfaces issues proactively. Operates as a background agent monitoring data health continuously.
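Two of the checks named above, completeness and freshness, are simple enough to sketch over in-memory rows. The function names, the sample rows, and the thresholds are hypothetical. A real check would presumably run against warehouse tables.

```python
from datetime import datetime, timedelta, timezone

def completeness(rows: list[dict], column: str) -> float:
    """Fraction of rows where `column` is present and not None."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows)

def is_fresh(rows: list[dict], ts_column: str,
             max_age: timedelta, now: datetime) -> bool:
    """True if the newest timestamp is no older than `max_age`."""
    stamps = [row[ts_column] for row in rows if row.get(ts_column) is not None]
    latest = max(stamps, default=None)
    return latest is not None and now - latest <= max_age

utc = timezone.utc
rows = [
    {"id": 1, "email": "a@example.com", "updated_at": datetime(2024, 1, 10, tzinfo=utc)},
    {"id": 2, "email": None,            "updated_at": datetime(2024, 1, 9, tzinfo=utc)},
    {"id": 3, "email": "c@example.com", "updated_at": datetime(2024, 1, 8, tzinfo=utc)},
    {"id": 4, "email": "d@example.com", "updated_at": None},
]
now = datetime(2024, 1, 10, 12, 0, tzinfo=utc)

print(completeness(rows, "email"))
print(is_fresh(rows, "updated_at", timedelta(days=1), now))
```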

Metadata Agent

Connects to OpenMetadata via MCP to generate descriptions, understand schemas, enrich catalog metadata with tags and documentation, and track data lineage.

Slack Router Agent (Beta)

Routes Slack messages to BrightAgent, Jira, Notion, Google Drive, and MS Teams via intent classification and MCP integrations. Routing latency is under 100 ms.

Data Flow

How a Query Gets Answered

When a user asks a question, here’s what happens end-to-end:

How Agents Access Data

Agents never access your data directly. Every query flows through the platform’s secure infrastructure:
  • Neo4j provides metadata context — what data exists, where it lives, who owns it, how it relates to other data
  • Redshift in your dedicated workspace executes queries via cross-account IAM roles
  • S3 in your organization account stores the actual data — Redshift reads it in place via Spectrum
Agents can only access data that the user’s workspace is authorized for. No exceptions.
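The workspace-scoped rule can be illustrated with an application-level guard that runs before any query is issued. This is only a sketch: per the description above, enforcement happens through cross-account IAM roles, and the `WORKSPACE_GRANTS` table, workspace ID, and table names here are invented for the example.

```python
class AccessDenied(Exception):
    """Raised when a query touches a table outside the workspace's grants."""

# Hypothetical grant table: workspace ID -> tables it may read.
WORKSPACE_GRANTS: dict[str, set[str]] = {
    "ws-analytics": {"sales.orders", "sales.customers"},
}

def authorize(workspace_id: str, tables: list[str]) -> None:
    """Raise AccessDenied unless every table is granted to the workspace."""
    allowed = WORKSPACE_GRANTS.get(workspace_id, set())
    denied = sorted(set(tables) - allowed)
    if denied:
        raise AccessDenied(f"{workspace_id} may not read: {', '.join(denied)}")

authorize("ws-analytics", ["sales.orders"])        # passes silently
try:
    authorize("ws-analytics", ["hr.salaries"])     # not granted
except AccessDenied as exc:
    print(exc)
```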

Agent Coordination

Parallel Execution

Multiple agents can work simultaneously when tasks are independent. For example, the Retrieval Agent searches for data while the Visualization Agent prepares chart templates — reducing total response time.
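As a rough sketch of the idea, independent agents can be awaited together with `asyncio.gather`, so total wall-clock time approaches that of the slowest agent rather than the sum. The two coroutine agents below are stand-ins with simulated delays. The source does not say how BrightAgent schedules parallel work internally.

```python
import asyncio

async def retrieval_agent(query: str) -> list[str]:
    await asyncio.sleep(0.05)          # simulated metadata search
    return ["sales.orders"]

async def visualization_agent(query: str) -> str:
    await asyncio.sleep(0.05)          # simulated template preparation
    return "bar-template"

async def run_parallel(query: str) -> dict:
    # Both coroutines run concurrently; results come back in call order.
    datasets, template = await asyncio.gather(
        retrieval_agent(query),
        visualization_agent(query),
    )
    return {"datasets": datasets, "template": template}

result = asyncio.run(run_parallel("monthly revenue"))
print(result)
```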

Sequential Chaining

Workflows that depend on prior results run step-by-step: Retrieval finds data → Analyst queries it → Visualization charts the results. Each agent receives the output of the previous step.
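The Retrieval → Analyst → Visualization chain can be modeled as functions that each take the accumulated state and return an extended copy. All names, the table, and the SQL string are hypothetical placeholders chosen for the example.

```python
from functools import reduce
from typing import Callable

State = dict[str, str]

def retrieval(state: State) -> State:
    # Pretend the metadata graph resolved the question to this table.
    return {**state, "table": "sales.orders"}

def analyst(state: State) -> State:
    # Uses the table found by the previous step.
    sql = f"SELECT region, SUM(amount) FROM {state['table']} GROUP BY region"
    return {**state, "sql": sql}

def visualization(state: State) -> State:
    return {**state, "chart": "bar"}

def run_chain(steps: list[Callable[[State], State]], state: State) -> State:
    """Feed each step the output of the previous one."""
    return reduce(lambda current, step: step(current), steps, state)

final = run_chain([retrieval, analyst, visualization],
                  {"question": "revenue by region"})
print(final["sql"])
```

Reordering the steps so that `analyst` runs before `retrieval` raises a `KeyError`, which is the dependency that forces sequential execution.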

Context Sharing

Agents share relevant context and intermediate results through LangGraph state. The Analyst Agent knows exactly which data asset the Retrieval Agent found, including schema, location, and access details.

Multi-Turn Conversations

The BrightAgent maintains conversation state across turns. Users can refine results iteratively:
  • “Show me customer data” → Retrieval finds datasets
  • “Filter to California only” → Analyst refines the query using context from the first turn
  • “Chart that as a pie chart” → Visualization uses the analyst’s results
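The three turns above can be pictured as successive updates to one state object that survives between requests. `ConversationState` and its fields are assumptions made for this sketch. The actual state schema is not documented here.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ConversationState:
    table: Optional[str] = None
    filters: list[str] = field(default_factory=list)
    chart: Optional[str] = None

    def to_sql(self) -> str:
        """Build the query implied by everything said so far."""
        where = f" WHERE {' AND '.join(self.filters)}" if self.filters else ""
        return f"SELECT * FROM {self.table}{where}"

state = ConversationState()
state.table = "customers"               # turn 1: "Show me customer data"
state.filters.append("state = 'CA'")    # turn 2: "Filter to California only"
state.chart = "pie"                     # turn 3: "Chart that as a pie chart"

print(state.to_sql())
```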

Human-in-the-Loop

Operations that modify your data infrastructure always require human approval:
Operation | Approval Mechanism
dbt model generation | GitHub PR: your team reviews SQL, tests, and configurations before merging
Jupyter notebook execution | Code is presented for review before execution
Governance policy changes | Explicit user confirmation required
Schema modifications | User must approve before any changes are applied
This ensures the AI assists your workflow without making irreversible changes autonomously.
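An approval gate of this kind reduces to a simple pattern: describe the change, ask a human, and apply it only on a yes. The sketch below uses hypothetical names (`PendingChange`, `run_with_approval`) and a callback in place of a real review channel such as a GitHub PR.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PendingChange:
    description: str                 # what the reviewer sees
    apply: Callable[[], str]         # executed only after approval

def run_with_approval(change: PendingChange,
                      approve: Callable[[str], bool]) -> str:
    """Apply the change only if the reviewer callback returns True."""
    if not approve(change.description):
        return "rejected: no changes applied"
    return change.apply()

change = PendingChange(
    description="ALTER TABLE orders ADD COLUMN region TEXT",
    apply=lambda: "applied",
)

print(run_with_approval(change, approve=lambda desc: False))
print(run_with_approval(change, approve=lambda desc: True))
```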

Observability

Every agent interaction is fully traceable:

LangSmith Tracing

Full trace visibility into every agent step — from intent classification through tool calls to response synthesis. Includes latency breakdowns, token usage, and error attribution.

OpenTelemetry Metrics

Agent invocations, latency (p50/p95/p99), error rates, and token usage tracked via OpenTelemetry for operational dashboards and alerting.
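For readers unfamiliar with the p50/p95/p99 notation, the sketch below computes those percentiles from raw latency samples using only the standard library. It illustrates what the numbers mean and is not the OpenTelemetry pipeline itself, where a metrics backend typically derives percentiles from histogram data. `latency_percentiles` is a hypothetical helper.

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """Return p50, p95 and p99 of the given latency samples."""
    # n=100 yields 99 cut points; cut point i (1-based) is the i-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}

samples = [float(ms) for ms in range(1, 102)]   # 1 ms .. 101 ms
print(latency_percentiles(samples))
```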

Audit Logging

All tool calls, data accessed, SQL generated, and decisions made are logged. Users can inspect exactly what happened behind every response.
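One common way to capture every tool call is a decorator that records the call, its arguments, and its outcome. The following is a sketch under that assumption. The in-memory `AUDIT_LOG` list, the `audited` decorator, and the `run_sql` stub are hypothetical and stand in for whatever durable log store the platform uses.

```python
import functools
from datetime import datetime, timezone
from typing import Any, Callable

AUDIT_LOG: list[dict[str, Any]] = []   # stand-in for a durable audit store

def audited(tool_name: str) -> Callable:
    """Record every call to the wrapped tool, including failures."""
    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            entry = {
                "tool": tool_name,
                "args": repr(args),
                "kwargs": repr(kwargs),
                "at": datetime.now(timezone.utc).isoformat(),
            }
            try:
                result = fn(*args, **kwargs)
                entry["status"] = "ok"
                return result
            except Exception as exc:
                entry["status"] = f"error: {exc}"
                raise
            finally:
                AUDIT_LOG.append(entry)   # logged on success and on failure
        return wrapper
    return decorator

@audited("run_sql")
def run_sql(sql: str) -> str:
    return f"executed: {sql}"

result = run_sql("SELECT 1")
print(AUDIT_LOG[0]["tool"], AUDIT_LOG[0]["status"])
```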

Quality Scoring

Every response is scored for relevance and correctness using DeepEval metrics. Quality trends are tracked across releases to catch regressions early.

Deployment

  • LangGraph Cloud — BrightAgent is deployed on LangGraph Cloud for managed orchestration, scaling, and state persistence.
  • MCP Integration — Model Context Protocol provides validated tool execution and external service connectivity (Jira, Notion, Google Drive, OpenMetadata).
  • LLM Providers — Powered by OpenAI and Anthropic models, selected per-agent based on task requirements and cost efficiency.
  • Three Environments — Dev, staging, and production with CI/CD pipelines and evaluation gates before promotion.

Key Architectural Principles

Principle | How It’s Implemented
Agent-per-Domain | Each agent specializes in one data domain (retrieval, analysis, engineering, visualization), keeping logic focused and maintainable
Graph-Powered Context | Neo4j provides rich metadata context for every interaction (lineage, relationships, schema, ownership) via GraphRAG
Secure by Default | All data access flows through cross-account IAM roles. Agents can only reach data the user’s workspace is authorized for
Observable | Every agent interaction, tool call, and decision is traced via LangSmith and logged for debugging and audit
Human-in-the-Loop | Irreversible operations require explicit human approval. AI assists; humans decide
See the evaluation framework for how agent quality is measured, or explore integrations to see what BrightAgent connects to.