Brightbot Interactive Analysis

Brightbot’s high-level architecture is as follows:

The image illustrates the “Brightbot Interactive Analysis” system’s architecture, showing the data flow and interaction between its components.

Query

The query is in the form of a natural language question posed by the user. It serves as the starting point for the entire process.

Brightbot API

Provides an interface for interacting with the system, allowing users to send queries and receive responses.

Supervisor Agent

Responsible for parsing the query and passing the request to the appropriate agent(s).
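
The routing step can be sketched as follows. This is only an illustration: the Agent enum, the ROUTING_HINTS table, and the route function are hypothetical names, and the keyword matching stands in for whatever classification the Supervisor Agent actually performs (most likely an LLM call).

```python
import re
from enum import Enum


class Agent(Enum):
    RETRIEVAL = "retrieval"
    ANALYTICS = "analytics"
    VISUALIZATION = "visualization"
    ENGINEERING = "engineering"


# Hypothetical keyword hints (an assumption for this sketch, not Brightbot's logic).
ROUTING_HINTS = {
    Agent.ANALYTICS: ("trend", "insight", "analyze", "analysis"),
    Agent.VISUALIZATION: ("chart", "plot", "visualize"),
    Agent.ENGINEERING: ("clean", "transform", "dbt"),
}


def route(query: str) -> list[Agent]:
    """Return every agent whose hint words appear in the query.

    Falls back to the Retrieval Agent when no hint matches.
    """
    words = set(re.findall(r"[a-z]+", query.lower()))
    matched = [agent for agent, hints in ROUTING_HINTS.items() if words & set(hints)]
    return matched or [Agent.RETRIEVAL]
```

A query such as "Plot the sales trend by month" matches both the analytics and visualization hints, which reflects the "agent(s)" wording above: one query may be passed to more than one agent.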

Retrieval Agent

Implements Retrieval-Augmented Generation (RAG) to fetch relevant data from the knowledge graph and data catalog.
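
The retrieval half of RAG can be illustrated with a toy ranking function. The sketch below scores documents by token overlap with the query, a deliberately simple stand-in for the embedding or graph search a production system would likely use. The function names and the sample catalog entries are assumptions made for this example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by the number of tokens shared with the query.

    Documents with no shared token are dropped.
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]
    # sort() is stable, so documents with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


# Hypothetical catalog entries, for illustration only.
catalog = [
    "orders: one row per customer order",
    "customers: customer profile data",
    "weather: daily temperature readings",
]
relevant = retrieve("how many orders per customer", catalog)
```

The retrieved entries would then be handed on as context for generation rather than returned to the user directly.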

Knowledge Graph

A structured representation of knowledge that refines the user query and feeds into the Prompt + Context component.

Data Catalog

Provides metadata about the knowledge content and also feeds into the Prompt + Context component.

Prompt + Context

Receives input from the Knowledge Graph, Data Catalog, and Supervisor Agent, and passes the combined prompt and context to the appropriate agent.
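
One way to picture this component is as a small container that merges its three inputs into a single prompt string. The sketch below is an assumption about the shape of that step: the PromptContext class, its field names, and the rendered layout are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PromptContext:
    query: str  # request passed on by the Supervisor Agent
    graph_facts: list[str] = field(default_factory=list)      # from the Knowledge Graph
    catalog_entries: list[str] = field(default_factory=list)  # from the Data Catalog

    def render(self) -> str:
        """Build the prompt text: non-empty context sections first, then the question."""
        sections = [
            ("Knowledge graph", self.graph_facts),
            ("Data catalog", self.catalog_entries),
        ]
        lines = []
        for title, items in sections:
            if items:  # skip sections that have nothing to contribute
                lines.append(f"{title}:")
                lines.extend(f"- {item}" for item in items)
        lines.append(f"Question: {self.query}")
        return "\n".join(lines)


ctx = PromptContext(
    query="Which region grew fastest?",
    graph_facts=["orders belongs_to region"],
    catalog_entries=["orders.amount: decimal, USD"],
)
prompt = ctx.render()
```

Keeping the context sections separate from the question makes it easy to see which source contributed what before the prompt reaches an agent.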

Analytics Agent

Generates a Jupyter notebook identifying key trends in the data and providing insights.

Visualization Agent

Generates visualizations in the form of charts.

Engineering Agent

Generates DBT code for data cleaning and transformation.

Multimodal Output/Response

Depending on the type of query, the system can produce different outputs, including:

  • Jupyter notebooks (from the Analytics Agent)
  • Charts (from the Visualization Agent)
  • DBT code (from the Engineering Agent)
  • Textual responses (from the Retrieval Agent)
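
The four output kinds listed above could be modeled as a tagged union, so that callers can tell which kind of response they received. This is a minimal sketch under assumed names: the classes, their fields, and the PRODUCER table are hypothetical and only mirror the list.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class NotebookOutput:
    ipynb_json: str  # serialized Jupyter notebook


@dataclass
class ChartOutput:
    image_bytes: bytes  # rendered chart


@dataclass
class DbtOutput:
    sql: str  # DBT model source


@dataclass
class TextOutput:
    text: str  # plain textual answer


Response = Union[NotebookOutput, ChartOutput, DbtOutput, TextOutput]

# Which agent produces each output kind, following the list above.
PRODUCER = {
    NotebookOutput: "Analytics Agent",
    ChartOutput: "Visualization Agent",
    DbtOutput: "Engineering Agent",
    TextOutput: "Retrieval Agent",
}


def describe(response: Response) -> str:
    """Name the output kind and the agent that produces it."""
    kind = type(response)
    return f"{kind.__name__} from the {PRODUCER[kind]}"
```

For example, describe(TextOutput("Revenue rose in Q2.")) reports a TextOutput from the Retrieval Agent.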

The architecture describes a closed-loop system: a question initiates the process, and the analysis is generated from the various data sources and processed by the LLM embedded inside the agents. A review step in the Retrieval Agent provides quality assurance and helps ensure the output’s accuracy. The Brightbot API orchestrates the flow of data and interactions across the system.