🧱 How the Stack Works Together

Our platform uses a modular, cloud-native architecture that helps data-driven organizations ingest, govern, transform, and explore their data through a UI and a conversational interface.

🌐 Unified Orchestration via GraphQL API

At the heart of the system is a GraphQL API, deployed serverlessly on AWS Lambda, acting as the central orchestrator.
  • Exposes a unified API to the frontend.
  • Coordinates ingestion with Airbyte.
  • Integrates with Snowflake / Redshift.
  • Integrates with OpenMetadata.
  • Integrates with LangChain (BrightAgent).
  • Handles global authentication, command routing, and workflow abstraction.
  • Provides endpoints supporting different clients.
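
As a rough illustration of what a client call to the unified API could look like, the sketch below builds a standard GraphQL-over-HTTP POST request using only the Python standard library. The endpoint URL, the `dataSources` field, and the bearer-token header are assumptions made for the example; they are not taken from the platform's actual schema or authentication scheme.

```python
import json
from urllib import request

# Hypothetical endpoint: a placeholder, not the platform's real URL.
GRAPHQL_URL = "https://api.example.com/graphql"

# Hypothetical query: the `dataSources` field and its arguments are assumed.
QUERY = """
query DataSources($limit: Int!) {
  dataSources(limit: $limit) {
    id
    name
  }
}
"""


def build_graphql_request(query: str, variables: dict, token: str) -> request.Request:
    """Build (but do not send) a GraphQL-over-HTTP POST request."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return request.Request(
        GRAPHQL_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Bearer-token auth is an assumption for illustration.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_graphql_request(QUERY, {"limit": 10}, token="<token>")
# Sending it would be: request.urlopen(req), then json.load() on the response.
```

Because every client goes through one endpoint, a request like this is all a frontend or script would need in order to reach ingestion, warehousing, or metadata features.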

🧠 Metadata & Governance with Neo4j

Our Neo4j knowledge graph acts as the control plane, representing all platform entities: data sources, files, pipelines, warehouses, and transformations.
  • Powers backend operations and other platform services.
  • Supports Cypher queries for consumption by the backend / BrightAgent.
  • Enables semantic search and natural language queries via BrightAgent.
  • Serves as the metadata registry for:
    • Data Catalog.
    • Ingestion pipelines (Airbyte).
    • Warehousing (Redshift, Snowflake).
    • BrightAgent agents.
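
To make the Cypher point concrete, here is a minimal sketch of a parameterized lineage query of the kind the backend or BrightAgent might issue. The node labels (`DataSource`, `Pipeline`, `Warehouse`) and relationship types (`READS_FROM`, `LOADS_INTO`) are hypothetical; the actual graph model is not described here.

```python
from typing import Any

# Hypothetical graph model: labels and relationship types are assumptions.
LINEAGE_QUERY = """
MATCH (s:DataSource)<-[:READS_FROM]-(p:Pipeline)-[:LOADS_INTO]->(w:Warehouse {name: $warehouse})
RETURN s.name AS source, p.name AS pipeline
ORDER BY source
"""


def lineage_query(warehouse: str) -> tuple[str, dict[str, Any]]:
    """Return the Cypher text and its parameters.

    Passing values as parameters ($warehouse) rather than formatting them
    into the query string avoids injection and lets Neo4j cache the plan.
    """
    return LINEAGE_QUERY, {"warehouse": warehouse}


query, params = lineage_query("analytics")
# With a Neo4j driver session, this pair could be passed to session.run(query, params).
```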

🔌 Seamless Ingestion

Ingestion is powered by a self-hosted Airbyte instance (on EC2).
  • Coordinates with the backend (Apollo) for metadata storage.
  • Configures sources/connections via UI.
  • Schedules and monitors syncs to S3.
  • Automatically maps assets into Neo4j.
This integration ensures ingestion is scalable, trackable, and governed from the start.
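
One way to picture the asset mapping is as a function that turns a connection description into graph edges. The sketch below is an in-memory illustration only: the input dictionary shape, the node naming scheme, and the `HAS_STREAM` / `LANDS_IN` relationship names are assumptions, not the Airbyte API format or the platform's real graph schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    start: str
    rel: str
    end: str


def connection_to_edges(connection: dict) -> list[Edge]:
    """Map one ingestion connection to edges: source -> stream -> S3 location."""
    source_node = f"source:{connection['source']}"
    edges: list[Edge] = []
    for stream in connection["streams"]:
        stream_node = f"stream:{connection['source']}.{stream}"
        s3_node = f"s3:{connection['s3_prefix']}/{stream}"
        edges.append(Edge(source_node, "HAS_STREAM", stream_node))
        edges.append(Edge(stream_node, "LANDS_IN", s3_node))
    return edges


# Hypothetical connection description, for illustration only.
edges = connection_to_edges(
    {
        "source": "postgres_prod",
        "streams": ["orders", "customers"],
        "s3_prefix": "raw/postgres_prod",
    }
)
```

Each edge could then be written to the graph, so that every synced stream is traceable back to its source from the first run.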

🔄 Transformations with DBT Cloud

Once data lands in Redshift or Snowflake, it’s transformed via DBT Cloud.
  • Analysts use DBT’s transformation-as-code model.
  • Models and jobs are registered in Neo4j.

🧊 Warehousing

We support Amazon Redshift and Snowflake as analytical destinations.
  • Warehouse configurations are UI-driven.
  • Schemas are indexed in Neo4j.
  • Enables:
    • High-performance analytics.
    • Cross-source joins.
    • Governed, traceable storage.

🧾 File Uploads & Data Catalog

Users can upload structured and unstructured files (CSV, PDF, images, videos) directly via the platform.
  • Enabled through the backend (Apollo).
  • Accessible from the UI.
  • Files are stored in Amazon S3.
  • Automatically indexed in Neo4j.
  • Metadata includes file type, schema, and source relationships.
These files become searchable, queryable, and tightly integrated into the data discovery experience.
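
The metadata captured for an uploaded file can be sketched as a small record. The example below is an assumption-laden illustration: the `FileMetadata` fields, the `describe_upload` helper, and the choice to use a CSV header row as the schema are all hypothetical and are not the platform's actual indexing logic.

```python
import csv
import io
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class FileMetadata:
    s3_key: str
    file_type: str
    schema: list[str] = field(default_factory=list)        # column names, if any
    derived_from: list[str] = field(default_factory=list)  # source relationships


def describe_upload(s3_key: str, content: bytes, derived_from=()) -> FileMetadata:
    """Derive a metadata record from an uploaded file's key and bytes."""
    file_type = PurePosixPath(s3_key).suffix.lstrip(".").lower() or "unknown"
    schema: list[str] = []
    if file_type == "csv":
        # Treat the first CSV row as the header (an assumption).
        reader = csv.reader(io.StringIO(content.decode("utf-8")))
        schema = next(reader, [])
    return FileMetadata(s3_key, file_type, schema, list(derived_from))


meta = describe_upload(
    "uploads/orders.csv",
    b"id,amount\n1,9.5\n",
    derived_from=["source:postgres_prod"],
)
```

Unstructured files such as PDFs or images would simply carry an empty schema in this sketch.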

💬 Conversational Intelligence with BrightAgent

BrightAgent is our AI-powered multi-agent system that interacts primarily with the backend (Apollo):
  • Stores data in and retrieves data from Neo4j.
  • Available in the UI.
  • The agents below interact directly with Apollo endpoints:
    • Supervisor Agent: Parses user intent and delegates tasks.
    • Retrieval Agent: Uses RAG to fetch relevant metadata.
    • Engineering Agent: Generates DBT code.
    • Analytics Agent: Builds Jupyter notebooks.
    • Visualization Agent: Creates charts and dashboards.
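
The supervisor pattern listed above can be sketched as a simple dispatcher. This is a toy illustration only: intent is detected here with keyword matching as a stand-in for the LLM-based parsing that BrightAgent presumably performs, and the agent functions, keywords, and return strings are all hypothetical.

```python
from typing import Callable


# Stub agents: each returns a description of what it would do.
def retrieval_agent(task: str) -> str:
    return f"retrieval: fetch metadata relevant to '{task}'"


def engineering_agent(task: str) -> str:
    return f"engineering: generate DBT code for '{task}'"


def analytics_agent(task: str) -> str:
    return f"analytics: build a notebook for '{task}'"


def visualization_agent(task: str) -> str:
    return f"visualization: create a chart for '{task}'"


# Keyword routing table (an assumption; a real supervisor would likely use an LLM).
ROUTES: list[tuple[tuple[str, ...], Callable[[str], str]]] = [
    (("dbt", "model", "transform"), engineering_agent),
    (("notebook", "analyze", "analysis"), analytics_agent),
    (("chart", "dashboard", "plot"), visualization_agent),
]


def supervisor(user_request: str) -> str:
    """Pick the first agent whose keywords match; fall back to retrieval."""
    text = user_request.lower()
    for keywords, agent in ROUTES:
        if any(keyword in text for keyword in keywords):
            return agent(user_request)
    return retrieval_agent(user_request)


result = supervisor("Create a chart of monthly revenue")
```

Keeping the routing table separate from the agents means a new agent can be added without touching the supervisor's control flow.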

🧩 Modular and Extensible by Design

Each system component is decoupled and modular, including:
  • Ingestion: Airbyte.
  • Transformation: DBT.
  • Metadata: Neo4j.
  • Storage: S3, Redshift, Snowflake.
  • Access: GraphQL API, BrightAgent.
This design ensures the platform is:
  • Extensible to new tools and sources.
  • Cloud-agnostic and flexibly deployable.
  • Developer-friendly, with schema-based APIs and integrations.