How the Stack Works Together

Our platform uses a modular, cloud-native architecture that helps data-driven organizations ingest, govern, transform, and explore their data through a web UI and a conversational interface.


Unified Orchestration via GraphQL API

At the heart of the system is a GraphQL API, deployed serverlessly on AWS Lambda, acting as the central orchestrator.

  • Exposes a unified API to the frontend.
  • Coordinates ingestion with Airbyte.
  • Integrates with Snowflake / Redshift.
  • Integrates with OpenMetadata.
  • Integrates with LangChain (BrightBot).
  • Handles global authentication, command routing, and workflow abstraction.
  • Provides endpoints supporting different clients.
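
As a rough illustration of how a client might call the unified API, the sketch below assembles a GraphQL-over-HTTP request. The endpoint URL, the `pipelines` query, its fields, and the bearer-token header are all assumptions made for this example; the actual schema and authentication scheme are not described here.

```python
import json

# Hypothetical endpoint; the real URL is not given in this document.
GRAPHQL_URL = "https://api.example.com/graphql"

# Hypothetical query: field and argument names are illustrative only.
LIST_PIPELINES_QUERY = """
query ListPipelines($workspaceId: ID!) {
  pipelines(workspaceId: $workspaceId) {
    id
    name
    status
  }
}
"""


def build_graphql_request(query: str, variables: dict, token: str) -> dict:
    """Build the URL, headers, and JSON body of a GraphQL POST request."""
    return {
        "url": GRAPHQL_URL,
        "headers": {
            "Content-Type": "application/json",
            # Bearer-token auth is an assumption, not a documented detail.
            "Authorization": f"Bearer {token}",
        },
        "body": json.dumps({"query": query, "variables": variables}),
    }


request = build_graphql_request(
    LIST_PIPELINES_QUERY, {"workspaceId": "ws-123"}, token="example-token"
)
```

Because every client sends the same request shape to one endpoint, the frontend and BrightBot could share a helper like this instead of talking to each integration separately.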

Metadata & Governance with Neo4j

Our Neo4j knowledge graph acts as the control plane, representing all platform entities: data sources, files, pipelines, warehouses, and transformations.

  • Powers backend operations and the other platform services.
  • Supports Cypher queries issued by the backend and BrightBot.
  • Enables semantic search and natural language queries via BrightBot.
  • Serves as the metadata registry for:
    • Data Catalog.
    • Ingestion pipelines (Airbyte).
    • Warehousing (Redshift, Snowflake).
    • BrightBot agents.
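
To make the Cypher point concrete, here is a minimal sketch of a lineage lookup the backend or BrightBot might issue. The `DataSource` label, the `FEEDS` relationship, and the `name` property are hypothetical; the real graph schema is not described in this document.

```python
def downstream_assets_query(source_name: str, max_depth: int = 3) -> tuple[str, dict]:
    """Return a Cypher query and parameters listing assets fed by a source."""
    if not 1 <= max_depth <= 10:
        raise ValueError("max_depth must be between 1 and 10")
    # The hop bound is written as a literal, so it is validated as a small
    # integer first; the user-supplied name is passed as a query parameter.
    query = (
        "MATCH (s:DataSource {name: $name})-[:FEEDS*1.." + str(max_depth) + "]->(a) "
        "RETURN DISTINCT labels(a) AS labels, a.name AS name"
    )
    return query, {"name": source_name}


query, params = downstream_assets_query("postgres_crm", max_depth=2)
```

The query text and parameters are kept separate so that they can be handed to a Neo4j driver session as-is, without string-formatting user input into the query.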

Seamless Ingestion

Ingestion is powered by a self-hosted Airbyte instance (on EC2).

  • Coordinates with the backend (Apollo) for metadata storage.
  • Configures sources/connections via UI.
  • Schedules and monitors syncs to S3.
  • Automatically maps ingested assets into Neo4j.

This integration ensures ingestion is scalable, trackable, and governed from the start.
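
One way to picture the mapping of ingested assets into the graph is shown below. The `connection` dictionary is a simplified, hypothetical shape (not Airbyte's API payload), and the node prefixes and relationship names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    start: str
    rel: str
    end: str


def connection_to_edges(connection: dict) -> list[Edge]:
    """Translate one ingestion connection into graph edges."""
    source = f"DataSource:{connection['source']}"
    pipeline = f"Pipeline:{connection['name']}"
    edges = [Edge(source, "INGESTED_BY", pipeline)]
    for stream in connection["streams"]:
        # Assumes each synced stream lands under its own prefix in the bucket.
        asset = f"S3Asset:s3://{connection['bucket']}/{stream}/"
        edges.append(Edge(pipeline, "WRITES_TO", asset))
    return edges


edges = connection_to_edges(
    {"name": "crm_sync", "source": "postgres_crm", "bucket": "raw", "streams": ["users", "orders"]}
)
```

Recording the source, the pipeline, and each landed asset as linked nodes is what would let a later query trace a file in S3 back to the system it came from.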


Transformations with DBT Cloud

Once data lands in Redshift or Snowflake, it's transformed via DBT Cloud.

  • Analysts use DBT's transformation-as-code model.
  • Models and jobs are registered in Neo4j.
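
A possible way to derive the model relationships that get registered is to read the `manifest.json` artifact that DBT produces, where each node lists its parents under `depends_on.nodes`. Whether the platform actually reads the manifest is an assumption; the sample project and model names below are made up.

```python
def model_lineage(manifest: dict) -> list[tuple[str, str]]:
    """Return sorted (upstream, downstream) pairs for every model node."""
    edges = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and so on
        for parent in node.get("depends_on", {}).get("nodes", []):
            edges.append((parent, unique_id))
    return sorted(edges)


# Trimmed-down, invented manifest containing only the keys used above.
sample_manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
        "test.shop.not_null_orders_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.shop.orders"]},
        },
    }
}

lineage = model_lineage(sample_manifest)
```

Each pair could then be written to Neo4j as a dependency relationship between two model nodes.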

Warehousing

We support Amazon Redshift and Snowflake as analytical destinations.

  • Warehouse configurations are UI-driven.
  • Schemas are indexed in Neo4j.
  • Enables:
    • High-performance analytics.
    • Cross-source joins.
    • Governed, traceable storage.
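
Schema indexing can be sketched as grouping column listings by table before they are written to the graph. The row shape below (schema, table, column, data type) mirrors what a query against a warehouse's information schema typically returns; how the platform actually reads schemas is an assumption.

```python
from collections import defaultdict


def index_columns(rows: list[tuple[str, str, str, str]]) -> dict:
    """Group (schema, table, column, data_type) rows into a nested index."""
    index: dict = defaultdict(dict)
    for schema, table, column, data_type in rows:
        index[f"{schema}.{table}"][column] = data_type
    return dict(index)


# Invented sample rows for illustration.
rows = [
    ("public", "orders", "id", "integer"),
    ("public", "orders", "total", "numeric"),
    ("public", "users", "id", "integer"),
]
schema_index = index_columns(rows)
```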

File Uploads & Data Catalog

Users can upload structured and unstructured files (CSV, PDF, images, videos) directly via the platform.

  • Enabled through the backend (Apollo).
  • Accessible from the UI.
  • Files are stored in Amazon S3.
  • Automatically indexed in Neo4j.
  • Metadata includes file type, schema, and source relationships.

These files become searchable, queryable, and tightly integrated into the data discovery experience.
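
The kind of metadata record described above might be assembled as follows. The suffix-to-type mapping, the record keys, and the rule of treating a CSV's first row as its schema are assumptions for this sketch, not the platform's documented behavior.

```python
import csv
import io
from pathlib import PurePosixPath

# Illustrative mapping only; a real deployment would cover more types.
KIND_BY_SUFFIX = {
    ".csv": "tabular",
    ".pdf": "document",
    ".png": "image",
    ".jpg": "image",
    ".mp4": "video",
}


def describe_upload(key: str, content: bytes) -> dict:
    """Build a metadata record for an uploaded file stored under `key`."""
    suffix = PurePosixPath(key).suffix.lower()
    kind = KIND_BY_SUFFIX.get(suffix, "unknown")
    record = {"s3_key": key, "file_type": kind, "size_bytes": len(content), "schema": None}
    if kind == "tabular":
        # For CSV files, treat the first row as the column names.
        reader = csv.reader(io.StringIO(content.decode("utf-8")))
        record["schema"] = next(reader, [])
    return record


csv_record = describe_upload("uploads/sales.csv", b"id,amount\n1,9.5\n")
pdf_record = describe_upload("uploads/report.PDF", b"%PDF")
```

Unstructured files such as PDFs and videos get a record without a schema, so they can still be indexed and linked to their source.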

Conversational Intelligence with BrightBot

BrightBot is our AI-powered multi-agent system that interacts primarily with the backend (Apollo):

  • Stores and retrieves data to / from Neo4j.
  • Available in the UI.
  • The following agents interact directly with Apollo endpoints:
    • Supervisor Agent: Parses user intent and delegates tasks.
    • Retrieval Agent: Uses RAG to fetch relevant metadata.
    • Engineering Agent: Generates DBT code.
    • Analytics Agent: Builds Jupyter notebooks.
    • Visualization Agent: Creates charts and dashboards.
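
The delegation step performed by the Supervisor Agent can be pictured with a toy router. This is a deliberately simplified stand-in: the keyword lists are invented, and the real agent presumably relies on a language model through LangChain rather than keyword matching.

```python
# Each entry pairs an agent name with invented trigger keywords.
ROUTES = [
    ("engineering", ("dbt", "model", "transform")),
    ("analytics", ("notebook", "analyze", "analysis")),
    ("visualization", ("chart", "dashboard", "plot")),
]


def route(message: str) -> str:
    """Pick the agent for a user message; fall back to retrieval."""
    text = message.lower()
    for agent, keywords in ROUTES:
        if any(word in text for word in keywords):
            return agent
    # Questions that match no specialist go to metadata retrieval.
    return "retrieval"


chosen = route("Generate a DBT model for orders")
```

Falling back to the Retrieval Agent means an unrecognized request is still answered from metadata rather than rejected.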

Modular and Extensible by Design

Each system component is decoupled and modular, including:

  • Ingestion: Airbyte.
  • Transformation: DBT.
  • Metadata: Neo4j.
  • Storage: S3, Redshift, Snowflake.
  • Access: GraphQL API, BrightBot.

This design ensures the platform is:

  • Extensible to new tools and sources.
  • Cloud-agnostic and flexibly deployable.
  • Developer-friendly, with schema-based APIs and integrations.