🧱 How the Stack Works Together
Our platform leverages a modular, cloud-native architecture designed to help data-driven organizations ingest, govern, transform, and explore their data, through a seamless UI and a powerful conversational interface.
🌐 Unified Orchestration via GraphQL API
At the heart of the system is a GraphQL API, deployed serverlessly on AWS Lambda, acting as the central orchestrator.
- Exposes a unified API to the frontend.
- Coordinates ingestion with Airbyte.
- Integrates with Snowflake / Redshift.
- Integrates with OpenMetadata.
- Integrates with LangChain (BrightAgent).
- Handles global authentication, command routing, and workflow abstraction.
- Provides endpoints supporting different clients.
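
As a rough illustration of what a unified API means for a client, the sketch below builds the JSON body of one GraphQL request that touches more than one subsystem. The query name and the fields (`dataSources`, `pipelines`, `lastSyncStatus`) are hypothetical and not taken from the platform's schema; only the `query` / `variables` / `operationName` envelope is the usual GraphQL-over-HTTP convention.

```python
import json

# Hypothetical query: the field names below are illustrative only and are
# not taken from the platform's real schema.
PLATFORM_OVERVIEW_QUERY = """
query PlatformOverview($limit: Int!) {
  dataSources(limit: $limit) { id name }
  pipelines(limit: $limit) { id lastSyncStatus }
}
"""


def build_graphql_request(query: str, variables: dict, operation_name: str) -> str:
    """Serialize a GraphQL operation into the JSON body sent over HTTP POST."""
    payload = {
        "query": query.strip(),
        "variables": variables,
        "operationName": operation_name,
    }
    return json.dumps(payload)


body = build_graphql_request(PLATFORM_OVERVIEW_QUERY, {"limit": 10}, "PlatformOverview")
```

A single round trip like this is the main appeal of routing everything through one API: the client does not need to know which backing service answers each field.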
🧠 Metadata & Governance with Neo4j
Our Neo4j knowledge graph acts as the control plane, representing all platform entities: data sources, files, pipelines, warehouses, and transformations.
- Powers backend operations and other platform services.
- Supports Cypher queries from the backend and BrightAgent.
- Enables semantic search and natural language queries via BrightAgent.
- Serves as the metadata registry for:
  - Data Catalog.
  - Ingestion pipelines (Airbyte).
  - Warehousing (Redshift, Snowflake).
  - BrightAgent agents.
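
To make the Cypher access pattern concrete, here is a minimal sketch of a parameterized lineage query. The node labels (`DataSource`, `Pipeline`, `Warehouse`) and relationship types (`FEEDS`, `LOADS_INTO`) are assumptions chosen for illustration; the platform's actual graph model is not described here.

```python
from typing import Any

# Hypothetical graph model: the node labels and relationship types here are
# assumptions made for illustration, not the platform's actual schema.
LINEAGE_QUERY = (
    "MATCH (s:DataSource {name: $source_name})"
    "-[:FEEDS]->(p:Pipeline)-[:LOADS_INTO]->(w:Warehouse) "
    "RETURN s.name AS source, p.name AS pipeline, w.name AS warehouse"
)


def lineage_query(source_name: str) -> tuple[str, dict[str, Any]]:
    """Return a parameterized Cypher query tracing a source to its warehouse."""
    # Passing the name as a parameter (rather than formatting it into the
    # query text) keeps user input out of the Cypher string itself.
    return LINEAGE_QUERY, {"source_name": source_name}


query, params = lineage_query("salesforce_prod")
```

The returned pair is in the shape a Neo4j driver session typically accepts: query text plus a parameter map.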
🔌 Seamless Ingestion
Ingestion is powered by a self-hosted Airbyte instance (on EC2).
- Coordinates with the backend (Apollo) for metadata storage.
- Configures sources/connections via UI.
- Schedules and monitors syncs to S3.
- Automatically maps assets into Neo4j.
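
One way to picture the asset mapping step is as a pure function from sync results to graph records. The sketch below assumes a simplified `SyncedStream` summary (a hypothetical type, not Airbyte's API response) and hypothetical `Connection` / `Asset` labels with a `PRODUCES` relationship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyncedStream:
    """Hypothetical summary of one stream written by an ingestion sync."""
    connection_id: str
    stream_name: str
    s3_prefix: str


def to_graph_records(streams: list[SyncedStream]) -> tuple[list[dict], list[dict]]:
    """Turn synced streams into node and relationship records for the graph."""
    nodes: dict[str, dict] = {}
    edges: list[dict] = []
    for s in streams:
        conn_key = f"connection:{s.connection_id}"
        asset_key = f"asset:{s.s3_prefix}"
        # Keying by id de-duplicates a connection shared by several streams.
        nodes[conn_key] = {"key": conn_key, "label": "Connection"}
        nodes[asset_key] = {"key": asset_key, "label": "Asset", "name": s.stream_name}
        edges.append({"from": conn_key, "to": asset_key, "type": "PRODUCES"})
    return list(nodes.values()), edges


nodes, edges = to_graph_records([
    SyncedStream("c1", "orders", "s3://example-bucket/raw/orders/"),
    SyncedStream("c1", "customers", "s3://example-bucket/raw/customers/"),
])
```

Here two streams from the same connection yield three nodes (one connection, two assets) and two relationships.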
🔄 Transformations with DBT Cloud
Once data lands in Redshift or Snowflake, it’s transformed via DBT Cloud.
- Analysts use DBT’s transformation-as-code model.
- Models and jobs are registered in Neo4j.
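
As a sketch of how model lineage could be derived before registration, the function below reads the `nodes` section of a dbt `manifest.json` artifact and returns upstream/downstream pairs. It assumes the manifest is available from the job run; how the platform actually collects and registers models is not specified here, and the sample manifest is a trimmed stand-in.

```python
def model_lineage(manifest: dict) -> list[tuple[str, str]]:
    """Return (upstream, downstream) unique_id pairs for every dbt model."""
    edges = []
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model":
            continue  # skip tests, seeds, snapshots, and other resources
        for parent in node.get("depends_on", {}).get("nodes", []):
            edges.append((parent, unique_id))
    return sorted(edges)


# Trimmed-down stand-in for a real manifest; only the fields used above.
sample_manifest = {
    "nodes": {
        "model.shop.stg_orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["source.shop.raw.orders"]},
        },
        "model.shop.fct_orders": {
            "resource_type": "model",
            "depends_on": {"nodes": ["model.shop.stg_orders"]},
        },
        "test.shop.not_null_orders_id": {
            "resource_type": "test",
            "depends_on": {"nodes": ["model.shop.fct_orders"]},
        },
    }
}

edges = model_lineage(sample_manifest)
```

Each pair maps naturally onto a dependency relationship between two nodes in the knowledge graph.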
🧊 Warehousing
We support Amazon Redshift and Snowflake as analytical destinations.
- Warehouse configurations are UI-driven.
- Schemas are indexed in Neo4j.
- Enables:
  - High-performance analytics.
  - Cross-source joins.
  - Governed, traceable storage.
🧾 File Uploads & Data Catalog
Users can upload structured and unstructured files (CSV, PDF, images, videos) directly via the platform.
- Enabled through the backend (Apollo).
- Accessible from the UI.
- Files are stored in Amazon S3.
- Automatically indexed in Neo4j.
- Metadata includes file type, schema, and source relationships.
These files become searchable, queryable, and tightly integrated into the data discovery experience.
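
The kind of metadata described above can be sketched with the standard library alone. The `describe_upload` helper and its record fields are hypothetical; the sketch guesses the file type from the name and, for CSV files only, treats the first row as a minimal schema.

```python
import csv
import io
import mimetypes


def describe_upload(filename: str, content: bytes) -> dict:
    """Build a catalog record for an uploaded file."""
    mime_type, _ = mimetypes.guess_type(filename)
    record = {
        "filename": filename,
        "file_type": mime_type or "application/octet-stream",
        "size_bytes": len(content),
        "columns": None,  # only populated for tabular files
    }
    if filename.lower().endswith(".csv"):
        # Treat the first row as the header to get a minimal schema.
        reader = csv.reader(io.StringIO(content.decode("utf-8")))
        record["columns"] = next(reader, [])
    return record


csv_record = describe_upload("orders.csv", b"order_id,amount\n1,9.99\n")
pdf_record = describe_upload("report.pdf", b"%PDF-1.7")
```

Unstructured files such as PDFs or images get a record without columns, so they can still be indexed and linked to their sources.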
💬 Conversational Intelligence with BrightAgent
BrightAgent is our AI-powered multi-agent system that interacts primarily with the backend (Apollo):
- Stores and retrieves data to / from Neo4j.
- Available in the UI.
- The agents below interact directly with Apollo endpoints:
  - Supervisor Agent: Parses user intent and delegates tasks.
  - Retrieval Agent: Uses RAG to fetch relevant metadata.
  - Engineering Agent: Generates DBT code.
  - Analytics Agent: Builds Jupyter notebooks.
  - Visualization Agent: Creates charts and dashboards.
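
The supervisor's delegate-by-intent role can be illustrated with a deliberately simple router. This is not how BrightAgent is implemented: the keyword table is hypothetical, and a real supervisor would more plausibly classify intent with a language model than with substring matching.

```python
# Hypothetical keyword table; substring matching stands in for whatever
# intent classification the real supervisor performs.
ROUTES: list[tuple[tuple[str, ...], str]] = [
    (("chart", "plot", "dashboard"), "visualization"),
    (("dbt", "model", "transform"), "engineering"),
    (("notebook", "analyze", "analysis"), "analytics"),
]


def route(message: str) -> str:
    """Pick the agent that should handle a user message."""
    text = message.lower()
    for keywords, agent in ROUTES:
        if any(keyword in text for keyword in keywords):
            return agent
    # Anything unmatched falls back to metadata retrieval.
    return "retrieval"


choices = [
    route("Plot revenue by month"),
    route("Write a dbt model for orders"),
    route("Which tables contain customer emails?"),
]
```

Defaulting to retrieval means an unrecognized request still gets grounded in catalog metadata instead of failing outright.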
🧩 Modular and Extensible by Design
Each system component is decoupled and modular, including:
- Ingestion: Airbyte.
- Transformation: DBT.
- Metadata: Neo4j.
- Storage: S3, Redshift, Snowflake.
- Access: GraphQL API, BrightAgent.
The architecture is also:
- Extensible to new tools and sources.
- Cloud-agnostic and flexibly deployable.
- Developer-friendly, with schema-based APIs and integrations.

