Overview
Every Brighthive workspace gets a dedicated data warehouse deployed in its own AWS account. Redshift Serverless is the primary warehouse, with Snowflake available via Datapiary for organizations that need it.
Redshift Serverless
Auto-Scaling
Serverless compute scales automatically based on query workload — no capacity planning or cluster management required.
3 Availability Zones
Deployed across 3 AZs for high availability and fault tolerance within each workspace’s dedicated VPC.
Schema-per-Organization
Each organization’s data lives in its own Redshift schema, providing logical isolation within the shared workspace warehouse.
REST API Access
Lambda-backed REST API enables the platform and BrightAgent to execute queries programmatically against your warehouse.
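Brighthive's own REST endpoints are not documented in this section, so the following is only a minimal sketch of how a Lambda-style handler might run a query against a Redshift Serverless workgroup using the Redshift Data API. The function name `run_query`, the workgroup and database names, and the polling interval are illustrative assumptions, not part of the platform. The client is passed in so the function can be exercised without AWS credentials.

```python
import time


def run_query(client, workgroup: str, database: str, sql: str,
              poll_seconds: float = 0.5) -> list[dict]:
    """Run one SELECT-style statement via the Redshift Data API; return rows as dicts.

    `client` is expected to behave like boto3.client("redshift-data").
    Hypothetical helper for illustration; result pagination is not handled.
    """
    statement_id = client.execute_statement(
        WorkgroupName=workgroup, Database=database, Sql=sql
    )["Id"]

    # The Data API is asynchronous, so poll until the statement settles.
    while True:
        desc = client.describe_statement(Id=statement_id)
        if desc["Status"] == "FINISHED":
            break
        if desc["Status"] in ("FAILED", "ABORTED"):
            raise RuntimeError(desc.get("Error", desc["Status"]))
        time.sleep(poll_seconds)

    result = client.get_statement_result(Id=statement_id)
    columns = [col["name"] for col in result["ColumnMetadata"]]
    rows = []
    for record in result["Records"]:
        # Each field is a one-key dict such as {"stringValue": "x"} or {"isNull": True}.
        values = [None if "isNull" in field else next(iter(field.values()))
                  for field in record]
        rows.append(dict(zip(columns, values)))
    return rows


# Usage, assuming boto3 is installed and AWS credentials are configured
# (the workgroup and database names are placeholders):
#   import boto3
#   rows = run_query(boto3.client("redshift-data"), "my-workgroup", "dev",
#                    "SELECT 1 AS ok")
```

Injecting the client keeps the polling and row-decoding logic testable with a stub. A real handler would also need authentication, timeouts, and pagination of large result sets.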
Cross-Account Data Access
Redshift in your workspace account queries organization data stored in separate AWS accounts using cross-account IAM roles:
- OrgDataCatalogRole is an IAM role in each organization’s account that trusts the workspace’s Redshift role.
- Redshift Spectrum queries S3 data directly via external tables — no data copying required.
- Glue Data Catalog provides schema metadata for these external tables.
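The trust relationship described above can be sketched as an IAM trust policy attached to OrgDataCatalogRole. This is an illustrative assumption about its shape, not the policy Brighthive actually deploys: the account ID and the name `WorkspaceRedshiftRole` are placeholders, and a real policy may add conditions such as an external ID.

```python
import json


def build_trust_policy(workspace_redshift_role_arn: str) -> dict:
    """Trust policy that lets the workspace's Redshift role assume a role
    in the organization's account (hypothetical helper for illustration)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": workspace_redshift_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder account ID and role name, for illustration only.
policy = build_trust_policy(
    "arn:aws:iam::111111111111:role/WorkspaceRedshiftRole"
)
print(json.dumps(policy, indent=2))
```

The trust policy only controls who may assume the role. The permissions that grant read access to the organization's S3 data and Glue Data Catalog would live in a separate policy attached to the same role.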
Redshift Spectrum
Redshift Spectrum enables querying data directly in S3 without loading it into Redshift tables. This is used for:
- Querying large datasets that don’t need to be materialized in the warehouse.
- Accessing the latest organization data immediately after upload (via Glue catalog references).
- Accessing the latest organization data immediately after upload (via Glue catalog references).
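As a rough sketch of how Spectrum can be pointed at an organization's Glue database, the snippet below builds a `CREATE EXTERNAL SCHEMA` statement that chains the workspace role to OrgDataCatalogRole. The schema name, Glue database name, region, account IDs, and `WorkspaceRedshiftRole` are all placeholder assumptions, and depending on the setup additional clauses may be needed.

```python
def external_schema_sql(schema: str, glue_database: str, region: str,
                        workspace_role_arn: str, org_role_arn: str) -> str:
    """Build a CREATE EXTERNAL SCHEMA statement that chains two IAM roles.

    Hypothetical helper for illustration; all argument values are placeholders.
    """
    # Reject anything that is not a plain identifier, since the value is
    # interpolated directly into SQL.
    for name in (schema, glue_database):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    # Redshift chains roles given as comma-separated ARNs with no spaces:
    # it assumes the first role, which then assumes the second.
    role_chain = f"{workspace_role_arn},{org_role_arn}"
    return (
        f"CREATE EXTERNAL SCHEMA IF NOT EXISTS {schema}\n"
        f"FROM DATA CATALOG DATABASE '{glue_database}'\n"
        f"REGION '{region}'\n"
        f"IAM_ROLE '{role_chain}';"
    )


sql = external_schema_sql(
    schema="org_acme_ext",
    glue_database="org_acme",
    region="us-east-1",
    workspace_role_arn="arn:aws:iam::111111111111:role/WorkspaceRedshiftRole",
    org_role_arn="arn:aws:iam::222222222222:role/OrgDataCatalogRole",
)
print(sql)
```

Once such a schema exists, tables registered in the Glue database can be queried like ordinary tables (for example `SELECT ... FROM org_acme_ext.some_table`), with the data read from S3 at query time.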
Snowflake (via Datapiary)
For organizations that need Snowflake alongside Redshift, Brighthive provides Snowflake integration through Datapiary:
- Organizations can sync data from their S3 data lake to Snowflake.
- Snowflake assumes the OrgDataCatalogRole to access organization S3 data via cross-account IAM.
- DBT Cloud transformations can run against Snowflake in addition to Redshift.
How Data Gets Into Your Warehouse
- Organization uploads data to their S3 data lake.
- Glue crawlers auto-detect the schema and update the Glue Data Catalog.
- Redshift Spectrum creates external tables that point to the organization’s S3 data and its Glue Data Catalog entries.
- Metadata is synced to Neo4j, making the data discoverable by BrightAgent and the webapp.
- Optionally, data is synced to Snowflake for organizations that use it.

