SQream Blue Documentation

SQream Blue is a cloud-native, fully managed data lakehouse built for fast, reliable, and cost-effective data processing, powered by a patented GPU-acceleration engine. The platform simplifies preparing and transforming data to and from the data lake, enabling faster analytics and AI/ML.

Capabilities

Architecture

SQream Blue utilizes direct access to data in open-standard formats, eliminating the need for data ingestion or movement. Data remains in the customer’s low-cost cloud storage throughout the preparation cycle, ensuring privacy, ownership, and a single source of truth, while eliminating data duplication.

Parallelism

SQream Blue uses the GPU to achieve parallel data processing. By breaking large tasks into smaller processes, SQream Blue distributes operations across multiple GPU cores, allowing administrators to balance parallelism and concurrency according to their business needs.
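
The general idea of splitting one large task into smaller units that run in parallel can be illustrated with a minimal Python sketch. This is not SQream Blue's implementation: CPU threads stand in for GPU cores, and the function names `split_into_chunks` and `parallel_sum`, along with the `chunk_size` and `workers` parameters, are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, chunk_size):
    """Break one large input into fixed-size chunks (the smaller units of work)."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def parallel_sum(values, chunk_size=1000, workers=4):
    """Sum each chunk on a separate worker, then combine the partial results.

    Assumption: CPU threads are only a stand-in for GPU cores here.
    """
    chunks = split_into_chunks(values, chunk_size)
    if not chunks:
        return 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is reduced independently; the order of results matches the chunks.
        partial_sums = list(pool.map(sum, chunks))
    # A final, cheap reduction merges the per-chunk results.
    return sum(partial_sums)


if __name__ == "__main__":
    data = list(range(10_000))
    print(parallel_sum(data, chunk_size=1000, workers=4))
```

In this sketch, `workers` loosely mirrors the trade-off described above: giving one task more workers increases its parallelism, while leaving workers free allows more tasks to run concurrently.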

Connectivity

SQream Blue easily integrates with common open-source workflow management and orchestration tools, such as Apache Airflow, Dagster, and Prefect. It also supports industry-standard JDBC and Python connectors, and provides a REST API for cluster management.

Optimizations

Optimized for Apache Parquet

SQream Blue’s processing engine utilizes Parquet’s column-oriented structure and metadata to avoid unnecessary data reads, resulting in optimized processing times.
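
One common way an engine can use Parquet metadata to skip reads is to check the minimum and maximum statistics that Parquet stores for each column within a row group. The sketch below models that idea in plain Python. It assumes nothing about SQream Blue's internals: the `RowGroup` class and the `scan_greater_than` function are hypothetical, and the in-memory lists stand in for data that would normally be read from storage.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical stand-in for one Parquet row group of a single column."""
    min_value: int   # statistic kept in the file metadata
    max_value: int   # statistic kept in the file metadata
    rows: list       # the column values; reading these is the expensive part


def scan_greater_than(row_groups, threshold):
    """Return values greater than threshold and the number of row groups read.

    A row group is skipped when its max statistic proves that no row can match.
    """
    matches = []
    groups_read = 0
    for group in row_groups:
        if group.max_value <= threshold:
            continue  # metadata alone rules this row group out
        groups_read += 1
        matches.extend(v for v in group.rows if v > threshold)
    return matches, groups_read


if __name__ == "__main__":
    groups = [
        RowGroup(1, 10, [1, 5, 10]),
        RowGroup(11, 20, [11, 15, 20]),
        RowGroup(21, 30, [21, 25, 30]),
    ]
    values, read = scan_greater_than(groups, threshold=18)
    print(values, read)
```

With a threshold of 18, the first row group is never read because its maximum is 10. The column-oriented layout adds a second saving that this sketch does not model: columns that a query does not reference need not be read at all.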

GPU Optimization Engine

SQream Blue’s performance relies on a patented GPU acceleration technology that synchronizes all available resources (CPU, GPU, RAM) and utilizes the GPU’s processing power for even the most complex analytical tasks.