The managed Apache Flink service on the LakeSail platform is deprecated.

LakeSail is building Sail, an open-source computation framework in Rust to seamlessly integrate stream-processing, batch-processing, and compute-intensive (AI) workloads. The LakeSail platform will offer the managed solution for Sail. Existing PySpark and Flink SQL workloads can be migrated with ease. Please stay tuned and contact us if you are interested!

Architecture

The LakeSail platform consists of an API server and a data store. The API server interacts with Kubernetes clusters to manage Flink applications and sessions, and manages organization resources in the data store.

Components

API Server

The API server hosts the REST API and the web console. It connects to Kubernetes clusters registered by organization administrators and manages Flink applications and sessions in the clusters. The API server is released as a single binary and can be deployed to container environments or run locally.

Data Store

The data store keeps track of organization resources such as users, workspaces, and Kubernetes cluster configuration. Note that Flink applications and sessions are represented as Kubernetes custom resources and their states are not stored in the data store.

INFO

  • For Community Edition, we offer an in-memory data store so that you can try out LakeSail locally. The in-memory data store only allows a single user and a single workspace, and does not persist data across server restarts.

  • For Enterprise Edition, we support the PostgreSQL database as the data store.
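
To illustrate the Community Edition constraints described above, here is a minimal sketch of an in-memory store that accepts a single user and a single workspace and keeps nothing across restarts. The class and method names (`InMemoryDataStore`, `add_user`, `add_workspace`, `LimitExceeded`) are hypothetical and chosen for illustration only; this is not the LakeSail implementation.

```python
from dataclasses import dataclass, field


class LimitExceeded(Exception):
    """Raised when a second user or workspace would be created."""


@dataclass
class InMemoryDataStore:
    # Hypothetical sketch: all state lives in plain dictionaries on the
    # instance, so it disappears when the server process exits.
    users: dict = field(default_factory=dict)
    workspaces: dict = field(default_factory=dict)

    def add_user(self, user_id: str, name: str) -> None:
        # Updating the existing user is fine; adding a second one is not.
        if user_id not in self.users and len(self.users) >= 1:
            raise LimitExceeded("the in-memory store allows a single user")
        self.users[user_id] = name

    def add_workspace(self, workspace_id: str, name: str) -> None:
        # Same rule for workspaces: at most one entry.
        if workspace_id not in self.workspaces and len(self.workspaces) >= 1:
            raise LimitExceeded("the in-memory store allows a single workspace")
        self.workspaces[workspace_id] = name


store = InMemoryDataStore()
store.add_user("u1", "admin")
store.add_workspace("w1", "default")

# A server restart corresponds to creating a fresh, empty store.
restarted = InMemoryDataStore()
```

A persistent store such as PostgreSQL removes both limits because state lives outside the server process.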

Kubernetes Cluster

A Kubernetes cluster registered with LakeSail runs a few Helm charts, including one for the Flink Kubernetes Operator. The Flink Kubernetes Operator registers custom resource definitions (CRDs) that describe Flink clusters and jobs, and manages their lifecycle. The LakeSail API interacts with these custom resources via the Kubernetes API.
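
As a rough illustration of what such a custom resource looks like, the sketch below builds a `FlinkDeployment` manifest in the shape used by the Flink Kubernetes Operator (API group `flink.apache.org`, version `v1beta1`) and computes the Kubernetes API path under which resources of that kind are listed or created. The helper names, the resource sizes, the Flink version, and the `flink` service account are assumptions for illustration; how the LakeSail API server actually constructs these resources is not described here.

```python
import json

FLINK_API_GROUP = "flink.apache.org"
FLINK_API_VERSION = "v1beta1"


def flink_deployment_manifest(name, namespace, image, jar_uri, parallelism=2):
    """Build a FlinkDeployment custom resource as a plain dictionary.

    Hypothetical helper; the field values below are illustrative defaults.
    """
    return {
        "apiVersion": f"{FLINK_API_GROUP}/{FLINK_API_VERSION}",
        "kind": "FlinkDeployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "image": image,
            "flinkVersion": "v1_17",
            "flinkConfiguration": {"taskmanager.numberOfTaskSlots": "2"},
            "serviceAccount": "flink",  # assumed service account name
            "jobManager": {"resource": {"memory": "2048m", "cpu": 1}},
            "taskManager": {"resource": {"memory": "2048m", "cpu": 1}},
            "job": {
                "jarURI": jar_uri,
                "parallelism": parallelism,
                "upgradeMode": "stateless",
            },
        },
    }


def collection_path(namespace, plural="flinkdeployments"):
    """Kubernetes API path for namespaced custom resources of this group."""
    return (
        f"/apis/{FLINK_API_GROUP}/{FLINK_API_VERSION}"
        f"/namespaces/{namespace}/{plural}"
    )


manifest = flink_deployment_manifest(
    name="example-app",
    namespace="demo",
    image="flink:1.17",
    jar_uri="local:///opt/flink/examples/streaming/StateMachineExample.jar",
)
path = collection_path("demo")
body = json.dumps(manifest, indent=2)  # payload a client would POST to `path`
```

Because the operator reconciles these resources, a client only has to create, update, or delete the manifest through the Kubernetes API; the operator then starts, upgrades, or tears down the corresponding Flink cluster.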

Deployment Options

The LakeSail architecture allows for flexible deployment options. If you are interested, please contact us to discuss the best option for your use case.

On-premise

You can deploy the API server to your own on-premise infrastructure and connect it to a data store and Kubernetes clusters that you manage. This is suitable if you have isolated networks or even air-gapped environments without Internet access. We offer a license-based pricing model for on-premise deployments.

Bring-your-own-cluster

For bring-your-own-cluster (BYOC) deployments, LakeSail operates the API server and data store in its own cloud infrastructure, and you register your own Kubernetes clusters with the API server. Depending on the cloud provider you use, private network connections may be possible between your clusters and the LakeSail cloud infrastructure, so that no communication goes through the Internet. This is suitable if you want to use LakeSail to manage your Flink applications and sessions while keeping your data and compute resources under your control. We offer a pay-as-you-go pricing model for BYOC deployments.

Fully-managed

For fully-managed deployments, LakeSail operates the API server, data store, and Kubernetes clusters in its own cloud infrastructure. This is suitable if you want to use LakeSail to manage your Flink applications and sessions and to run your Flink clusters on LakeSail's cloud infrastructure. You can optionally manage data in your own cloud storage. We offer a pay-as-you-go pricing model for fully-managed deployments.