Concepts

LakeSail runs data compute — Spark jobs, SQL queries — inside your own cloud account, against data you already have. To do that safely and predictably, the system is split into a few deliberate layers. This page walks through those layers in the order you'll encounter them, so by the end you should be able to point at any object in the UI and know where it fits.

How the pieces fit together

Organization ─── Users, Teams, Roles           ← who can do what
      ├── Cloud Account                        ← AWS credentials (cross-account role)
      │     └── Network                        ← a VPC inside that account
      │           └── Cluster                  ← Kubernetes compute inside that VPC
      ├── Catalog                              ← pointer to your data
      └── Jobs & Queries                       ← what actually runs on a cluster against a catalog
            └── Job Runs & Sessions

Read top-to-bottom: an organization holds everything; infrastructure nests account → network → cluster; data is described by catalogs; and workloads (jobs and queries) are the things you write, which execute on a cluster against a catalog.

A useful lens as you read: almost every object in LakeSail has two halves — a definition you create and manage in the UI, and a lifecycle it moves through (pending → active → destroyed) as the underlying infrastructure catches up. The definition is cheap and instant; the lifecycle is where AWS and Kubernetes do real work.
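The split can be sketched as a tiny state machine. This is an illustration, not LakeSail's implementation: the state names below are the ones used throughout this page, but the transition table is a simplifying assumption.

```python
# A minimal sketch of the definition-vs-lifecycle split. State names mirror
# the ones used on this page; the transition rules are assumptions.
TRANSITIONS = {
    "pending": {"provisioning", "failed"},
    "provisioning": {"active", "failed"},
    "active": {"updating", "destroying"},
    "updating": {"active", "failed"},
    "failed": {"provisioning", "destroying"},
    "destroying": {"destroyed"},
    "destroyed": set(),
}

class Resource:
    """A definition (instant) plus a lifecycle (catches up over time)."""

    def __init__(self, name: str):
        self.name = name          # the definition: cheap, created instantly
        self.state = "pending"    # the lifecycle: starts pending

    def advance(self, new_state: str) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go {self.state} -> {new_state}")
        self.state = new_state

net = Resource("dev-network")
net.advance("provisioning")   # AWS starts doing real work
net.advance("active")         # the VPC now exists; clusters can be created
```

The key property the sketch captures: the definition (the name) exists from the first moment, while the lifecycle only ever moves along allowed edges as the infrastructure catches up.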

You and your organization

When you sign up, you create an organization. An organization is the tenant — every other object in LakeSail (cloud accounts, networks, clusters, catalogs, jobs, queries) belongs to exactly one organization. Think of it as the line that separates "us" from "someone else on the platform": billing, access control, and resource namespacing all stop at the org boundary.

Inside the organization, identity has four moving parts:

  • Users are the people. A user can belong to multiple organizations (the same email works across them), but they act in one at a time.
  • Members are the organization-scoped link between a user and an org. You become a member by accepting an invitation from an existing admin.
  • Teams group members together — typically by function (data-eng, analysts) or by project. A member can belong to any number of teams.
  • Roles are what grant actual permissions. Roles attach to a member (giving that person specific rights) or to a team (giving everyone in the team those rights, which is usually easier to reason about).

You'll mostly interact with this layer through two questions: "who's on my team?" and "what can they do?" For a first-time solo user it's invisible — you're the sole member with full access. It becomes load-bearing the moment a second person shows up, because that's when "who can spin up a cluster in production?" stops being rhetorical.

Where your compute lives

LakeSail doesn't run your compute on its own hardware. It runs inside your AWS account, which is why the infrastructure layer has three nested objects instead of one. Each level has a different lifecycle and a different blast radius, so separating them lets you rotate credentials, resize compute, and tear down environments independently.

Cloud account

A cloud account is a trust relationship between LakeSail and an AWS account you own. Concretely, you deploy a CloudFormation stack in your AWS account that creates an IAM role with a scoped trust policy, and you hand LakeSail the resulting role ARN. LakeSail assumes the role on demand and receives only short-lived credentials from STS; you never hand over long-lived keys.
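To make the trust concrete, here is a representative cross-account trust policy of the kind such a stack might create. The account ID and external ID are hypothetical placeholders, and the exact policy LakeSail's stack emits may differ.

```python
import json

# A representative cross-account trust policy. The account ID and external
# ID below are hypothetical placeholders, not LakeSail's real values.
LAKESAIL_ACCOUNT = "111111111111"   # assumption: LakeSail's AWS account ID
EXTERNAL_ID = "org-abc123"          # assumption: a per-organization external ID

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{LAKESAIL_ACCOUNT}:root"},
        "Action": "sts:AssumeRole",
        # An external ID scopes the trust to one tenant, guarding against
        # the classic confused-deputy problem in cross-account access.
        "Condition": {"StringEquals": {"sts:ExternalId": EXTERNAL_ID}},
    }],
}

print(json.dumps(trust_policy, indent=2))
```

The important part is the Condition: without it, any LakeSail tenant who learned your role ARN could ask LakeSail to assume it.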

Cloud accounts move through a short lifecycle: pending when you paste the ARN, verifying while LakeSail calls sts:AssumeRole to confirm the trust works, then active. If something changes (you delete the role, revoke the trust), the account drops to failed or disconnected, and any dependent networks and clusters are effectively orphaned until you repair it.

Network

A network is a VPC that LakeSail provisions inside a cloud account. You pick the region and the IPv4 CIDR range (e.g. 10.0.0.0/16); LakeSail creates the VPC, subnets across availability zones, security groups, and the rest of the plumbing. The network is a stable boundary — clusters inside it can talk to each other and to anything you peer the VPC to.
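To see what "subnets across availability zones" means in practice, here is a sketch using Python's standard ipaddress module. The /20 subnet size and three-AZ layout are illustrative assumptions, not LakeSail's actual provisioning plan.

```python
import ipaddress

# Sketch: carving a /16 VPC CIDR into per-AZ subnets. The /20 size and the
# three-AZ layout are assumptions for illustration only.
vpc = ipaddress.ip_network("10.0.0.0/16")
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]

# subnets(new_prefix=20) yields /20 blocks in address order; pair one per AZ.
subnets = dict(zip(azs, vpc.subnets(new_prefix=20)))
for az, subnet in subnets.items():
    print(az, subnet)
```

This is also why the CIDR choice matters up front: a /16 leaves room for many /20 subnets, while a small range can box in future clusters.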

The lifecycle mirrors infrastructure reality: pending → provisioning → active on the way up, destroying → destroyed on the way down. Provisioning is visible because it takes minutes, not seconds. Having networks as their own layer (rather than tying a VPC directly to a cluster) means you can run multiple clusters — e.g. dev and prod, or different sizes for different teams — without re-creating networking, and you can delete a cluster without tearing down the VPC.

One cloud account can host many networks; a network belongs to exactly one cloud account and one region.

Cluster

A cluster is a Kubernetes cluster that LakeSail provisions inside a network. It's where jobs and sessions actually get scheduled. A cluster has two kinds of nodes, and the distinction matters:

  • Management (system) nodes run the cluster's control plane and LakeSail's own services. You size these once, conservatively — the defaults (m8g.large, min 1 / desired 2 / max 3 nodes, 100 GB disk) are enough for evaluation and small teams.
  • Compute nodes run your workloads. They're chosen per-job or per-session, so you don't oversize a cluster to handle a rare big workload — the cluster scales compute on demand.

Cluster lifecycle: pending → provisioning → active, then updating while you resize, and destroying → destroyed on teardown. A failed state means CloudFormation or the Kubernetes bootstrap didn't succeed; the provisioning progress bar shows the stage it stopped at.

A cluster has an access policy — either private-only (reachable only from inside the VPC) or public with a CIDR allowlist. You can change this later without rebuilding the cluster.
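The allowlist check itself is plain CIDR membership. A sketch, with hypothetical allowlist entries:

```python
import ipaddress

# Sketch of the public-with-CIDR-allowlist access policy described above.
# The allowlist entries are hypothetical examples (TEST-NET ranges).
ALLOWLIST = [ipaddress.ip_network(c) for c in ("203.0.113.0/24", "198.51.100.7/32")]

def is_allowed(client_ip: str) -> bool:
    """True if the client address falls inside any allowlisted CIDR block."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWLIST)

print(is_allowed("203.0.113.42"))   # inside the /24 -> True
print(is_allowed("192.0.2.1"))      # not on the allowlist -> False
```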

Where your data lives

A catalog is LakeSail's handle on your data. It's a pointer — to an Iceberg/Delta table location in S3, to a Glue catalog, to an external metastore, or to a catalog service that LakeSail provisions for you in your cloud account. The data itself stays where it is; the catalog just tells LakeSail how to find and describe it.

This is why catalogs are separate from clusters: the same data can be read by multiple clusters, and the same cluster can read from multiple catalogs. The pairing is picked per-workload.

Like networks and clusters, a provisioned catalog has a lifecycle (pending → provisioning → active → destroying → destroyed) because real AWS resources are being created. A catalog that just points at something you already manage (a bring-your-own configuration) skips most of that — it's effectively just a record in LakeSail.

What actually runs

Workloads come in two shapes, matched to how they're used:

Jobs and job runs

A job is a reusable, versioned definition of something to run — typically a Spark job. A job has:

  • Versions: every edit creates a new version, and you can promote a draft before it becomes the live one. This is how you change a pipeline without losing the ability to reproduce last week's output.
  • A status of active (runs on its schedule) or paused (defined but won't fire). Pausing is cheap and reversible — it's the first thing to reach for when something misbehaves.
  • A team assignment, which ties the job to the roles that control it.
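The version-and-promote model above can be sketched in a few lines. The names here (edit, promote, live) are illustrative, not LakeSail's API:

```python
# Sketch of versioned jobs: every edit creates a new version, and a draft
# must be promoted before it becomes the live one. Names are illustrative.
class Job:
    def __init__(self, name: str, definition: str):
        self.name = name
        self.versions = [definition]   # version 1
        self.live = 1                  # 1-indexed version numbers
        self.status = "active"

    def edit(self, definition: str) -> int:
        """Create a new draft version; the live version is unchanged."""
        self.versions.append(definition)
        return len(self.versions)

    def promote(self, version: int) -> None:
        """Make a draft the live version; scheduled runs use it from now on."""
        self.live = version

job = Job("daily-etl", "SELECT * FROM events")
draft = job.edit("SELECT * FROM events WHERE day = today()")
job.promote(draft)
# Version 1 is still stored, so last week's output stays reproducible.
```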

Each execution of a job is a job run — a distinct object with its own logs, metrics, status, and result. Job runs have a detailed lifecycle that reflects what's actually happening: pending → ready → starting → creating_sail → waiting_for_sail → creating_runner → running → succeeded (or failed, cancelled, timeout). If a run is stuck, the status tells you whether it's waiting for compute, starting Sail (the query engine), or executing.
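Each non-terminal status maps to a diagnosis of where the run is stuck. A sketch of that mapping, where the diagnosis wording is an interpretation of the description above rather than LakeSail's own strings:

```python
# Sketch: mapping job-run statuses from this page to a plain-language
# diagnosis. The diagnosis wording is an interpretation, not official text.
DIAGNOSIS = {
    "pending": "queued, not yet picked up",
    "ready": "accepted, waiting for compute",
    "starting": "compute nodes coming up",
    "creating_sail": "starting Sail (the query engine)",
    "waiting_for_sail": "waiting for Sail to report ready",
    "creating_runner": "launching the run's runner",
    "running": "executing your code",
}
TERMINAL = {"succeeded", "failed", "cancelled", "timeout"}

def explain(status: str) -> str:
    if status in TERMINAL:
        return f"finished: {status}"
    return DIAGNOSIS.get(status, "unknown status")

print(explain("waiting_for_sail"))
print(explain("succeeded"))
```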

Jobs fit batch and scheduled work where reproducibility and history matter more than latency.

Queries and sessions

A query is a reusable SQL definition — saved, named, and assigned to a team, but not executed on its own. Queries execute inside a session: a live, interactive connection to a cluster that keeps state (warm caches, an open catalog) between runs. The session is why a follow-up query can return in seconds instead of waiting for cold compute each time.

Sessions have their own lifecycle: pending → active → idle → closed. Idle matters — an unused session doesn't immediately free its compute, so you (or an admin) close sessions to reclaim resources. A session issues short-lived tokens that external tools (SQL clients, BI tools, notebooks) use to connect.
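Reclaiming idle sessions is the kind of sweep an admin script might run. A sketch, assuming a 30-minute threshold and hypothetical field names:

```python
import time

# Sketch of an idle-session sweep: sessions hold compute while idle, so
# something closes them. The 30-minute threshold and the field names
# ("state", "last_activity") are assumptions for illustration.
IDLE_TIMEOUT_S = 30 * 60

def close_idle(sessions: list[dict], now: float) -> list[dict]:
    """Mark sessions idle past the threshold as closed; return those closed."""
    closed = []
    for s in sessions:
        if s["state"] == "idle" and now - s["last_activity"] > IDLE_TIMEOUT_S:
            s["state"] = "closed"
            closed.append(s)
    return closed

now = time.time()
sessions = [
    {"id": "s-1", "state": "active", "last_activity": now - 10},
    {"id": "s-2", "state": "idle", "last_activity": now - 2 * 3600},
]
reaped = close_idle(sessions, now)
print([s["id"] for s in reaped])   # ['s-2']
```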

The query/session split mirrors how people actually work: interactive analysis wants low latency and shared state, so the session is the thing that lives, and queries are the individual requests you send against it.

Jobs vs. queries in one sentence

Jobs are for pipelines — write once, run many times on a schedule. Queries in sessions are for exploration — type, run, iterate. Both ultimately land on the same clusters, reading from the same catalogs, governed by the same roles.

Putting it together

A complete mental model of a LakeSail workspace:

An organization of users and teams connects one or more cloud accounts, provisions networks and clusters inside them, registers catalogs pointing at their data, and runs jobs (batch) and queries in sessions (interactive) on those clusters against those catalogs.

A useful diagnostic, when something in the UI isn't doing what you expect: ask which layer you're looking at. A failing job run is a workload problem. A cluster that won't provision is an infrastructure problem — usually one layer down, in the network or cloud account. A permission denial is an identity problem. The layers are the seams the rest of the product is built along.