Glossary
Short definitions of every LakeSail term used across the docs. Each entry links to the page where the concept is discussed in depth.
A
Account type. A property of a member: managed (the org controls the user's profile and can reset the password, force MFA, or deactivate the account) or external (the user self-manages; intended for consultants and contractors). See Members.
Active (status). Healthy, accepting work. Applies to cloud accounts, networks, clusters, catalogs, and sessions.
Agentic session. A session driven by an LLM agent rather than a human. Planned, not yet shipped. See Concepts.
Authorization policy. A fine-grained grant of specific permissions to a specific principal (member or team). Used when a role doesn't fit. See Roles & permissions.
B
Bring-your-own VPC. Connecting LakeSail to a VPC you already manage. Not currently supported in the UI; LakeSail provisions networks for you. See Quickstart.
C
Capacity type. Whether worker nodes use on-demand or spot EC2 capacity. Set per compute profile in cluster execution mode. See Compute profiles.
Catalog. A pointer to your data, such as a Glue catalog, an Iceberg REST endpoint, Unity, or OneLake. The catalog tells LakeSail how to discover and read tables; it does not copy data. See Connect a catalog.
Catalog (provisioned). A catalog service that LakeSail provisions inside your cloud account, as opposed to one you configure to point at an existing service.
Channel (notification). A delivery destination for notifications: email, Slack, webhook, PagerDuty, or Rootly. See Notifications.
Cloud account. A trust relationship between LakeSail and an AWS account you own. Concretely, it is an IAM role with a scoped trust policy that LakeSail can assume on demand. See Security & IAM.
Cluster. A Kubernetes cluster LakeSail provisions inside a network. Jobs and sessions run here. Has management nodes (system) and compute nodes (workload). See Set up a cluster.
Compute nodes. Cluster nodes that run your jobs and sessions. Sized per-workload through a compute profile, scaled by Karpenter on demand. Distinct from management nodes.
Compute profile. A named, reusable bundle of compute and engine settings (execution mode, driver/worker sizing, libraries, env) that a job, session, or notebook runs with. Called a workload config in the API. See Compute profiles.
Concurrency policy. What happens when a scheduled tick fires while a previous run hasn't finished: skip, allow (with a max), or replace. See Scheduling.
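The three concurrency policies can be pictured as a small decision function. This is a minimal sketch, not LakeSail's scheduler: the names `Decision` and `on_tick` are hypothetical, and it assumes that replace cancels the in-flight run before starting the new one.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    start_new: bool       # whether this tick starts a new run
    cancel_running: bool  # whether in-flight runs are cancelled first


def on_tick(policy: str, running: int, max_concurrent: int = 1) -> Decision:
    """Decide what a scheduled tick does while `running` earlier runs are in flight."""
    if running == 0:
        # Nothing is in flight, so every policy simply starts the run.
        return Decision(start_new=True, cancel_running=False)
    if policy == "skip":
        return Decision(start_new=False, cancel_running=False)
    if policy == "allow":
        # Assumption: the max counts runs in flight at the same time.
        return Decision(start_new=running < max_concurrent, cancel_running=False)
    if policy == "replace":
        # Assumption: the previous run is cancelled in favour of the new one.
        return Decision(start_new=True, cancel_running=True)
    raise ValueError(f"unknown concurrency policy: {policy!r}")
```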
Cron expression. A standard 5-field cron string (e.g. 0 2 * * *) defining when a scheduled job fires. See Scheduling.
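For readers new to cron, the five fields are, in order: minute, hour, day of month, month, and day of week. The sketch below only splits an expression into named fields; `split_cron` is a hypothetical helper, not part of any LakeSail client.

```python
FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expr: str) -> dict[str, str]:
    """Split a 5-field cron string into a dict keyed by field name."""
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


fields = split_cron("0 2 * * *")
# minute "0", hour "2", every day of month, every month, every day of week:
# the job fires once a day at 02:00.
```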
D
Draft. An in-progress version of a job. Edits go into a draft so the live version keeps running unchanged; publishing the draft makes it the new live version. See Defining jobs.
Driver / Executor. Spark roles. The driver coordinates a job; executors run the actual work in parallel. Both are configured per-job. See Defining jobs.
E
Execution mode. How a compute profile shapes the Sail engine: standalone (a single pod, parallelized by threads) or cluster (a driver plus separate worker pods for distributed execution). See Compute profiles.
External ID. A random secret embedded in the cloud account's IAM trust policy. LakeSail must present it to assume the role, which prevents confused-deputy attacks. See Security & IAM.
I
IdP (Identity Provider). An external service (Okta, Entra ID, Google Workspace) that authenticates users via SSO. See Single sign-on.
Idle (session). A session with no active client traffic. Idle sessions don't immediately free their compute; they eventually close to reclaim resources. See Sessions.
Invitation. A time-limited invite link that turns a user into a member of an organization. See Invite teammates.
J
Job. A reusable, versioned SQL or Python workload that runs on a cluster against a catalog. See Defining jobs.
Job run. A single execution of a job. Immutable; re-running creates a new run. See Runs & debugging.
M
Management nodes. Cluster nodes that run the control plane and LakeSail's own services. Sized once at cluster creation. Distinct from compute nodes.
Member. The org-scoped link between a user and an organization. Permissions, ownership, and audit attach to the member. See Members.
Memory catalog. An ephemeral, in-memory catalog. Useful for testing and one-off workloads.
MFA (Multi-Factor Authentication). Time-based one-time codes from an authenticator app. Can be required at the org level. See MFA.
Missed-schedule policy. What happens if scheduled ticks were missed (for example, because the cluster was down or there was a platform incident): latest fires once on recovery; all backfills every missed tick. See Scheduling.
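The difference between the two missed-schedule policies is easiest to see on a concrete list of missed ticks. This is a sketch under assumptions: `ticks_to_fire` is a hypothetical helper, and it assumes that the single run fired by latest corresponds to the most recent missed tick.

```python
from datetime import datetime


def ticks_to_fire(policy: str, missed: list[datetime]) -> list[datetime]:
    """Return the ticks that would fire on recovery under each policy."""
    if not missed:
        return []
    if policy == "latest":
        # Assumption: the one catch-up run is for the most recent missed tick.
        return [max(missed)]
    if policy == "all":
        # Backfill every missed tick, oldest first.
        return sorted(missed)
    raise ValueError(f"unknown missed-schedule policy: {policy!r}")


missed = [datetime(2024, 1, day, 2, 0) for day in (1, 2, 3)]
catch_up_latest = ticks_to_fire("latest", missed)  # one run
catch_up_all = ticks_to_fire("all", missed)        # three runs
```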
N
Network. A VPC LakeSail provisions inside a cloud account. The stable boundary that hosts clusters. One cloud account can have many networks. See Manage networks.
Notebook. A hosted, Marimo-based interactive Python notebook backed by a Sail session. The editor and runtime are hosted for you, unlike a plain session, to which you attach your own client. See Notebooks.
Notification rule. Connects an event type (e.g. job_run.status.failed) at a given scope (resource/team/organization/own) to one or more channels. See Notifications.
O
Operation (long-running). An async backend operation (provisioning, destroying) tracked by an operation ID. Surfaced in the UI as progress bars.
Organization. The tenant. Every other object in LakeSail belongs to exactly one. See Concepts.
Organization role. A pre-defined permission bundle that applies across the entire org. The set of roles is fixed by the platform; you assign existing roles, you don't create new ones. See Roles & permissions.
Owning team. The single team that "owns" a resource (job, query, catalog). Determines who can edit and re-share. Distinct from teams a resource is shared with.
P
Pause. A job state. The definition is intact; scheduled runs don't fire. Manual runs still work. Use this first when a job misbehaves.
Permission. A relation like CanManageUsers that grants the ability to perform a specific action on a resource type. Roles and policies are bundles of permissions. See Roles & permissions.
Permissions boundary. A managed IAM policy that caps the effective permissions of LakeSail's role in your AWS account, even if a sub-policy is mis-scoped. See Security & IAM.
Q
Query. A saved SQL statement bound to a catalog. Reusable from sessions (interactive) and jobs (batch). See Queries.
R
Role. A bundle of permissions. LakeSail has organization roles and team roles; both come from pre-defined sets fixed by the platform. See Roles & permissions.
Run (job). See Job run.
S
Saved query. Same as Query.
Session. A live, warm gRPC connection to a cluster, exposing Spark Connect. Where queries and ad-hoc Python code run interactively. See Sessions.
Shared (resource). Describes a resource that is shared with one or more teams in addition to its owning team. The teams it is shared with typically get read/run access without ownership rights.
Snapshot (job source). Inline SQL stored directly on the job, vs. a reference to a saved query. Pinning to a snapshot prevents drift when the underlying query changes. See Defining jobs.
Spark Connect. The gRPC protocol LakeSail sessions speak. PySpark 3.5+, the Scala client, and Go/Rust clients all support it. See Sessions.
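A Spark Connect client is pointed at a server with an `sc://` connection string, which can carry a bearer token as a parameter. The sketch below only builds such a string. The host name is a placeholder, and passing the session token through the `token` parameter is an assumption about how a LakeSail session is addressed, not something this glossary confirms.

```python
def spark_connect_url(host: str, port: int, token: str) -> str:
    """Build a Spark Connect connection string with TLS and a bearer token."""
    return f"sc://{host}:{port}/;use_ssl=true;token={token}"


# Placeholder host and token, for illustration only.
url = spark_connect_url("session.example.invalid", 443, "TOKEN")

# With PySpark 3.5+ installed, the string would be handed to the client:
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.remote(url).getOrCreate()
```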
T
Team. A group of members with shared permissions and shared resource ownership. The primary unit of access control once an org grows beyond a handful of people. See Teams.
Team role. A permission bundle scoped to one team. The same member can hold different team roles on different teams. See Roles & permissions.
Token (session). A short-lived JWT that authenticates a Spark Connect client to a session. Issued by the session owner or an admin. See Sessions.
Trust policy. The portion of an IAM role definition that says who can assume it. LakeSail's role is restricted to a specific principal plus the external ID. See Security & IAM.
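In AWS terms, a trust policy restricted to one principal plus an external ID generally has the shape below. This is an illustrative sketch of the standard IAM pattern, not the exact policy LakeSail generates; the principal ARN and external ID are placeholders.

```python
import json


def trust_policy(principal_arn: str, external_id: str) -> dict:
    """Build an IAM trust policy that allows one principal to assume the role,
    and only when it presents the expected external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values, for illustration only.
policy = trust_policy(
    "arn:aws:iam::111122223333:role/example-principal",
    "example-external-id",
)
print(json.dumps(policy, indent=2))
```

Because the condition is checked on every `sts:AssumeRole` call, a caller that knows the role ARN but not the external ID is refused, which is the confused-deputy protection the External ID entry describes.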
U
User. A global identity in LakeSail (one email, one set of credentials). Distinct from member, which is the org-scoped link.
V
Version (job). A frozen snapshot of a job's definition. Every published edit creates a new version; runs pin to the version active at dispatch time. See Defining jobs.
W
Webhook signature. An HMAC-SHA256 over the raw request body, sent in the LakeSail-Signature: sha256=... header. Lets you verify a webhook came from LakeSail. See Notifications.
Workload boundary. The S3-scoped managed policy that caps which buckets LakeSail-provisioned workloads can read and write: only buckets matching lakesail-<connection-id>-*. See Security & IAM.
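As an illustration of the Webhook signature entry, a receiver could check the header roughly as follows. This is a minimal sketch: it assumes the value after `sha256=` is a hex digest and that the receiver holds a shared signing secret. Neither detail is stated in this glossary, so check the Notifications page before relying on it.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Return True if header_value matches the HMAC-SHA256 of raw_body."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    received = header_value[len(prefix):]
    # Assumption: the digest is hex-encoded.
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), received.encode())


# Example with placeholder values.
secret = b"example-signing-secret"
body = b'{"event": "job_run.status.failed"}'
header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
is_valid = verify_signature(secret, body, header)
```

The signature covers the raw bytes of the request, so compute it before any JSON parsing or re-serialization.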
Workload config. The API name for a compute profile. See Compute profiles.