Glossary
Short definitions of every LakeSail term used across the docs. Each entry links to the page where the concept is discussed in depth.
A
Account type. A property of a member: managed (the org controls the user's profile and can reset the password, force MFA, or deactivate the account) or external (the user self-manages; intended for consultants and contractors). See Members.
Active (status). Healthy, accepting work. Applies to cloud accounts, networks, clusters, catalogs, and sessions.
Agentic session. A session driven by an LLM agent rather than a human. Planned, not yet shipped. See Concepts.
Authorization policy. A fine-grained grant of specific permissions to a specific principal (member or team). The escape hatch when a role doesn't fit. See Roles & permissions.
B
Bring-your-own VPC. Connecting LakeSail to a VPC you already manage. Not currently supported in the UI; LakeSail provisions networks for you. See Quickstart.
C
Catalog. A pointer to your data — to a Glue catalog, an Iceberg REST endpoint, Unity, OneLake, etc. The catalog's job is to tell LakeSail how to discover and read tables; it never copies data. See Connect a catalog.
Catalog (provisioned). A catalog service that LakeSail provisions inside your cloud account, as opposed to a catalog you configure to point at an existing service.
Channel (notification). A delivery destination for notifications: email, Slack, webhook, PagerDuty, or Rootly. See Notifications.
Cloud account. A trust relationship between LakeSail and an AWS account you own — concretely, an IAM role with a scoped trust policy that LakeSail can assume on demand. See Security & IAM.
Cluster. A Kubernetes cluster LakeSail provisions inside a network. Where jobs and sessions actually run. Has management nodes (system) and compute nodes (workload). See Set up a cluster.
Compute nodes. Cluster nodes that run your jobs and sessions. Picked per-workload, scaled by Karpenter on demand. Distinct from management nodes.
Concurrency policy. What happens when a scheduled tick fires while a previous run hasn't finished: skip, allow (with a max), or replace. See Scheduling.
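The three concurrency policies can be read as a small decision function. The sketch below is illustrative only: the function name, the returned action strings, and the behavior when allow has reached its maximum (treated here as a skip) are assumptions, not LakeSail's documented behavior.

```python
def on_scheduled_tick(policy: str, running: int, max_concurrent: int = 1) -> str:
    """Decide what to do when a tick fires while `running` earlier runs are active.

    Hypothetical helper; the action names are made up for illustration.
    """
    if running == 0:
        return "start"  # nothing is in flight, so every policy starts a run
    if policy == "skip":
        return "skip"  # drop this tick and let the previous run finish
    if policy == "allow":
        # Assumption: once the maximum is reached, further ticks are skipped.
        return "start" if running < max_concurrent else "skip"
    if policy == "replace":
        # Assumption: "replace" stops the previous run before starting a new one.
        return "cancel_previous_and_start"
    raise ValueError(f"unknown concurrency policy: {policy!r}")
```

For example, under this reading `on_scheduled_tick("allow", running=2, max_concurrent=3)` starts another run, while the same call with `running=3` skips the tick.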
Cron expression. A standard 5-field cron string (e.g. 0 2 * * *) defining when a scheduled job fires. See Scheduling.
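To make the five fields concrete, here is a minimal sketch that splits a cron string into its standard field names. It only labels the fields; it does not validate ranges or compute fire times, and the helper name is hypothetical.

```python
# Standard 5-field cron order: minute, hour, day of month, month, day of week.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Map each field of a 5-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


fields = split_cron("0 2 * * *")
# minute "0" and hour "2" with wildcards elsewhere: every day at 02:00
```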
D
Draft. An in-progress version of a job. Edits go into a draft so the live version keeps running unchanged; publishing the draft makes it the new live version. See Defining jobs.
Driver / Executor. Spark roles. The driver coordinates a job; executors run the actual work in parallel. Both are configured per-job. See Defining jobs.
E
External ID. A random secret embedded in the cloud account's IAM trust policy. LakeSail must present it to assume the role. Prevents confused-deputy attacks. See Security & IAM.
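As an illustration of where the external ID lives, the sketch below builds a trust policy in the general AWS shape, using the `sts:ExternalId` condition key. The principal ARN and external ID values are placeholders, and the exact policy LakeSail generates may differ; treat this as the common AWS pattern rather than LakeSail's actual document.

```python
import json


def build_trust_policy(principal_arn: str, external_id: str) -> dict:
    """Return an IAM trust policy that requires an external ID on AssumeRole."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Who may assume the role (placeholder ARN supplied by the caller).
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
                # The caller must present this exact external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


policy = build_trust_policy(
    "arn:aws:iam::111122223333:role/example-caller",  # hypothetical principal
    "example-external-id",                            # hypothetical secret
)
policy_json = json.dumps(policy, indent=2)
```

Because the condition is part of the role's own trust policy, a third party that tricks the caller into assuming roles on its behalf still fails unless it also knows the external ID.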
I
IdP (Identity Provider). An external service (Okta, Entra ID, Google Workspace) that authenticates users via SSO. See Single sign-on.
Idle (session). A session with no active client traffic. Idle sessions don't immediately free their compute; they eventually close to reclaim resources. See Sessions.
Invitation. A time-limited invite link that turns a user into a member of an organization. See Invite teammates.
J
Job. A reusable, versioned workload — SQL or Python — that runs on a cluster against a catalog. See Defining jobs.
Job run. A single execution of a job. Immutable; re-running creates a new run. See Runs & debugging.
M
Management nodes. Cluster nodes that run the control plane and LakeSail's own services. Sized once at cluster creation. Distinct from compute nodes.
Member. The org-scoped link between a user and an organization. Permissions, ownership, and audit attach to the member. See Members.
Memory catalog. An ephemeral, in-memory catalog. Useful for testing and one-off workloads.
MFA (Multi-Factor Authentication). Time-based one-time codes from an authenticator app. Can be required at the org level. See MFA.
Missed-schedule policy. What happens if scheduled ticks were missed (cluster down, platform incident). With latest, the job fires once on recovery; with all, every missed tick is backfilled. See Scheduling.
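The difference between the two missed-schedule policies can be sketched as a function over the list of missed ticks. The function name is hypothetical, and the choice to run the most recent missed tick under latest is an assumption; the definition only says the job fires once on recovery.

```python
from datetime import datetime
from typing import List


def ticks_to_run(missed: List[datetime], policy: str) -> List[datetime]:
    """Return the missed ticks that would be dispatched on recovery."""
    ordered = sorted(missed)
    if policy == "latest":
        # Assumption: the single catch-up run corresponds to the newest missed tick.
        return ordered[-1:]
    if policy == "all":
        # Backfill every missed tick, oldest first.
        return ordered
    raise ValueError(f"unknown missed-schedule policy: {policy!r}")


missed = [datetime(2024, 1, d, 2, 0) for d in (1, 2, 3)]
catch_up = ticks_to_run(missed, "latest")  # one run
backfill = ticks_to_run(missed, "all")     # three runs
```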
N
Network. A VPC LakeSail provisions inside a cloud account. Stable boundary that hosts clusters. One cloud account → many networks. See Quickstart.
Notification rule. Connects an event type (e.g. job_run.status.failed) at a given scope (resource/team/organization/own) to one or more channels. See Notifications.
O
Operation (long-running). An async backend operation (provisioning, destroying) tracked by an operation ID. Surfaced in the UI as progress bars.
Organization. The tenant — every other object in LakeSail belongs to exactly one. See Concepts.
Organization role. A pre-defined permission bundle that applies across the entire org. The set of roles is fixed by the platform; you assign existing roles, you don't create new ones. See Roles & permissions.
Owning team. The single team that "owns" a resource (job, query, catalog). Determines who can edit and re-share. Distinct from teams a resource is shared with.
P
Pause. A job state. The definition is intact; scheduled runs don't fire. Manual runs still work. The first thing to reach for when a job misbehaves.
Permission. A relation like CanManageUsers that grants the ability to perform a specific action on a resource type. Roles and policies are bundles of permissions. See Roles & permissions.
Permissions boundary. A managed IAM policy that caps the effective permissions of LakeSail's role in your AWS account, even if a sub-policy is mis-scoped. See Security & IAM.
Q
Query. A saved SQL statement bound to a catalog. Reusable from sessions (interactive) and jobs (batch). See Queries.
R
Role. A bundle of permissions. LakeSail has organization roles and team roles; both come from pre-defined sets fixed by the platform.
Run (job). See Job run.
S
Saved query. Same as Query.
Session. A live, warm gRPC connection to a cluster, exposing Spark Connect. Where queries and ad-hoc Python code run interactively. See Sessions.
Shared (resource). A resource made available to teams other than its owning team. The teams it is shared with typically get read/run access without ownership rights.
Snapshot (job source). Inline SQL stored directly on the job, vs. a reference to a saved query. Pinning to a snapshot prevents drift when the underlying query changes. See Defining jobs.
Spark Connect. The gRPC protocol LakeSail sessions speak. PySpark 3.4+, the Scala client, and Go/Rust clients all support it. See Sessions.
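For orientation, a Spark Connect client is pointed at a session with an `sc://` connection string. The sketch below builds one and shows how PySpark would consume it. The host name is a placeholder, 15002 is Spark Connect's conventional default port, and passing the session token through the `token` parameter is an assumption about how LakeSail sessions authenticate; check the Sessions page for the actual endpoint format.

```python
from typing import Optional


def spark_connect_url(host: str, port: int = 15002, token: Optional[str] = None) -> str:
    """Build a Spark Connect connection string of the form sc://host:port/;key=value."""
    url = f"sc://{host}:{port}"
    if token is not None:
        # Parameters follow "/;" in the Spark Connect connection string format.
        url += f"/;token={token}"
    return url


def connect(url: str):
    """Open a remote SparkSession (requires pyspark 3.4+ with Spark Connect support)."""
    from pyspark.sql import SparkSession  # imported lazily so the helper above has no dependency

    return SparkSession.builder.remote(url).getOrCreate()


url = spark_connect_url("session.example.invalid", token="EXAMPLE_TOKEN")
# spark = connect(url); spark.sql("SELECT 1").show()
```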
T
Team. A group of members with shared permissions and shared resource ownership. The unit of access control past the first few people in an org. See Teams.
Team role. A permission bundle scoped to one team. The same member can hold different team roles on different teams. See Roles & permissions.
Token (session). A short-lived JWT that authenticates a Spark Connect client to a session. Issued by the session owner or an admin. See Sessions.
Trust policy. The portion of an IAM role definition that says who can assume it. LakeSail's role is restricted to a specific principal plus the external ID. See Security & IAM.
U
User. A global identity in LakeSail (one email, one set of credentials). Distinct from member, which is the org-scoped link.
V
Version (job). A frozen snapshot of a job's definition. Every published edit creates a new version; runs pin to the version active at dispatch time. See Defining jobs.
W
Workload boundary. The S3-scoped managed policy that caps which buckets LakeSail-provisioned workloads can read and write: only buckets matching lakesail-<connection-id>-*. See Security & IAM.
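The bucket pattern in the workload boundary amounts to a prefix check on the bucket name. A minimal sketch, assuming the trailing `*` matches any suffix and using a made-up connection ID:

```python
def bucket_in_workload_boundary(bucket: str, connection_id: str) -> bool:
    """True if the bucket name matches lakesail-<connection-id>-*."""
    # A trailing "*" wildcard is equivalent to requiring this prefix.
    return bucket.startswith(f"lakesail-{connection_id}-")


allowed = bucket_in_workload_boundary("lakesail-abc123-warehouse", "abc123")
denied = bucket_in_workload_boundary("company-finance-exports", "abc123")
```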
Webhook signature. An HMAC-SHA256 over the raw request body, sent in the LakeSail-Signature: sha256=... header. Lets you verify a webhook actually came from LakeSail. See Notifications.
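Verifying a webhook signature on the receiving side can be sketched with Python's standard `hmac` module. Two things here are assumptions not stated in the definition: that the value after `sha256=` is hex-encoded, and that you hold a shared signing secret for the webhook channel. The function name is hypothetical.

```python
import hashlib
import hmac


def verify_webhook_signature(secret: bytes, raw_body: bytes, header_value: str) -> bool:
    """Check a LakeSail-Signature header value against the raw request body."""
    prefix = "sha256="
    if not header_value.startswith(prefix):
        return False
    received = header_value[len(prefix):].encode("utf-8")
    # Assumption: the digest is hex-encoded.
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest().encode("utf-8")
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, received)
```

Compute the HMAC over the exact bytes received, before any JSON parsing or re-serialization, since even a whitespace change produces a different digest.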