
Connect a catalog

A catalog is how LakeSail discovers your tables. It's a pointer — LakeSail doesn't copy your data. This guide walks through connecting each supported provider.

Prerequisites

Prerequisites depend on the provider:

  • AWS Glue — an active LakeSail cloud account connected to the AWS account where the Glue catalog lives. LakeSail authenticates through the cross-account role, so no separate credentials are needed. If you don't have a cloud account yet, see Set up a cluster for the cloud-account flow.
  • Iceberg REST, Unity Catalog, OneLake — just the catalog's URL and an auth token. No LakeSail-side prerequisites.
  • Memory — none.

You don't need a cluster to create a catalog. You'll need one to run anything against the catalog.

Open the catalog modal

  1. Open Catalogs in the sidebar.
  2. Click Create catalog.
  3. Fill in the shared fields:
    • Catalog Name — e.g. production-catalog.
    • Teams — which teams can use this catalog. You can leave this empty and add teams later.
    • Provider — one of the options below. The rest of the form changes based on your choice.
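
As a rough summary of how the form changes per provider, the sketch below lists the fields this guide does not mark as optional and checks a set of values against them. The snake_case field names, the provider keys, and the missing_fields helper are hypothetical labels chosen for illustration, not LakeSail API identifiers, and treating every unmarked field as required is an assumption.

```python
# Fields this guide does not mark "(optional)", keyed by provider.
# All identifiers here are hypothetical, chosen only for illustration.
REQUIRED_FIELDS = {
    "aws_glue": {"aws_region"},
    "iceberg_rest": {"catalog_uri"},  # token type depends on the catalog
    "unity_catalog": {"server_uri", "personal_access_token"},
    "onelake": {"onelake_url", "bearer_token"},
    "memory": {"initial_database"},
}


def missing_fields(provider: str, values: dict) -> list:
    """Return the sorted names of required fields that are empty or absent."""
    if provider not in REQUIRED_FIELDS:
        raise ValueError(f"unknown provider: {provider!r}")
    return sorted(
        name for name in REQUIRED_FIELDS[provider] if not values.get(name)
    )
```

For example, missing_fields("unity_catalog", {"server_uri": "https://example.com"}) reports that the personal access token is still missing.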

Provider: AWS Glue

Use this when your tables are registered in the AWS Glue Data Catalog in the same cloud account you connected to LakeSail.

  • AWS Region — e.g. us-east-1. Must match the region where your Glue catalog lives.
  • Default Database (optional) — if set, queries that don't qualify a database name use this one. Populates SAIL_CATALOG__DEFAULT_DATABASE for jobs.
  • Custom Endpoint URL (optional) — only if you route Glue through a VPC endpoint or a non-standard URL.

Nothing to paste for credentials — LakeSail uses the IAM role from your cloud account.
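
The Default Database behavior can be pictured with a short sketch. It assumes a job reads SAIL_CATALOG__DEFAULT_DATABASE from its environment and applies it to unqualified table names; the qualify_table helper is hypothetical and only illustrates the rule, not how LakeSail resolves names internally.

```python
import os


def qualify_table(name: str, env=os.environ) -> str:
    """Return "database.table", filling in the default database if needed."""
    if "." in name:
        return name  # already qualified, leave it untouched
    default_db = env.get("SAIL_CATALOG__DEFAULT_DATABASE")
    if not default_db:
        raise ValueError(
            f"table {name!r} has no database and no default database is set"
        )
    return f"{default_db}.{name}"
```

With the variable set to analytics, qualify_table("orders") yields analytics.orders, while sales.orders passes through unchanged.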

Provider: Iceberg REST

Use this for any Iceberg REST Catalog — Tabular, LakeFS, Polaris, a self-hosted Nessie, or the REST-compatible endpoint of a warehouse.

  • Catalog URI — e.g. https://catalog.example.com. The REST endpoint.
  • Warehouse (optional) — e.g. s3://my-bucket/warehouse. Sets the default warehouse location.
  • Prefix (optional) — path prefix appended to the URI.
  • OAuth Access Token or Bearer Access Token — whichever your catalog uses. If one is already configured and you're editing, leave blank to keep the existing value.
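
Before saving, it can help to see how these values map onto requests. The sketch below follows the Iceberg REST specification, where the configuration endpoint is /v1/config (with an optional warehouse query parameter) and namespace routes sit under /v1/{prefix}/. The function names are hypothetical, and how LakeSail itself combines the fields is not described in this guide.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def config_url(catalog_uri: str, warehouse: Optional[str] = None) -> str:
    """URL of the catalog's configuration endpoint."""
    base = catalog_uri.rstrip("/") + "/v1/config"
    if warehouse:
        return base + "?" + urlencode({"warehouse": warehouse})
    return base


def namespaces_url(catalog_uri: str, prefix: Optional[str] = None) -> str:
    """URL that lists namespaces, with the optional prefix after /v1/."""
    parts = [catalog_uri.rstrip("/"), "v1"]
    if prefix:
        parts.append(quote(prefix.strip("/"), safe="/"))
    parts.append("namespaces")
    return "/".join(parts)


def auth_headers(token: str) -> dict:
    """Headers for a catalog that accepts a bearer token."""
    return {"Authorization": f"Bearer {token}"}
```

Sending a GET to config_url(...) with auth_headers(...) is one way to confirm the URI and token outside LakeSail before filling in the form.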

Provider: Unity Catalog

For Databricks Unity Catalog.

  • Server URI — your Unity metastore URL.
  • Default Catalog (optional) — e.g. main.
  • Personal Access Token — the PAT the catalog authenticates with. Leave blank on edit to keep the existing one.
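
To check a Server URI and token pair independently of LakeSail, one option is the Unity Catalog REST endpoint that lists catalogs. The sketch below only builds the request. It assumes the Server URI is the server root (without an /api/... path) and that the token is sent as a bearer token; list_catalogs_request is a hypothetical helper name.

```python
import urllib.request


def list_catalogs_request(server_uri: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists Unity catalogs."""
    url = server_uri.rstrip("/") + "/api/2.1/unity-catalog/catalogs"
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

Passing the result to urllib.request.urlopen sends it; a successful response suggests the URL and token are usable, and the returned catalog names show valid values for Default Catalog.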

Provider: OneLake (Microsoft Fabric)

  • OneLake URL — e.g. https://<account>.dfs.fabric.microsoft.com.
  • Bearer Token — auth token for the Fabric workspace.

Provider: Memory

An in-memory catalog, useful for ephemeral workloads and testing. Supply only:

  • Initial Database — e.g. default.
  • Database Comment (optional).

After creating

The catalog appears in the Catalogs list. Jobs and queries can now select it. Tables inside the catalog show up automatically — LakeSail doesn't require you to register them individually.

Provisioned vs. configured catalogs

The providers above are configured catalogs — you already have the underlying service, and LakeSail connects to it. If you want LakeSail to provision a fresh data-catalog service inside your AWS account, use Settings → Cloud Catalogs instead. The lifecycle there mirrors networks and clusters (pending → provisioning → active).
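
If you automate around that lifecycle, a polling loop along these lines is one way to wait for a provisioned catalog. It is a minimal sketch: get_status stands in for whatever call returns the current state (a hypothetical callable, not a LakeSail function), and only the three states named above are modeled, so any other value is treated as an error.

```python
import time
from typing import Callable

# States in the order this guide lists them.
LIFECYCLE = ("pending", "provisioning", "active")


def wait_until_active(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> int:
    """Poll until the status is "active"; return how many polls it took."""
    furthest = 0
    for polls in range(1, max_polls + 1):
        status = get_status()
        if status not in LIFECYCLE:
            raise RuntimeError(f"unexpected status: {status!r}")
        position = LIFECYCLE.index(status)
        if position < furthest:
            raise RuntimeError(f"status moved backwards to {status!r}")
        furthest = position
        if status == "active":
            return polls
        time.sleep(poll_seconds)
    raise TimeoutError(f"not active after {max_polls} polls")
```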

API reference

  • Catalogs — create, describe, update, delete catalogs and their team assignments.