Data Sources
Sail supports various data sources for reading and writing.
You can use the SparkSession.read, DataFrame.write, and DataFrame.writeTo() APIs to load and save data in different formats. You can also use the CREATE TABLE SQL statement to create a table that refers to a specific data source.
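As a minimal sketch of these entry points, the helpers below use the standard PySpark `DataFrameReader`, `DataFrameWriter`, and `DataFrameWriterV2` calls. The function names, paths, and table names are hypothetical, and the exact `CREATE TABLE ... USING ... LOCATION` clause is an assumption borrowed from Spark SQL rather than a statement of what Sail accepts for every format.

```python
def copy_dataset(spark, source_path, target_path,
                 source_format="csv", target_format="parquet",
                 read_options=None):
    """Read a dataset in one format and save it in another (hypothetical helper)."""
    # SparkSession.read returns a DataFrameReader; format() selects the data source.
    reader = spark.read.format(source_format).options(**(read_options or {}))
    df = reader.load(source_path)
    # DataFrame.write returns a DataFrameWriter; "overwrite" replaces existing output.
    df.write.format(target_format).mode("overwrite").save(target_path)
    return df


def save_as_table(df, table, table_format="delta"):
    """Save a DataFrame as a table through DataFrame.writeTo() (hypothetical helper)."""
    # writeTo() returns a DataFrameWriterV2; using() names the table's data source.
    df.writeTo(table).using(table_format).createOrReplace()


def create_table_statement(table, table_format, location):
    """Build a CREATE TABLE statement that refers to a data source.

    Assumes Spark SQL's USING/LOCATION syntax; inputs are not escaped,
    so pass trusted values only.
    """
    return f"CREATE TABLE {table} USING {table_format} LOCATION '{location}'"
```

With a PySpark session connected to a running Sail server (for example through `SparkSession.builder.remote(...)`, where the server address depends on your deployment), you might call `copy_dataset(spark, "/data/in.csv", "/data/out", read_options={"header": "true"})` and then run `spark.sql(create_table_statement("events", "parquet", "/data/out"))`.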
Here is a summary of the supported (✅) and unsupported (❌) data sources for reading and writing data. Features that are planned on our roadmap are marked with 🚧.
| Format | Read Support | Write Support |
|---|---|---|
| Delta Lake | ✅ | ✅ |
| Iceberg | ✅ | ✅ |
| Files (Parquet) | ✅ | ✅ |
| Files (CSV) | ✅ | ✅ |
| Files (JSON) | ✅ | ✅ |
| Files (Binary) | ✅ | ❌ |
| Files (Text) | ✅ | ✅ |
| Files (Avro) | ✅ | ✅ |
| Python | ✅ | ✅ |
| JDBC | ✅ | 🚧 |
| Hudi | 🚧 | 🚧 |
| Files (ORC) | 🚧 | 🚧 |
