Data Formats
Sail supports various data formats for reading and writing.
You can use the SparkSession.read, DataFrame.write, and DataFrame.writeTo() APIs to load and save data in different formats. You can also use the CREATE TABLE SQL statement to create a table that refers to data stored in a specific format.
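As a rough illustration of the two approaches, the sketch below writes a small DataFrame to Parquet and reads it back, then registers a table over the same files with SQL. The helper names `round_trip_parquet` and `create_table_over_files`, the sample rows, and the `USING parquet LOCATION` clause are assumptions chosen for illustration; they follow standard Spark conventions and are not taken from this page.

```python
def round_trip_parquet(spark, path):
    """Write a tiny DataFrame to `path` as Parquet, then read it back.

    `spark` is an existing SparkSession (for example, one connected to Sail).
    """
    df = spark.createDataFrame(
        [(1, "a"), (2, "b")],
        schema="id INT, label STRING",
    )
    # DataFrame.write: choose the format, then save to a location.
    df.write.format("parquet").mode("overwrite").save(path)
    # SparkSession.read: choose the same format to load the data again.
    return spark.read.format("parquet").load(path)


def create_table_over_files(spark, table, path):
    """Create a table that refers to Parquet files at `path` (hypothetical helper)."""
    # Assumes the standard Spark SQL form of CREATE TABLE with a data source.
    spark.sql(f"CREATE TABLE {table} USING parquet LOCATION '{path}'")
    return spark.table(table)
```

Both helpers take the session as an argument, so they work with whatever `SparkSession` you have already created. Swapping `"parquet"` for another format name from the table should follow the same pattern, subject to that format's read and write support.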
The table below summarizes which data formats are supported (✅) and unsupported (❌) for reading and writing. Formats marked 🚧 are planned on our roadmap.
| Format | Read Support | Write Support |
|---|---|---|
| Delta Lake | ✅ (partial) | ✅ (partial) |
| Iceberg | ✅ (partial) | ✅ (partial) |
| Parquet | ✅ | ✅ |
| Binary (any file type) | ✅ | ❌ |
| CSV | ✅ | ✅ |
| JSON | ✅ | ✅ |
| Text | ✅ | ✅ |
| Avro | ✅ | ✅ |
| Protocol Buffers | 🚧 | 🚧 |
| Hudi | 🚧 | 🚧 |
| ORC | ❌ | ❌ |
