Data Formats
Sail supports various data formats for reading and writing.
You can use the `SparkSession.read`, `DataFrame.write`, and `DataFrame.writeTo()` APIs to load and save data in different formats. You can also use the `CREATE TABLE` SQL statement to create a table that refers to data stored in a specific format.
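As a rough illustration of these entry points, the sketch below writes a small DataFrame as Parquet, reads it back, and then defines a table over the same files. The function name `parquet_round_trip`, the table name `events`, the output path, and the `sc://localhost:50051` server address are all hypothetical placeholders rather than anything prescribed by Sail. The SQL uses the standard Spark `USING ... LOCATION` form, on the assumption that Sail accepts it for Parquet.

```python
def parquet_round_trip(spark, path, table_name="events"):
    """Write a tiny DataFrame as Parquet, read it back, and define a table over it.

    `spark` is an existing SparkSession; `path` and `table_name` are
    illustrative placeholders.
    """
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "label"])

    # DataFrame.write: save in Parquet format, replacing any existing output.
    df.write.mode("overwrite").parquet(path)

    # SparkSession.read: load the same files back as a DataFrame.
    loaded = spark.read.parquet(path)

    # CREATE TABLE: define a table that refers to the Parquet data at `path`.
    spark.sql(
        f"CREATE TABLE IF NOT EXISTS {table_name} USING parquet LOCATION '{path}'"
    )
    return loaded


def main():
    # Imported lazily so the helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    # Assumed address of a locally running Sail Spark Connect server.
    spark = SparkSession.builder.remote("sc://localhost:50051").getOrCreate()
    parquet_round_trip(spark, "/tmp/sail_example_parquet").show()
```

Calling `main()` requires a running server at the assumed address. The same pattern should apply to the other formats marked as supported in the table below, by swapping `parquet` for `csv` or `json` in the reader and writer calls.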
Here is a summary of the supported (✅) and unsupported (❌) data formats for reading and writing data. Formats marked 🚧 are not yet supported but are planned on our roadmap.
Format | Read Support | Write Support |
---|---|---|
CSV | ✅ | ✅ |
JSON | ✅ | ✅ |
Parquet | ✅ | ✅ |
Text | 🚧 | 🚧 |
Binary | 🚧 | 🚧 |
Avro | 🚧 | 🚧 |
Protocol Buffers | 🚧 | 🚧 |
ORC | ❌ | ❌ |
Delta Lake | ✅ | ✅ |
Iceberg | 🚧 | 🚧 |
Hudi | 🚧 | 🚧 |