Changelog
0.5.1
February 15, 2026
- Added support for `DataFrame.mergeInto()` in the Spark DataFrame API (#1273).
- Added support for the `TABLESAMPLE` clause for SQL queries (#1332).
- Added support for the PySpark Python Data Source API and batch data reader (#1291, #1336, #1353, and #1374).
- Added support for partition transforms in the Spark `DataFrameWriterV2` API (#1307).
- Added support for subquery expressions in the Spark DataFrame API (#1289 and #1356).
- Added support for the `DESCRIBE TABLE` SQL statement (#1364).
- Added support for geospatial types for Spark 4.1 (#1325).
- Added support for duplicated CTE names with shadowing behavior (#1331).
- Added support for the `TRY_CAST` expression in SQL queries (#1349).
- Added support for the following SQL functions (#1071, #1264, #1322, #1323, #1324, and #1347): `percentile`, `monotonically_increasing_id`, `histogram_numeric`, `regexp_substr`, `format_number`, and `format_string`.
- Improved the following SQL functions (#1204, #1286, #1327, #1329, and #1341): `sequence`, `ntile`, `concat`, `array_concat`, `concat_ws`, `date_diff`, `datediff`, `date_format`, `to_date`, `unix_timestamp`, `to_timestamp`, `try_to_timestamp`, and `from_unixtime`.
- Improved `ON` condition column resolution in semi joins and anti joins (#1357).
- Improved `DESCRIBE` SQL statement parsing (#1366).
- Improved the task stream logic in distributed query execution (#1367).
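As in Spark SQL, the TRY_CAST expression returns NULL when a cast fails instead of raising an error. A minimal Python sketch of that behavior for integer casts (illustrative only, not Sail code):

```python
# Illustrative sketch of TRY_CAST semantics: a cast that would fail
# yields None (SQL NULL) instead of raising an error.
def try_cast_int(value):
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

print(try_cast_int("42"))   # 42
print(try_cast_int("abc"))  # None
```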
Contributors
Huge thanks to @davidlghellin, @pomykalakyle, @santosh-d3vpl3x (first-time contributor), @james-willis (first-time contributor), and @wudihero2 for your contributions!
0.5.0
February 6, 2026
- Redesigned the control plane for distributed query execution (#1164, #1242, #1247, #1265, and #1280).
- Added support for system catalog (#1216).
- Added support for AWS Glue catalog (#1254 and #1279).
- Added support for OneLake catalog (#1217 and #1228).
- Added support for partition transforms for Iceberg REST catalog (#1269).
- Added support for the `SHOW CATALOGS` and `USE CATALOG` SQL statements (#1288).
- Improved Delta Lake integration (#1222).
- Improved Iceberg integration (#1169).
- Added support for the `CREATE TABLE ... AS SELECT ...` (CTAS) statement (#1236).
- Added support for the `inferSchema` option for the CSV data source (#1223).
- Added support for `DataFrame.colRegex()` in the Spark DataFrame API (#1243).
- Added support for `StructType.toDDL()` in the Spark DataFrame API (#1285).
- Added support for the following SQL functions (#1200, #1206, #1218, #1253, #1258, #1263, #1268, and #1276): `percentile_disc`, `to_json`, `regex_extract`, `soundex`, `randstr`, `array_sort` (without the lambda argument), `array_join`, `array_concat`, and `try_url_decode`.
- Improved the following SQL functions (#1186, #1187, #1252, #1256, #1257, #1260, #1262, #1277, and #1282): `json_array_length`, `get_json_object`, `json_object_keys`, `first_value`, `last_value`, `skewness`, `kurtosis`, `collect_set`, `collect_list`, `max_by`, `min_by`, `count_if`, `arrays_zip`, and `flatten`.
- Added support for negation and the `signum` SQL function for interval data types (#1275).
- Fixed issues with the `Column.try_cast()` method in the Spark DataFrame API to handle invalid date and timestamp values correctly (#1221).
- Improved the `DataFrame.randomSplit()` method in the Spark DataFrame API to ensure a deterministic order (#1235).
- Improved the join reorder optimizer (#1234).
- Added memory and disk configuration options (#1311).
- Added support for inferring the default catalog when only one catalog is configured (#1311).
Breaking Changes
- Python 3.9 is no longer supported, as it has reached its end-of-life (EOL) (#1302).
- The `SparkConnectServer.init_telemetry()` method was removed from the Python API (#1302 and #1319). OpenTelemetry is now initialized automatically when the first `SparkConnectServer` instance is created, and OpenTelemetry shutdown is registered as a Python `atexit` handler.
- The `cluster.worker_stream_buffer` configuration option was renamed to `cluster.task_stream_buffer` (#1309).
- The `cluster.job_output_buffer` configuration option was removed since it is no longer needed (#1316).
Contributors
Huge thanks to @davidlghellin, @pomykalakyle, @djouallah (first-time contributor), @fafacao86 (first-time contributor), and @wudihero2 (first-time contributor) for your contributions!
0.4.6
January 13, 2026
- Improved Delta Lake integration (#1146, #1159, #1158, and #1161).
- Improved the internals for session management (#1138).
- Added support for reading CSV files with truncated rows (#1185).
- Added the configuration option for default parallelism (#1198).
- Added the `percentile_cont` SQL aggregate function (#1188).
- Added support for non-literal expressions for map extraction (#1193).
- Added support for wildcards in the `struct` SQL function (#1197).
- Fixed an issue with null value handling in the `map_concat` SQL function (#1194).
- Updated the PySpark compatibility checker example and function support status (#1127).
- Updated the TPC-H benchmark example (#1179).
- Added support for Spark 4.1.1 (#1199).
Contributors
Huge thanks to @davidlghellin, @keen85, and @pomykalakyle (first-time contributor) for your contributions!
0.4.5
December 22, 2025
- Added basic support for the Delta Lake merge operation (#1093, #1133, #1139, and #1144).
- Improved distributed query execution (#1128, #1134, #1135, and #1137).
- Improved Spark Connect server logic (#1126 and #1140).
- Added support for removing sessions (#1125).
- Added support for metrics and checkpoints for Delta Lake (#1136).
- Improved OpenTelemetry metric reporting (#1119).
- Improved the following SQL functions (#1105): `make_dt_interval`, `make_interval`, `hex`, and `elt`.
- Updated Parquet configuration options (#1141).
- Updated the Spark Connect protocol for Spark 4.1 (#1145 and #1148).
- Fixed an issue with the `EXPLAIN` statement output (#1147).
Contributors
Huge thanks to @davidlghellin for your contributions!
0.4.4
December 12, 2025
- Improved Delta Lake and Iceberg integration (#1098, #1095, #1108, #1115, #1109, and #1117).
- Added support for exporting logs, metrics, and traces to OpenTelemetry collectors (#1097, #1104, and #1116).
- Added a Python example for reporting Sail compatibility for PySpark code (#1075).
- Added support for customizing pod labels for Sail workers in Kubernetes deployments (#1103).
- Added support for the following SQL functions (#1106): `shuffle`, `bitwise_not`, and `format_string`.
- Improved the output of the `EXPLAIN` statement (#1110).
- Fixed a few shuffle planning issues in distributed query execution (#1111).
- Fixed an issue with the `LIMIT` clause in distributed query execution (#1121).
- Improved the data source implementation (#1099).
Contributors
Huge thanks to @davidlghellin, @zemin-piao, @keen85 (first-time contributor), @YichiZhang0613 (first-time contributor), and @gstvg (first-time contributor) for your contributions!
0.4.3
November 26, 2025
- Added schema evolution support for Iceberg (#1048).
- Improved the following SQL functions (#1049, #1056, #1057, and #1077): `max_by`, `min_by`, `signum`, `greatest`, `least`, and `div`.
- Added support for `EXPLAIN` in SQL statements (#1078).
Contributors
Huge thanks to @davidlghellin for your contributions!
0.4.2
November 13, 2025
- Added support for column mapping for Delta Lake (#985).
- Added support for time travel for Iceberg (#1039).
- Added support for Unity Catalog (#1005).
- Improved Iceberg integration (#1006, #1009, and #1042).
- Added the `luhn_check` SQL function (#909).
- Improved the following SQL functions (#909 and #1024): `bit_count`, `bit_get`, `getbit`, `crc32`, `sha`, `sha1`, `expm1`, `pmod`, `width_bucket`, `bitmap_count`, and `to_date`.
- Added the `try_avg` SQL aggregate function (#1012).
- Added support for the `try_sum` and `try_avg` SQL aggregate functions in window expressions (#1040).
Contributors
Huge thanks to @davidlghellin for the contribution!
0.4.1
November 2, 2025
- Added support for writing partitioned Iceberg tables (#1003).
- Added the `try_sum` SQL aggregate function (#960).
- Fixed a filter pushdown performance issue (#1008).
Contributors
Huge thanks to @davidlghellin for the contribution!
0.4.0
October 29, 2025
- Added basic support for reading and writing Iceberg tables (#944, #987, #976, #994, and #997).
- Added support for Iceberg REST catalog (#961, #974, #993, and #995).
- Improved Delta Lake integration (#921).
- Added support for multiple arguments for the `count_distinct` SQL function (#957).
- Added a guide for HDFS Kerberos authentication (#992).
- Updated a few execution configuration options (#975).
- Fixed a cost estimation issue with the join reorder optimizer (#969).
Contributors
Huge thanks to @SparkApplicationMaster, @davidlghellin, and @zemin-piao (first-time contributor) for the contributions!
0.3.7
October 3, 2025
- Improved error reporting for the SQL parser (#938).
- Added support for the `DataFrame.unpivot()` method in the Spark DataFrame API (#948).
Contributors
Huge thanks to @SparkApplicationMaster for the continued contributions!
0.3.6
September 30, 2025
- Added support for the binary file format (#853).
- Implemented an experimental join reorder physical optimizer using the DPhyp algorithm (#810 and #917). This optimizer is not enabled by default but can be enabled via configuration options.
- Added support for file metadata caching to improve read performance for the Parquet data source (#928).
- Added support for the PySpark UDF `applyInArrow()` method in the Spark DataFrame API for grouped and cogrouped data (#886 and #887).
- Added support for time travel for Delta Lake (#854).
- Added support for the delete operation for Delta Lake (#856).
- Improved Delta Lake integration (#848 and #916).
- Added support for the following SQL functions (#820, #841, #824, #843, #835, #855, #859, and #860): `elt`, `inline`, `inline_outer`, `try_parse_url`, `stack`, `make_dt_interval`, `version`, `months_between`, `user`, and `session_user`.
- Improved the following SQL functions (#841, #847, #878, #920, #926, and #914): `array`, `try_multiply`, `map_from_arrays`, `map_from_entries`, and `approx_count_distinct`.
- Added support for using all aggregate functions in window expressions (#861).
- Fixed issues with sorting by aggregate expressions (#915).
- Fixed issues with session key generation when the user ID is missing on the Windows platform (#849).
- Continued the work for data streaming support (#832).
- Added batch view creation endpoints in the MCP server (#875).
- Added an example of using Kustomize with pod templates for Sail workers (#833).
- Fixed input repartitioning issues for PySpark UDTFs (#662).
- Fixed issues with the `DataFrame.replace()` method in the Spark DataFrame API (#891).
- Added support for the `REAL` data type in the SQL parser (#892).
- Fixed various literal parsing issues in the SQL parser (#868, #872, and #873).
- Fixed issues with PySpark UDFs with no arguments (#895).
Contributors
Huge thanks to @SparkApplicationMaster, @davidlghellin, and @rafafrdz for the continued contributions!
0.3.5
September 5, 2025
- Fixed issues with writing partitioned data to Delta Lake tables (#837).
- Improved type inference for `NULL` map values in the `VALUES` SQL clause (#829).
Contributors
Huge thanks to @SparkApplicationMaster for the continued contributions!
0.3.4
September 3, 2025
- Added support for the text file format (#737 and #813).
- Added support for the Spark DataFrame streaming API and added a few data sources/sinks for testing purposes (#751). This provides a foundation for streaming support in Sail but is not ready for general use yet.
- Improved the internals of the Delta Lake integration (#768 and #794).
- Improved idle session handling (#761 and #818).
- Fixed performance issues with the `DataFrame.show()` method in the Spark DataFrame API (#790).
- Fixed issues with reading and writing compressed files (#760).
- Fixed SQL parsing issues with negated predicates (#776).
- Fixed issues with the `DataFrame.withColumnsRenamed()` method in the Spark DataFrame API (#764).
- Fixed issues with the `DataFrame.withColumns()` method in the Spark DataFrame API (#814).
- Added support for the following SQL functions (#727, #682, #774, #777, #779, #787, #762, #795, and #798): `try_mod`, `make_interval`, `map_entries`, `map_from_entries`, `map_concat`, `str_to_map`, `width_bucket`, and `regexp_instr`.
- Improved the following SQL functions (#682, #767, #769, #777, #722, #785, #789, #795, #801, and #806): `try_add`, `try_divide`, `try_multiply`, `try_subtract`, `nth_value`, `median`, `map`, `map_from_arrays`, `split`, `element_at`, `try_element_at`, `position`, `locate`, `get_json_object`, `json_object_keys`, and `collect_list`.
- Improved a few window functions to return the correct types of integers (#765).
- Improved the implementation of array functions (#786).
- Improved the implementation of string functions (#798).
Contributors
Huge thanks to @SparkApplicationMaster, @davidlghellin, and @rafafrdz for the continued contributions!
0.3.3
August 14, 2025
- Fixed issues with physical planning to avoid performance degradation when querying Delta Lake tables (#750).
- Fixed issues with the `Catalog.getTable()` method in the Spark DataFrame API (#752).
- Added support for `NaN` values in `VALUES` (#739).
- Fixed issues with the `parquet.bloom_filter_on_write` configuration option not being respected (#735).
- Added support for the following SQL functions (#670 and #725): `try_to_number`, `convert_timezone`, and `make_timestamp_ltz`.
- Improved the following SQL functions (#725, #730, #734, #743, #754, and #756): `make_timestamp`, `from_utc_timestamp`, `to_utc_timestamp`, `skewness`, `kurtosis`, `ln`, `log`, `log10`, `log1p`, `log2`, `acos`, `acosh`, `asin`, `asinh`, `atan`, `atan2`, `atanh`, `cbrt`, `cos`, `cosh`, `cot`, `csc`, `degrees`, `exp`, `radians`, `sec`, `sin`, `sinh`, `sqrt`, `tan`, `tanh`, `json_array_length`, and `map_contains_key`.
Contributors
Huge thanks to @SparkApplicationMaster and @rafafrdz for the continued contributions!
0.3.2
August 8, 2025
- Added support for reading and writing Delta Lake tables (#578, #634, #677, #680, #716, #717, and #723).
- Added support for Azure storage services and Google Cloud Storage (GCS), and improved support for S3 (#616 and #706).
- Added support for file listing cache and file statistics cache (#709 and #712).
- Added support for the following SQL functions and operators (#529, #580, #633, #645, #638, #654, #539, #661, #629, #676, #672, #635, #683, #702, #698, #708, #713, and #719): `from_csv`, `bround`, `conv`, `csc`, `sec`, `bit_count`, `bit_get`, `getbit`, `shiftrightunsigned`, `>>>`, `~`, `array_insert`, `listagg`, `string_agg`, `parse_url`, `url_decode`, `url_encode`, `bitmap_bit_position`, `bitmap_bucket_number`, `bitmap_count`, `to_number`, `to_utc_timestamp`, `try_add`, `try_divide`, `try_multiply`, `try_subtract`, `monthname`, `arrays_zip`, `is_valid_utf8`, `try_validate_utf8`, `validate_utf8`, and `make_valid_utf8`.
- Added support for the `Column.try_cast()` method in the Spark DataFrame API (#694).
- Improved the following SQL functions (#609, #613, #619, #621, #623, #617, #640, #644, #642, #643, #647, #660, #666, #674, and #701): `date_part`, `datepart`, `extract`, `nullifzero`, `zeroifnull`, `array_contains`, `array_position`, `array_append`, `array_prepend`, `array_size`, `cardinality`, `size`, `array_agg`, `collect_set`, `flatten`, `arrays_overlap`, `concat`, `map`, `ltrim`, `rtrim`, `trim`, `avg`, and `to_unix_timestamp`.
- Fixed issues with the `DataFrame.na.drop()` and `DataFrame.dropna()` methods in the Spark DataFrame API (#693).
- Fixed issues with casting timestamp and interval values from and to numeric values (#691).
- Fixed incorrect eager execution behavior of the `CASE` expression (#649).
- Fixed issues with PySpark UDF and UDTF execution (#652 and #658).
- Fixed issues with expression naming (#668 and #685).
- Improved the implementation of SQL math functions (#699).
- Improved the internals of catalog management, data reader, and data writer (#592, #615, #628, #632, #681, #688, #705, and #707).
Contributors
Shoutout to @SparkApplicationMaster for contributions across bug fixes, features, and enhancements! Huge thanks to @rafafrdz, @davidlghellin, @anhvdq (first-time contributor), and @jamesfricker (first-time contributor) for helping to further extend our parity with Spark SQL functions!
0.3.1
July 7, 2025
- Added support for the following SQL functions (#570, #571, #582, #585, and #586): `dayname`, `nullifzero`, `zeroifnull`, `split` (partial support), `collect_set`, and `count_if`.
- Fixed issues with the `from_utc_timestamp` SQL function (#596).
- Added support for the `DataFrame.sampleBy()` method in the Spark DataFrame API (#547).
- Added support for the following SQL statements (#588): `SHOW COLUMNS`, `SHOW DATABASES`, `SHOW TABLES`, and `SHOW VIEWS`.
- Improved data source listing performance (#579).
- Improved the internal logic of data source options (#587 and #598).
- Updated gRPC server TCP and HTTP configuration (#593).
Contributors
Huge thanks to @SparkApplicationMaster for the first contributions related to SQL functions! Huge thanks to @davidlghellin for the continued contributions related to the Spark DataFrame API!
0.3.0
June 28, 2025
The 0.3.0 release introduces support for Spark 4.0 in Sail, alongside the existing support for Spark 3.5. One of the most notable changes in Spark 4.0 is the new `pyspark-client` package, a lightweight PySpark client. When using Sail in your PySpark applications, you can now choose to install this client package instead of the full `pyspark` package that includes all the JAR files.
Here is a summary of the new features and improvements in this release.
- Improved remote data access performance by caching object stores (#515).
- Added support for data reader and writer configuration (#466 and #535).
- Added support for the following SQL functions (#527): `crc32`, `sha`, and `sha1`.
- Fixed issues with casting integers to timestamps (#533).
- Fixed issues with the `random` and `randn` SQL functions (#530).
- Added support for the `DataFrame.sample()` method in the Spark DataFrame API (#496).
- Added support for Spark 4.0 (#467, #498, and #559).
- Updated the default value of a few configuration options (#565).
Breaking Changes
The `spark` extra has been removed from the `pysail` package. As a result, you can no longer use commands like `pip install "pysail[spark]"` to install Sail along with Spark. Instead, you must install the PySpark package separately in your Python environment.
This change allows you to freely choose the PySpark version when using Sail. Depending on your requirements, you can opt for either the `pyspark` package (Spark 3.5 or later) or the `pyspark-client` package (introduced in Spark 4.0).
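Under the new scheme, installation becomes two explicit steps. A sketch of the commands (the choice of client and any version pins are up to you):

```shell
# Install Sail without a bundled PySpark.
pip install pysail

# Then install one PySpark distribution of your choice:
pip install pyspark          # full package (Spark 3.5 or later)
# or
pip install pyspark-client   # lightweight client (Spark 4.0+)
```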
Contributors
We are thrilled by the growing interest from the community. Huge thanks to @rafafrdz, @davidlghellin, @lonless9, and @pimlie for making their first contributions to Sail!
0.2.6
May 14, 2025
- Improved temporal data type casting and display (#448).
- Corrected the time unit for reading `INT96` timestamp data from Parquet files (#444).
- Fixed issues with column metadata in the Spark DataFrame API (#447).
- Added support for referring to aliased aggregation expressions in Spark SQL `GROUP BY` and `HAVING` clauses (#456).
- Added support for more data formats and added directory listing endpoints in the MCP server (#455 and #458).
0.2.5
April 22, 2025
- Corrected Spark session default time zone configuration and fixed various issues for timestamp data types (#438).
- Improved object store setup and cluster mode task execution (#432).
0.2.4
April 10, 2025
- Improved MCP server logging (#421).
- Improved AWS S3 data access (#426).
- Added support for AWS credential caching (#430).
- Fixed issues with cluster mode task execution (#429).
- Added support for `exceptAll()` and `tail()` in the Spark DataFrame API (#417).
0.2.3
March 21, 2025
- Implemented MCP (Model Context Protocol) server (#410).
- Added support for the `hf://` protocol for reading Hugging Face datasets (#412).
- Added support for glob patterns in data source URLs (#415).
- Added support for a few data reader and writer options for CSV files (#414).
- Fixed a few issues with SQL temporary views (#413).
- Improved task error reporting in cluster mode (#409).
0.2.2
March 6, 2025
- Switched to the built-in SQL parser (#338, #358, #359, and #376).
- Added support for the majority of Spark SQL syntax (#378, #380, #382, #385, #387, #389, and #390).
- Expanded support for Spark SQL functions (#364, #384, and #391).
- Fixed issues with `join()` in the Spark DataFrame API (#392).
- Added support for `NATURAL JOIN` in Spark SQL (#396).
- Fixed an issue with SQL window expressions (#386).
- Fixed result parity issues with derived TPC-DS queries (#393).
0.2.1
January 15, 2025
- Added support for SQL table functions and lateral views (#326 and #327).
- Added support for PySpark UDTFs (#329).
- Improved literal and data type support (#317, #328, #330, and #339).
- Added support for `ANTI JOIN` and `SEMI JOIN` (#337).
- Fixed a few PySpark UDF issues (#343).
- Added support for nested fields in SQL (#340).
- Added support for more queries in the derived TPC-DS benchmark (#346).
- Added support for more datetime functions (#349).
0.2.0
December 3, 2024
We are excited to announce the first Sail release with the distributed processing capability. Spark SQL and DataFrame queries can now run on Kubernetes, powered by the Sail distributed compute engine. We also introduced a new Sail CLI and a configuration mechanism that will serve as the entrypoint for all Sail features moving forward.
We continued extending coverage for Spark SQL functions and the Spark DataFrame API. The changes are listed below.
- Added support for the following DataFrame and SQL functions (#278 and #305): `DataFrame.crosstab`, `DataFrame.replace`, `DataFrame.to`, `reverse`, `aes_decrypt`, `aes_encrypt`, `try_aes_decrypt`, `base64`, `unbase64`, and `weekofyear`.
- Added support for `mapInPandas()` and `mapInArrow()` for Spark DataFrames (#310).
- Added support for `applyInPandas()` for grouped and co-grouped Spark DataFrames (#313).
Breaking Changes
This release comes with the new Sail CLI, and the way to launch the Spark Connect server and PySpark shell is different from the 0.1.x versions. Please refer to the Getting Started page for the updated instructions.
0.1.7
November 1, 2024
- Expanded support for Spark DataFrame functions (#268 and #261). Added full parity and coverage for the following DataFrame and SQL functions: `DataFrame.summary`, `DataFrame.describe`, `DataFrame.corr`, `DataFrame.cov`, `DataFrame.stat`, `DataFrame.drop`, `corr`, and `regr_avgx`.
- Fixed most issues with `ORDER BY` in the derived TPC-DS benchmark, bringing total coverage to 74 out of the 99 queries (#261).
We also made significant changes to the Sail internals to support distributed processing. We are targeting the 0.2.0 release in the next few weeks for an MVP (minimum viable product) of this exciting feature. Please stay tuned! If you are interested in the ongoing work, you can follow #246 in our GitHub repository to get the latest updates!
0.1.6
October 23, 2024
0.1.5
October 17, 2024
- Expanded support for Spark SQL syntax and functions (#239 and #247). Added full parity and coverage for the following SQL functions: `current_catalog`, `current_database`, `current_schema`, `hash`, `hex`, `unhex`, `xxhash64`, and `unix_timestamp`.
- Fixed a few issues with `JOIN` (#250).
0.1.4
October 3, 2024
- Enabled Avro in DataFusion (#234).
- Expanded support for Spark SQL syntax and functions (#213 and #207). Added full parity and coverage for the following SQL functions: `array`, `date_format`, `get_json_object`, `json_array_length`, `overlay`, `replace`, `split_part`, `to_date`, `any_value`, `approx_count_distinct`, `current_timezone`, `first_value`, `greatest`, `last`, `last_value`, `least`, `map_contains_key`, `map_keys`, `map_values`, `min_by`, `substr`, and `sum_distinct`.
- Added support for HDFS (#196).
- Added support for parsing value prefixes followed by whitespace (#218 and lakehq/sqlparser-rs#6).
- Added basic support for Python UDAF (#214).
Contributors
Huge thanks to our first community contributor, @skewballfox, for adding support for HDFS!
0.1.3
September 18, 2024
- Added support for column positions in `GROUP BY` and `ORDER BY` (#205).
- Expanded support for `INSERT` statements (#195).
- Fixed issues with Spark configuration (#192).
- Expanded support for `CREATE` and `REPLACE` statements (#183).
- Added support for `GROUPING SETS` aggregation (#184).
- Integrated fastrace for more performant logging and tracing (#166).
- Enabled gzip and zstd compression in Tonic (#166).
0.1.2
September 10, 2024
- Fixed issues with aggregation queries.
- Extended support for SQL functions.
- Added support for temporary views and global temporary views.
0.1.1
September 03, 2024
- Extended support for SQL statements and SQL functions.
- Fixed a performance issue with the PySpark DataFrame `show()` method.
0.1.0
August 29, 2024
This is the first Sail release.
