Polars: rs-0.41.3 Release

Release date: July 2, 2024
Previous version: rs-0.41.2 (released June 24, 2024)
Magnitude: 6,632 Diff Delta
Contributors: 17 total committers

61 Commits in this Release

Ordered by the degree to which they evolved the repo in this version.

Top Contributors in rs-0.41.3

stinodego
alexander-beedie
coastalwhite
orlp
nameexhaustion
flisky
ritchie46
adamreeve
mcrumiller
wence-

Release Notes Published

🚀 Performance improvements

  • Improve unique performance by adding RangedUniqueKernel for primitive arrays (#17166)
  • Faster decode on Parquet HybridRLE (#17208)
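
To illustrate why a range-based kernel can speed up unique on primitive arrays, here is a minimal Python sketch. It is not the Polars implementation: it assumes integer values known to lie in a small range [lo, hi], and the name ranged_unique and its parameters are hypothetical. When the range is known, one flag per possible value replaces hashing.

```python
def ranged_unique(values, lo, hi):
    """Return distinct values in first-seen order.

    Assumption (illustrative only): every value v satisfies lo <= v <= hi.
    """
    seen = bytearray(hi - lo + 1)  # one flag per possible value in the range
    out = []
    for v in values:
        idx = v - lo               # position of v inside the range
        if not seen[idx]:
            seen[idx] = 1
            out.append(v)
    return out


result = ranged_unique([3, 1, 3, 2, 1], lo=1, hi=3)
```

The trade-off is memory proportional to the width of the range, so this approach only pays off when the range is small relative to the number of values.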

✨ Enhancements

  • Add SQL support for NATURAL joins and the COLUMNS function (#17295)
  • Add str.extract_many expression (#17304)
  • Support '%' in pathnames for async scan (#17271)
  • Support SQL Struct/JSON field access operators (#17226)
  • Exclude directories from glob expansion result (#17174)
  • Support SQL ORDER BY ALL syntax (#17212)
  • Support PostgreSQL ^@ ("starts with"), and ~~,~~*,!~~,!~~* ("like", "ilike") string-matching operators (#17251)
  • Support SQL SELECT * ILIKE wildcard syntax (#17169)
  • Support SQL temporal functions STRFTIME and STRPTIME, and typed literal syntax (#17245)
  • Support date/datetime for hive parts (#17256)
  • Expose some more information in translated expression IR to python (#17209)
  • Allow no-op round/ceil/floor on integer types (#17241)
  • Support loading from datasets where the hive columns are also stored in the file (#17203)
  • Implement serde for Null columns (#17218)
  • Support Decimal types in write_csv/write_json (#14209)
  • Improve SQL support for array indexing, increase test coverage (#16972)
  • Support reading byte stream split encoded floats and doubles in parquet (#17099)
  • Add float_scientific option to write_csv/sink_csv (#17111)
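
For readers unfamiliar with the byte stream split encoding mentioned above (#17099), the following is a minimal Python sketch of the layout as described in the Parquet format (BYTE_STREAM_SPLIT): byte i of every value is gathered into stream i, and the streams are concatenated. The function names are hypothetical and this is not Polars' reader code.

```python
import struct


def byte_stream_split_encode(values, fmt="f"):
    """Encode floats ('f') or doubles ('d') into K concatenated byte streams."""
    width = struct.calcsize("<" + fmt)  # 4 for float, 8 for double
    raw = [struct.pack("<" + fmt, v) for v in values]
    # Stream i holds byte i of each value, in value order.
    return b"".join(bytes(r[i] for r in raw) for i in range(width))


def byte_stream_split_decode(data, fmt="f"):
    """Reassemble values by taking one byte from each stream."""
    width = struct.calcsize("<" + fmt)
    n = len(data) // width              # number of encoded values
    out = []
    for j in range(n):
        raw = bytes(data[i * n + j] for i in range(width))
        out.append(struct.unpack("<" + fmt, raw)[0])
    return out


values = [1.0, -2.5, 0.25]  # exactly representable as 32-bit floats
encoded = byte_stream_split_encode(values)
decoded = byte_stream_split_decode(encoded)
```

Grouping bytes of equal significance together tends to make floating-point data more compressible by a general-purpose compressor applied afterwards.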

🐞 Bug fixes

  • Raise proper error for mismatching parquet schema instead of panicking (#17321)
  • Raise on invalid shape dataframe arithmetic (#17322)
  • Fix panic in window case (#17320)
  • Raise errors instead of panicking when sink_csv fails (#17313)
  • Raise if join keys are passed to cross join (#17305)
  • Don't null on oob in list.get for column index (#17276)
  • Fix issue where sliced PyArrow record batches were not handled correctly (#17058)
  • Don't oob on nulls in list.get (#17262)
  • Fix list getter with nulls (#17261)
  • Respect nulls_last parameter in aggregate sort_by (#17249)
  • Fix literal slice in group by (#17242)
  • Fix DataFrame.top_k not handling nulls correctly (#17239)
  • Avoid using the regex dependency when the regex feature is not used (#17206)
  • Properly check the BMI2 uleb128 (#17191)
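
As background for the uleb128 fix (#17191), here is a minimal Python sketch of plain scalar ULEB128 decoding. The fix itself concerns a BMI2-accelerated path, which is not shown; the function name decode_uleb128 and its signature are hypothetical.

```python
def decode_uleb128(buf, pos=0):
    """Decode one unsigned LEB128 integer from buf starting at pos.

    Returns (value, next_pos). Raises ValueError on truncated input.
    """
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated ULEB128 value")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if byte & 0x80 == 0:              # high bit clear marks the last byte
            return result, pos
        shift += 7


value, next_pos = decode_uleb128(bytes([0xE5, 0x8E, 0x26]))
```

The bounds check before each read is the kind of validation that matters here: without it, a malformed stream leads to an out-of-bounds read rather than a proper error.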

📖 Documentation

  • Minor layout/terminology improvement for selector set ops (#17299)
  • Fix polars-plan docs.rs build (#17266)
  • Add SQL docs for the CAST and TRY_CAST functions (#17214)

πŸ› οΈ Other improvements

  • Prefer ParquetError::oos to ParquetError::OutOfSpec (#17314)
  • Remove seqmacro and u8,u16 bitpack (#17290)
  • Fix typo in join validation error message (#17296)
  • Use typed iter in list.get (#17286)
  • Add ability to have pipeline blockers in new streaming engine (#17247)
  • Support date/datetime for hive parts (#17256)
  • Add elementwise select and with_columns to new streaming engine (#17185)
  • chrono's ParseErrorKind is now public (#17201)

Thank you to all our contributors for making this release possible! @IvanIsCoding, @JamesCE2001, @MarcoGorelli, @SeanTater, @adamreeve, @alexander-beedie, @coastalwhite, @datapythonista, @flisky, @itamarst, @jqnatividad, @lukeshingles, @mcrumiller, @nameexhaustion, @orlp, @ritchie46, @stinodego and @wence-