Polars: py-0.20.20 Release

Release date: April 13, 2024
Previous version: py-0.20.19 (released April 8, 2024)
Magnitude: 4,599 Diff Delta
Contributors: 20 total committers

50 Commits in this Release

Top Contributors in py-0.20.20

ritchie46
MarcoGorelli
alexander-beedie
mcrumiller
ChayimFriedman2
reswqa
JamesCE2001
TrevorWinstral
itamarst
nameexhaustion

Release Notes Published

🚀 Performance improvements

  • Fix cross join batch size when one of the DataFrames is tiny (#14347)
  • Fix binview growable complexity O(n*m) -> O(n) (#15628)
  • Remove extra thread spawn from row group fetcher (#15626)
  • Use vertical parallelism if input is chunked for Filter,Select,WithColumns (#15608)
  • Refactor CSV serialization to not go through AnyValue (#15576)
  • don't use dynamic dispatch in visitors (#15607)
  • Improve Bitmap construction performance (#15570)
  • join by row-encoding (#15559)

✨ Enhancements

  • add Expr.dt.add_business_days and Series.dt.add_business_days (#15595)
  • Add str.head and str.tail (#14425)
  • Add union/or operator for pl.Enum (#14965)
  • Extended BytecodeParser to handle additional math functions, and imports from the global namespace (#15627)
  • Push down is_between expressions to Arrow (#15180)
  • add holidays argument to business_day_count (#15580)
  • change default to write parquet statistics (#15597)
  • Expressify to_integer (#15604)
  • Optimizer; remove double SORT and redundant projections (#15573)
  • Add null_on_oob parameter to expr.array.get (#15426)
  • support weekend argument in business_day_count (#15544)
  • Enable is_first/last_distinct for not nested non-numeric list (#15552)
  • Turn off cse if cache node found (#15554)
  • Tag concat list as elementwise (#15545)
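
The business-day items above (add_business_days, plus the holidays and weekend arguments to business_day_count) can be pictured with a small pure-Python sketch. This is not Polars' implementation or API: the function names, the `weekend` tuple of weekday numbers, the half-open [start, end) counting convention, and the choice to reject a non-business start date are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Illustrative only: these helpers are hypothetical, not Polars functions.
# Weekdays follow date.weekday(): Monday is 0, Sunday is 6.


def is_business_day(d, holidays=frozenset(), weekend=(5, 6)):
    """Return True if d is neither a weekend day nor a listed holiday."""
    return d.weekday() not in weekend and d not in holidays


def add_business_days(start, n, holidays=frozenset(), weekend=(5, 6)):
    """Move n business days forward (or backward if n < 0) from start."""
    if not is_business_day(start, holidays, weekend):
        # Assumption of this sketch: no rolling, so reject such inputs.
        raise ValueError("start must be a business day in this sketch")
    step = timedelta(days=1 if n >= 0 else -1)
    current = start
    remaining = abs(n)
    while remaining:
        current += step
        if is_business_day(current, holidays, weekend):
            remaining -= 1
    return current


def business_day_count(start, end, holidays=frozenset(), weekend=(5, 6)):
    """Count business days in the half-open range [start, end)."""
    count = 0
    current = start
    while current < end:
        if is_business_day(current, holidays, weekend):
            count += 1
        current += timedelta(days=1)
    return count


friday = date(2024, 4, 12)
monday = date(2024, 4, 15)
next_day = add_business_days(friday, 1)                      # skips the weekend
after_holiday = add_business_days(friday, 1, holidays={monday})
week_count = business_day_count(date(2024, 4, 8), monday)    # Mon to Fri
```

Passing a different `weekend` tuple, for example `(4, 5)` for a Friday/Saturday weekend, changes which days are skipped in the same way a custom weekend argument would.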

🐞 Bug fixes

  • Return appropriate data type for time mean and median (#14471)
  • Fix issue in write_excel that could lead to incorrect spanning range determination (#15631)
  • Output correct dtype for mean_horizontal on a single column (#15118)
  • Recompute RowIndex schema after projection pd (#15625)
  • Mean of boolean in streaming group_by incorrectly always gave NULL (#15616)
  • Include cloud creds in cache key (#15609)
  • Fix elementwise-apply if any input is AggregatedScalar (#15606)
  • Explode list should take validity into account (#15572)
  • use larger recursive stack in debug mode (#15593)
  • SQL interface "off-by-one" indexing error with GROUP BY clauses that use position ordinals (#15584)
  • Enable missing features in polars-time (#15558)
  • Handle quoted identifiers when registering CTEs in the SQL engine (#15564)
  • Decompress moved out of schema initialization (#15550)
  • Turn off cse if cache node found (#15554)
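
For context on the GROUP BY fix above: SQL position ordinals are 1-based, so `GROUP BY 1` refers to the first item in the SELECT list, and mapping them onto a 0-based list is a classic place for an off-by-one slip. The sketch below only illustrates that convention. `resolve_group_by` is a hypothetical helper, not part of the Polars SQL engine, and the column names are made up.

```python
def resolve_group_by(select_items, group_by):
    """Map GROUP BY keys to SELECT items; integer keys are 1-based ordinals."""
    resolved = []
    for key in group_by:
        if isinstance(key, int):
            if not 1 <= key <= len(select_items):
                raise ValueError(f"GROUP BY position {key} is out of range")
            # Ordinal 1 is the first SELECT item, which sits at index 0.
            resolved.append(select_items[key - 1])
        else:
            # Non-integer keys are taken as column names and kept as-is.
            resolved.append(key)
    return resolved


select_items = ["a", "b", "SUM(c)"]
keys = resolve_group_by(select_items, [1, 2])
```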

📖 Documentation

  • Add legacy CPU install instructions in user guide (#13676)
  • Examples for errors (#13724)
  • Add docstring examples for reading json (#14481)
  • Add security warning in LazyFrame.deserialize() docstring (#15282)
  • Various minor updates to User Guide's SQL intro section (#15557)

🛠️ Other improvements

  • Replace most deprecated calls with bounded version (#15632)
  • use bound api (#15630)
  • Initial PyO3 0.21 support (#15622)
  • Don't run streaming group-by in partitionable gb (#15611)
  • perf(rust!, python): Unify sort with SortOptions and SortMultipleOptions (#15590)
  • Set up CodSpeed (#15537)

Thank you to all our contributors for making this release possible! @CanglongCl, @ChayimFriedman2, @Fokko, @JamesCE2001, @MarcoGorelli, @NedJWestern, @TrevorWinstral, @alexander-beedie, @deanm0000, @douglas-raillard-arm, @eitsupi, @filabrazilska, @i-aki-y, @itamarst, @leoforney, @mcrumiller, @nameexhaustion, @orlp, @ozgrakkurt, @reswqa, @ritchie46 and @stinodego