🏆 Highlights
- common subexpression elimination (#9632)
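As a rough illustration of what common subexpression elimination buys, here is a minimal pure-Python sketch. It is not Polars code and does not reflect how the optimizer is implemented: the tuple encoding of expressions and the `evaluate` helper are assumptions made for this example. A subexpression that appears in several output expressions is computed once and then looked up by its hash.

```python
# Toy expressions are nested tuples (hypothetical encoding, for illustration only):
#   ("col", name), ("lit", value), ("add", left, right), ("mul", left, right)

def evaluate(expr, row, cache, stats):
    """Evaluate a tuple-encoded expression, reusing cached subexpression results."""
    if expr in cache:  # the dict lookup hashes the tuple, then checks equality
        stats["hits"] += 1
        return cache[expr]
    op = expr[0]
    if op == "col":
        value = row[expr[1]]
    elif op == "lit":
        value = expr[1]
    elif op == "add":
        value = evaluate(expr[1], row, cache, stats) + evaluate(expr[2], row, cache, stats)
    elif op == "mul":
        value = evaluate(expr[1], row, cache, stats) * evaluate(expr[2], row, cache, stats)
    else:
        raise ValueError(f"unknown operation: {op!r}")
    cache[expr] = value
    return value

# Both output expressions share the subexpression a * 2.
shared = ("mul", ("col", "a"), ("lit", 2))
exprs = [("add", shared, ("lit", 1)), ("add", shared, ("col", "b"))]

row = {"a": 3, "b": 10}
cache, stats = {}, {"hits": 0}
results = [evaluate(e, row, cache, stats) for e in exprs]
print(results, stats)  # [7, 16] {'hits': 1}
```

The second expression finds `a * 2` in the cache instead of recomputing it, which is the effect this optimization aims for on whole columns.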
💥 Breaking changes
- remove deprecated tz_localize, rename CastTimezone to ReplaceTimeZone (#10070)
⚠️ Deprecations
- renaming `approx_unique` as `approx_n_unique` (#10290)
- remove/deprecate cache and its logic (#10066)
- Add `date_ranges`/`time_ranges` expression functions (#10005)
🚀 Performance improvements
- pre-alloc int_ranges (#10399)
- use hash as CSE Identifier (#10385)
- re-use regex capture allocation (#10302) (#10335)
- don't parallelize literal expressions (#10321)
- fix O(n^2) in sorted check during append (#10241)
- speedup mode on sorted data (#10084)
- speedup boolean apply (#10073)
- shrink alp/lp ~2.5x (#10039)
- Remove fused arithmetic from expressions with literals (#10011)
✨ Enhancements
- quote style option for csv writer (#10422)
- add "raise_if_empty" flag to `read_excel`, `read_csv`, `scan_csv`, and `read_csv_batched` (#10409)
- be more permissive on predicate pushdown to left side of left join (#10442)
- add `use_earliest` to `to_datetime`/`strptime` (#10426)
- {any/all}_horizontal to expression architecture (#10412)
- serialize flags (#10140)
- allow unaligned pointers in arrow FFI (#10403)
- add line_terminator option to write_csv (#10373)
- Add `is_local` and `to_local` to categorical namespace (#10372)
- cse for groupby.agg and reduced cse collisions (#10381)
- re-use regex capture allocation (#10302) (#10335)
- Add `Series.cat.uses_lexical_ordering` (#10325)
- improve datetime parsing error message (#10332)
- allow sequential runners in select/with_columns (#10322)
- improve err msg parsing `time`, `date`, `datetime` (#10298)
- Add `str.extract_groups` (#10179)
- add extra build profiles (#10268)
- Extend `datetime` expression function with time zone/time unit parameters (#10235)
- added gcs to gcp cloud schema in polars-core::cloud #10206. (#10207)
- support writing duration type in json (#10112)
- inline `lit(Series).cast(..)` to `lit(Series.cast(..))` (#10092)
- Move transpose naming to Rust (#10009)
- cse in groupby's (#10062)
- Adds sql `CASE` statement expressions (#10065)
- Add `date_ranges`/`time_ranges` expression functions (#10005)
- comm_subexpr_elim in streaming 'select/with_columns' (#10050)
- common subexpression elimination (#9632)
- Let qcut create evenly spaced probabilities (#9960)
- sorted flag on singletons (#9933)
- maintain sorted flag after partition_by (#9944)
- keep sorted flag in streaming left join (#9932)
- Add cloudpickle for serializing python UDFs (#9921)
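For the `qcut` entry above (#9960), the evenly spaced probabilities can be pictured with a small pure-Python sketch. This is not Polars code: the helper name `evenly_spaced_probabilities` is hypothetical, and the sketch assumes that asking for n bins means n - 1 interior break points at i/n.

```python
def evenly_spaced_probabilities(n_bins: int) -> list[float]:
    """Return the n_bins - 1 interior quantile probabilities for equal-sized bins."""
    if n_bins < 1:
        raise ValueError("n_bins must be a positive integer")
    # Break points sit at 1/n, 2/n, ..., (n-1)/n.
    return [i / n_bins for i in range(1, n_bins)]

quartile_probs = evenly_spaced_probabilities(4)
print(quartile_probs)  # [0.25, 0.5, 0.75]
```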
🐞 Bug fixes
- Fix incorrect handling of VisitRecursion::Skip. (#10452)
- fix negative decimal parsing (#10444)
- ensure sorted_sink hash equals the default path (#10464)
- fix sum agg (#10459)
- ensure last aggregation deals with default chunk (#10453)
- fix cse input schema (#10450)
- fix list groupby of array dtype (#10408)
- correct AnyValue::hash (#10391)
- finalize cast in partitioned groupby (#10359)
- fix oob in 'last' (#10329)
- fix categorical lexical sort (#10318)
- Fix join validation (#10257)
- Set correct dtype for `.extract_groups()` (#10306)
- clear window cache and run windows on proper runners (#10303)
- fix sorted fast path in streaming groupby wrt nulls (#10289)
- fix nan aggregation in groupby (#10287)
- check dtypes of single-column 'by' parameter in asof-join (#10284)
- fix pyo3 link errors on macos (#10256)
- fix empty streaming parquet file (#10252)
- fix logical columns of streaming multi-column sort (#10250)
- fix date/datetime parsing for short inputs with exact=False (#10231)
- correct agg_sum for ChunkedArray. (#10243)
- don't panic in wildcard apply (#10240)
- fix cse profile (#10239)
- correct struct null counts (#10142)
- no cse in groupby until fixed (#10216)
- fix `is_in` on empty series (#10195)
- fix cse windows (#10197)
- block predicate pushdown is_in and null producing … (#10194)
- prevent re-ordering of dict keys inside `.apply` (#10172)
- initialize fixed null values (#10192)
- ensure window function run partitioned when cse is hit (#10170)
- adjust for null values in str.replace fast path (#10132)
- clear bit settings in list iteration (#10131)
- use row-encoded for struct::is_sorted (#10129)
- don't run file-caching in streaming mode (#10117)
- Allow initialization of pl.Array in DataFrame using schema alone (#10100)
- don't panic if masked out values are invalid in temporal kernels (#10114)
- Fix struct get field by index out of bounds error. (#10097)
- fix ub in simd-json (#10093)
- fix invalid access when groupby rolling produces empty sets (#10109)
- respect `null_on_oob=False` in `list.take` when pa… (#10105)
- fix is_sorted for structs (#10099)
- add file path to io error in scan_csv (#10076)
- fix false positive in parquet stats evaluation (#10087)
- fix error message from cast-timezone to replace-time-zone (#10089)
- Address `.col(regex).exclude()` operations not executing. (#10025)
- fix Boolean::isin(null values) (#10074)
- predicate pushdown #10058 (#10071)
- Fix weighted quantile for 0 weights (#10051)
- fix incorrect state in projection pushdown with joins (#9987)
- don't pass predicates referring to renamed literal… (#9965)
- fix regression in regex expansion (#9952)
- potential SO in csv infer schema (#9950)
- raise on unsupported transpose and object types (#9946)
- Fix as-of join when `by` groups are interleaved (#9938)
🛠️ Other improvements
- fix and run polars-plan tests (#10465)
- Simplify flag methods (#10429)
- match_block_trailing_comma (#10414)
- implement ChunkArray::(try_)from_chunk_iter (#10395)
- add test for 10401 (#10405)
- Bump some dependencies (#10396)
- Move dependency version info to workspace level (#10295)
- patch reedline until fix released (#10382)
- remove wasm-timer dependency (#10347)
- write down invariants of ChunkedArray (#10334)
- fix typo in lib.rs (#10313)
- Exclude examples from workspace default (#10309)
- Update CODEOWNERS (#10261)
- avoid outputting docs of dependencies (#10292)
- Do not keep history in `gh-pages` branch (#10282)
- Use workspace package info / organize dependencies section (#10279)
- fix dead links in Rust documentation (#10251)
- Fix `make pre-commit` command (#10205)
- Fix `make integration-tests` command (#10202)
- Replace "question" issues with link to Stack Overflow (#10230)
- Update dependabot config (#10222)
- Fix LICENSE symlink for moved crates (#10150)
- Re-organize folder structure for Rust crates (#10141)
- update to rustc nightly-2023-07-27 (#10139)
- temporarily turn off fail-fast so that ubuntu tests run (#10133)
- Refactor `when`/`then`/`otherwise` internals (#9922)
- move replace_time_zone to polars-ops (#10078)
- remove unneeded branch (#10082)
- remove deprecated tz_localize, rename CastTimezone to ReplaceTimeZone (#10070)
- fix typo in contribution example (#10038)
- correct example in API reference (#10032)
- add developer contribution examples (#10013)
- Update autolabeler again (#9984)
- fix docs build and add to CI (#9904)
- Minor makeover for Rust Makefile (#9874)
Thank you to all our contributors for making this release possible!
@0xbe7a, @CanglongCl, @JulianCologne, @MarcoGorelli, @OndrejSlamecka, @OneRaynyDay, @SeanTroyUWO, @StefanBRas, @TLouf, @alexander-beedie, @c-peters, @cjackal, @cmdlineluser, @dependabot, @dependabot[bot], @drgif, @duvenagep, @eltociear, @fsimkovic, @ion-elgreco, @jonashaag, @lfn3, @magarick, @mcrumiller, @orlp, @potzenhotz, @rea1bacon, @reswqa, @rikkaka, @ritchie46, @stinodego, @thomasaarholt, @varunmittal91 and @zundertj