Polars: py-1.35.0-beta.1 Release

Release date: October 19, 2025
Previous version: py-1.34.0 (released October 2, 2025)
Magnitude: 14,102 Diff Delta
Contributors: 15 total committers

95 Commits in this Release

Top Contributors in py-1.35.0-beta.1

coastalwhite
nameexhaustion
orlp
kdn36
alexander-beedie
henryharbeck
ritchie46
math-hiyoko
pavelzw
thomasjpfan

Release Notes Published

🚀 Performance improvements

  • Address group_by_dynamic slowness in sparse data (#24916)
  • Push filters to PyIceberg (#24910)
  • Native filter/drop_nulls/drop_nans in group-by context (#24897)
  • Implement cumulative_eval using the group-by engine (#24889)
  • Prevent generation of copies of DataFrames in DslPlan serialization (#24852)
  • Implement native null_count, any and all group-by aggregations (#24859)
  • Speed up reverse in group-by context (#24855)
  • Prune unused categorical values when exporting to arrow/parquet/IPC/pickle (#24829)
  • Don't check duplicates on streaming simple projection in release mode (#24830)
  • Lower approx_n_unique to the streaming engine (#24821)
  • Duration/interval string parsing optimisation (2-5x faster) (#24771)
  • Use native reducer for first/last on Decimals, Categoricals and Enums (#24786)
  • Implement indexed method for BitMapIter::nth (#24766)
  • Pushdown slices on plans within unions (#24735)

✨ Enhancements

  • Add environment variable to roundtrip empty struct in Parquet (#24914)
  • Fast-count for scan_iceberg().select(len()) (#24602)
  • Add glob parameter to scan_ipc (#24898)
  • Prevent generation of copies of DataFrames in DslPlan serialization (#24852)
  • Add list.agg and arr.agg (#24790)
  • Implement {Expr,Series}.rolling_rank() (#24776)
  • Don't require PyArrow for read_database_uri if ADBC engine version supports PyCapsule interface (#24029)
  • Make Series init consistent with DataFrame init for string values declared with temporal dtype (#24785)
  • Support MergeSorted in CSPE (#24805)
  • Duration/interval string parsing optimisation (2-5x faster) (#24771)
  • Recursively apply CSPE (#24798)
  • Add streaming engine per-node metrics (#24788)
  • Add arr.eval (#24472)
  • Drop PyArrow requirement for non-batched usage of read_database with the ADBC engine and support iter_batches with the ADBC engine (#24180)
  • Improve rolling_(sum|mean) accuracy (#24743)
  • Add separator to {Data,Lazy}Frame.unnest (#24716)
  • Add union() function for unordered concatenation (#24298)
  • Add name.replace to the set of column rename options (#17942)
  • Support np.ndarray -> AnyValue conversion (#24748)
  • Allow duration strings with leading "+" (#24737)
  • Drop now-unnecessary post-init "schema_overrides" cast on DataFrame load from list of dicts (#24739)
  • Add support for UInt128 to pyo3-polars (#24731)
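
Two entries in these notes concern the accuracy of floating-point sums: the rolling_(sum|mean) item above (#24743) and the bug fix that uses Kahan summation for in-memory groupby sum/mean (#24774). The notes do not say which technique #24743 uses. As a rough plain-Python sketch of Kahan (compensated) summation in general, and not the Polars implementation, the idea looks like this; kahan_sum and naive_sum are hypothetical helper names chosen for illustration:

```python
import math


def naive_sum(values):
    """Plain left-to-right float accumulation; rounding error can build up."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Compensated summation: carry the low-order bits lost at each step."""
    total = 0.0
    compensation = 0.0  # running estimate of the rounding error so far
    for x in values:
        y = x - compensation            # re-inject previously lost bits
        t = total + y                   # low-order bits of y may be lost here
        compensation = (t - total) - y  # recover what was just lost
        total = t
    return total


if __name__ == "__main__":
    data = [0.1] * 1_000_000
    reference = math.fsum(data)  # correctly rounded sum from the stdlib
    print("naive error:", abs(naive_sum(data) - reference))
    print("kahan error:", abs(kahan_sum(data) - reference))
```

The compensation term costs a few extra floating-point operations per element, which is the usual trade-off for keeping the error roughly independent of the number of values summed.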

🐞 Bug fixes

  • Properly release the GIL for read_parquet_metadata (#24922)
  • Broadcast partition_by columns in over expression (#24874)
  • Clear index cache on stacked df.filter expressions (#24870)
  • Fix 'explode' mapping strategy on scalar value (#24861)
  • Fix repeated with_row_index() after scan() silently ignored (#24866)
  • Correctly return min and max for enums in groupby aggregation (#24808)
  • Refactor BinaryExpr in group_by dispatch logic (#24548)
  • Fix aggstate for gather (#24857)
  • Keep scalars for length preserving functions in group_by (#24819)
  • Have range feature depend on dtype-array feature (#24853)
  • Fix duplicate select panic (#24836)
  • Inconsistency of list.sum() result type with None values (#24476)
  • Division by zero in Expr.dt.truncate (#24832)
  • Potential deadlock in __arrow_c_stream__ (#24831)
  • Allow double aggregations in group-by contexts (#24823)
  • Series.shrink_dtype for i128/u128 (#24833)
  • Fix dtype in EvalExpr (#24650)
  • Allow aggregations on AggState::LiteralScalar (#24820)
  • Dispatch to group_aware for fallible expressions with masked out elements (#24815)
  • Fix error for arr.sum() on small integer Array dtypes containing nulls (#24478)
  • Fix regression on write_database() to Snowflake due to unsupported string view type (#24622)
  • Fix XOR not following Kleene logic when one side is unit-length (#24810)
  • Make Series init consistent with DataFrame init for string values declared with temporal dtype (#24785)
  • Incorrect precision in Series.str.to_decimal (#24804)
  • Use overlapping instead of rolling (#24787)
  • Fix iterable on dynamic_group_by and rolling object (#24740)
  • Use Kahan summation for in-memory groupby sum/mean (#24774)
  • Release GIL in PythonScan predicate evaluation (#24779)
  • Type error in bitmask::nth_set_bit_u64 (#24775)
  • Add Expr.sign for Decimal datatype (#24717)
  • Correct str.replace with missing pattern (#24768)
  • Ensure schema_overrides is respected when loading iterable row data (#24721)
  • Support decimal_comma on Decimal type in write_csv (#24718)
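
The XOR entry above (#24810) refers to Kleene three-valued logic, where a null operand means "unknown". For XOR, an unknown operand always makes the result unknown, and a unit-length operand is expected to broadcast against the longer side. The following plain-Python sketch illustrates those semantics only; it is not Polars code, kleene_xor and xor_columns are hypothetical names, and the release notes do not state what the buggy behaviour returned:

```python
from typing import List, Optional


def kleene_xor(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """XOR under Kleene logic: None (unknown) on either side gives None."""
    if a is None or b is None:
        return None
    return a != b


def xor_columns(
    left: List[Optional[bool]], right: List[Optional[bool]]
) -> List[Optional[bool]]:
    """Element-wise Kleene XOR, broadcasting a unit-length side."""
    if len(left) == 1:
        left = left * len(right)
    elif len(right) == 1:
        right = right * len(left)
    if len(left) != len(right):
        raise ValueError("lengths must match or one side must be unit-length")
    return [kleene_xor(a, b) for a, b in zip(left, right)]


if __name__ == "__main__":
    # A unit-length null on one side should make every result unknown.
    print(xor_columns([True, False, None], [None]))
    # A unit-length True flips known values and keeps nulls as nulls.
    print(xor_columns([True], [True, False, None]))
```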

📖 Documentation

  • Add partitioning examples for sink_* methods (#24918)
  • Add more {unique,value}_counts examples (#24927)
  • Indent the versionchanged (#24783)
  • Relax fsspec wording (#24881)
  • Add pl.field into the api docs (#24846)
  • Fix duplicated article in SECURITY.md (#24762)
  • Document output name determination in when/then/otherwise (#24746)
  • Specify that precision=None becomes 38 for Decimal (#24742)
  • Mention polars[rt64] and polars[rtcompat] instead of u64-idx and lts-cpu (#24749)
  • Fix source mapping (#24736)

📦 Build system

  • Update pyo3 and numpy crates to version 0.26 (#24760)

πŸ› οΈ Other improvements

  • Re-use iterators in set_ operations (#24850)
  • Remove GroupByPartitioned and dispatch to streaming engine (#24903)
  • Turn element() into {A,}Expr::Element (#24885)
  • Pass ScanOptions to new_from_ipc (#24893)
  • Update tests to be index type agnostic (#24891)
  • Unset Context in Window expression (#24875)
  • Fix failing delta test (#24867)
  • Move FunctionExpr dispatch from plan to expr (#24839)
  • Fix SQL test giving wrong error message (#24835)
  • Consolidate dtype paths in ApplyExpr (#24825)
  • Add days_in_month to documentation (#24822)
  • Enable ruff D417 lint (#24814)
  • Turn pl.format into proper elementwise expression (#24811)
  • Fix remote benchmark by no longer saving builds (#24812)
  • Refactor ApplyExpr in group_by context on multiple inputs (#24520)
  • IR text plan graph generator (#24733)
  • Temporarily pin pydantic to fix CI (#24797)
  • Extend and rename rolling groups to overlapping (#24577)
  • Refactor DataType proptest strategies (#24763)
  • Add union to documentation (#24769)

Thank you to all our contributors for making this release possible! @JakubValtar, @Kevin-Patyk, @MarcoGorelli, @Object905, @alexander-beedie, @borchero, @cmdlineluser, @coastalwhite, @craigalodon, @dsprenkels, @eitsupi, @etrotta, @henryharbeck, @jordanosborn, @kdn36, @math-hiyoko, @nameexhaustion, @orlp, @pavelzw, @ritchie46, @thomasjpfan and @williambdean