Polars: py-1.34.0-beta.5 Release

Release date: October 1, 2025
Previous version: py-1.34.0-beta.4 (released September 28, 2025)
Magnitude: 965 Diff Delta
Contributors: 9 total committers

18 Commits in this Release

Top Contributors in py-1.34.0-beta.5

orlp
nameexhaustion
coastalwhite
DeflateAwning
JakubValtar
kdn36
ritchie46
eitsupi
alexander-beedie

Release Notes Published

🏆 Highlights

  • Add LazyFrame.{sink,collect}_batches (#23980)
  • Deterministic import order for Python Polars package variants (#24531)

🚀 Performance improvements

  • Pushdown filter with strptime if input is literal (#24694)
  • Avoid copying expanded paths (#24669)
  • Relax filter expr ordering (#24662)
  • Remove unnecessary groups call in aggregated (#24651)
  • Skip files in scan_iceberg with filter based on metadata statistics (#24547)
  • Push row_index predicate for all scan types (#24537)
  • Perform integer in-filtering for Parquet inequality predicates (#24525)
  • Stop caching Parquet metadata after 8 files (#24513)
  • Native streaming .mode() expression (#24459)

✨ Enhancements

  • Implement maintain_order for cross join (#24665)
  • Add support to output dt.total_{}() duration values as fractionals (#24598)
  • Avoid forcing a pyarrow dependency in read_excel when using the default "calamine" engine (#24655)
  • Support scanning from file:/path URIs (#24603)
  • Log which file the schema was sourced from, and which file caused an extra column error (#24621)
  • Add LazyFrame.{sink,collect}_batches (#23980)
  • Deterministic import order for Python Polars package variants (#24531)
  • Add support to display lazy query plan in marimo notebooks without needing to install matplotlib or mermaid (#24540)
  • Add unstable hidden_file_prefix parameter to scan_parquet (#24507)
  • Use fixed-scale Decimals (#24542)
  • Add support for unsigned 128-bit integers (#24346)
  • Add unstable pl.Config.set_default_credential_provider (#24434)
  • Roundtrip BinaryOffset type through Parquet (#24344)
  • Add opt-in unstable functionality to load interval types as Struct (#24320)
  • Support reading parquet metadata from cloud storage (#24443)
  • Add user guide section on AWS role assumption (#24421)
  • Support unique / n_unique / arg_unique for array columns (#24406)

🐞 Bug fixes

  • Make Categories pickleable (#24691)
  • Shift on array within list (#24678)
  • Fix handling of AggregatedScalar in ApplyExpr single input (#24634)
  • Support reading of mixed compressed/uncompressed IPC buffers (#24674)
  • Overflow in slice-slice optimization (#24658)
  • Package discovery for setuptools (#24656)
  • Add type assertion to prevent out-of-bounds in GenericFirstLastGroupedReduction (#24590)
  • Remove inclusion of polars dir in runtime sdist/wheel (#24654)
  • Method dt.month_end was unnecessarily raising when the month-start timestamp was ambiguous (#24647)
  • Widen from_dicts to Iterable[Mapping[str, Any]] (#24584)
  • Fix unsupported arrow type Dictionary error in scan_iceberg() (#24573)
  • Raise Exception instead of panic when unnest on non-struct column (#24471)
  • Include missing feature dependency from polars-stream/diff to polars-plan/abs (#24613)
  • Newline escaping in streaming show_graph (#24612)
  • Do not allow inferring (-1) the dimension on any Expr.reshape dimension except the first (#24591)
  • Sink batches early stop on in-memory engine (#24585)
  • More precisely model expression ordering requirements (#24437)
  • Panic in zero-weight rolling mean/var (#24596)
  • Decimal <-> literal arithmetic supertype rules (#24594)
  • Match various aggregation return types in the streaming engine with the in-memory engine (#24501)
  • Validate list type for list expressions in planner (#24589)
  • Fix scan_iceberg() storage options not taking effect (#24574)
  • Have log() prioritize the leftmost dtype for its output dtype (#24581)
  • CSV pl.len() was incorrect (#24587)
  • Add support for float inputs for duration types (#24529)
  • Roundtrip empty string through hive partitioning (#24546)
  • Fix potential OOB writes in unaligned IPC read (#24550)
  • Fix regression error when scanning AWS presigned URL (#24530)
  • Make PlPath::join for cloud paths replace on absolute paths (#24514)
  • Correct dtype for cum_agg in streaming engine (#24510)
  • Restore support for np.datetime64() in pl.lit() (#24527)
  • Ignore Iceberg list element ID if missing (#24479)
  • Fix panic on streaming full join with coalesce (#23409)
  • Fix AggState on all_literal in BinaryExpr (#24461)
  • Show IR sort options in explain (#24465)
  • Benchmark CI import (#24463)
  • Fix schema on ApplyExpr with single row literal in agg context (#24422)
  • Fix planner schema for dividing pl.Float32 by int (#24432)
  • Fix panic scanning from AWS legacy global endpoint URL (#24450)
  • Fix iterable_to_pydf(..., infer_schema_length=None) to scan all data (#23405)
  • Do not propagate struct of nulls with null (#24420)
  • Be stricter with invalid NDJSON input when ignore_errors=False (#24404)
  • Implement approx_n_unique for temporal dtypes and Null (#24417)

📖 Documentation

  • Add default parquet compression levels (#24686)
  • Fix syntax error in data-types-and-structures.md (#24606)
  • Rename avg_birthday -> avg_age in examples aggregation (#23726)
  • Update Polars Cloud user guide (#24366)
  • Fix typo in set_expr_depth_warning docstring (#24427)

📦 Build system

  • Python pre-release 1.34.0b5 (#24699)
  • Use cargo-run to call dsl-schema script (#24607)

🛠️ Other improvements

  • Restructure python project directories again (#24676)
  • Use IR for polars-expr output field resolution (#24661)
  • Remove dist/ from release python workflow (#24639)
  • Escape sed ampersand in release script (#24631)
  • Remove Pyodide from release for now (#24630)
  • Fix sed in-place in release script (#24628)
  • Release script pyodide wheel (#24627)
  • Release script pyodide wheel (#24626)
  • Update release script for runtimes (#24610)
  • Remove unused UnknownKind::Ufunc (#24614)
  • Use cargo-run to call dsl-schema script (#24607)
  • Cleanup and prepare to_field for element and struct field context (#24592)
  • Resolve nightly clippy hints (#24593)
  • Rename pl.dependencies to pl._dependencies (#24595)
  • More release scripting (#24582)
  • Again a minor fix for the setup script (#24580)
  • Minor fix in release script (#24579)
  • Correct release python beta version check (#24578)
  • Python dependency failure (#24576)
  • Always install yq (#24570)
  • Deterministic import order for Python Polars package variants (#24531)
  • Check Arrow FFI pointers with an assert (#24564)
  • Add a couple of missing type definitions in python (#24561)
  • Fix quickstart example in Polars Cloud user guide (#24554)
  • Add implementations for loading min/max statistics for Iceberg (#24496)
  • Update versions (#24508)
  • Add additional unit tests for pl.concat (#24487)
  • Refactor parametric tests for as_struct on aggstates (#24493)
  • Use PlanCallback in name.map_* (#24484)
  • Pin xlsxwriter to 3.2.5 or before (#24485)
  • Add dataclass to hold resolved iceberg scan data (#24418)
  • Fix iceberg test failure in CI (#24456)
  • Move CompressionUtils to polars-utils (#24430)
  • Update github template to dispatch to cloud client (#24416)

Thank you to all our contributors for making this release possible! @DeflateAwning, @Gusabary, @JakubValtar, @Kevin-Patyk, @MarcoGorelli, @Matt711, @alexander-beedie, @alonsosilvaallende, @borchero, @c-peters, @camriddell, @coastalwhite, @dangotbanned, @deanm0000, @dongchao-1, @dsprenkels, @eitsupi, @itamarst, @jan-krueger, @joshuamarkovic, @juansolm, @kdn36, @moizescbf, @nameexhaustion, @orlp, @r-brink, @ritchie46 and @stijnherfst