Polars: rs-0.53.0 Release

Release date: February 9, 2026
Previous version: rs-0.52.0 (released November 3, 2025)
Magnitude: 60,376 Diff Delta
Contributors: 33 total committers
Commits: 435

Top Contributors in rs-0.53.0

nameexhaustion
orlp
alexander-beedie
coastalwhite
ritchie46
kdn36
mcrumiller
guilhem-dvr
wtn
ion-elgreco

Release Notes

πŸ† Highlights

  • Add Extension types (#25322)
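
The new Extension types follow the Arrow convention of layering a logical type name and metadata over an existing physical storage type. As a rough illustration of that idea only (the names below are hypothetical, not the Polars API), a minimal sketch:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ExtensionType:
    """Toy model of an Arrow-style extension type: a logical name plus
    metadata layered over an existing physical storage dtype."""
    name: str                  # logical name, e.g. "my_app.uuid" (hypothetical)
    storage: str               # physical dtype the values are stored as
    metadata: dict = field(default_factory=dict)

# A hypothetical UUID type whose values are stored as 16-byte binaries.
uuid_type = ExtensionType("my_app.uuid", storage="fixed_size_binary[16]")
print(f"{uuid_type.name} stored as {uuid_type.storage}")
```

Readers that do not recognize the logical name can still process the column through its storage type, which is what makes extension types forward-compatible.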

πŸš€ Performance improvements

  • Don't always rechunk on gather of nested types (#26478)
  • Enable zero-copy object_store put upload for IPC sink (#26288)
  • Resolve file schemas and metadata concurrently (#26325)
  • Run elementwise CSE for the streaming engine (#26278)
  • Disable morsel splitting for fast-count on streaming engine (#26245)
  • Implement streaming decompression for scan_ndjson and scan_lines (#26200)
  • Improve string slicing performance (#26206)
  • Refactor scan_delta to use python dataset interface (#26190)
  • Add dedicated kernel for group-by arg_max/arg_min (#26093)
  • Add streaming merge-join (#25964)
  • Generalize Bitmap::new_zeroed opt for Buffer::zeroed (#26142)
  • Reduce fs stat calls in path expansion (#26173)
  • Lower streaming group_by n_unique to unique().len() (#26109)
  • Speed up SQL interface "UNION" clauses (#26039)
  • Speed up SQL interface "ORDER BY" clauses (#26037)
  • Add fast kernel for is_nan and use it for numpy NaN->null conversion (#26034)
  • Optimize ArrayFromIter implementations for ObjectArray (#25712)
  • New streaming NDJSON sink pipeline (#25948)
  • New streaming CSV sink pipeline (#25900)
  • Dispatch partitioned usage of sink_* functions to new-streaming by default (#25910)
  • Replace ryu with faster zmij (#25885)
  • Reduce memory usage for .item() count in grouped first/last (#25787)
  • Skip schema inference if schema provided for scan_csv/ndjson (#25757)
  • Add width-aware chunking to prevent degradation with wide data (#25764)
  • Use new sink pipeline for write/sink_ipc (#25746)
  • Reduce memory usage when scanning multiple parquet files in streaming (#25747)
  • Don't call cluster_with_columns optimization if not needed (#25724)
  • Tune partitioned sink_parquet cloud performance (#25687)
  • New single file IO sink pipeline enabled for sink_parquet (#25670)
  • New partitioned IO sink pipeline enabled for sink_parquet (#25629)
  • Correct overly eager local predicate insertion for unpivot (#25644)
  • Reduce HuggingFace API calls (#25521)
  • Use strong hash instead of traversal for CSPE equality (#25537)
  • Fix panic in is_between support in streaming Parquet predicate push down (#25476)
  • Faster kernels for rle_lengths (#25448)
  • Allow detecting plan sortedness in more cases (#25408)
  • Enable predicate expressions on unsigned integers (#25416)
  • Mark output of more non-order-maintaining ops as unordered (#25419)
  • Fast find start window in group_by_dynamic with large offset (#25376)
  • Add streaming native LazyFrame.group_by_dynamic (#25342)
  • Add streaming sorted Group-By (#25013)
  • Add parquet prefiltering for string regexes (#25381)
  • Use fast path for agg_min/agg_max when nulls present (#25374)
  • Fuse positive slice into streaming LazyFrame.rolling (#25338)
  • Mark Expr.reshape((-1,)) as row separable (#25326)
  • Use bitmap instead of Vec<bool> in first/last w. skip_nulls (#25318)
  • Return references from aexpr_to_leaf_names_iter (#25319)
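
Several of the items above concern the streaming engine; the streaming merge-join (#25964) in particular builds on the classic merge-join idea of advancing two key-sorted inputs in lockstep. A minimal pure-Python sketch of that core loop (illustrative only, not Polars' implementation, which streams morsels rather than whole lists):

```python
def merge_join(left, right):
    """Inner-join two key-sorted lists of (key, value) pairs by advancing
    whichever side has the smaller key, emitting rows on key equality.
    Assumes unique keys per side for brevity."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            out.append((lk, left[i][1], right[j][1]))
            i += 1
            j += 1
    return out

pairs = merge_join([(1, "a"), (3, "b"), (5, "c")],
                   [(3, "x"), (4, "y"), (5, "z")])
# pairs == [(3, "b", "x"), (5, "c", "z")]
```

Because neither input is materialized into a hash table, the join can run in bounded memory when both sides arrive pre-sorted, which is exactly the property a streaming engine wants.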

✨ Enhancements

  • Add primitive filter -> agg lowering in streaming GroupBy (#26459)
  • Support for the SQL FETCH clause (#26449)
  • Add get() to retrieve a byte from binary data (#26454)
  • Remove with_context in SQL lowering (#26416)
  • Avoid OOM for scan_ndjson and scan_lines if input is compressed and negative slice (#26396)
  • Add JoinBuildSide (#26403)
  • Support anonymous agg in-mem (#26376)
  • Add unstable arrow_schema parameter to sink_parquet (#26323)
  • Improve error message formatting for structs (#26349)
  • Remove parquet field overwrites (#26236)
  • Enable zero-copy object_store put upload for IPC sink (#26288)
  • Improved disambiguation for qualified wildcard columns in SQL projections (#26301)
  • Expose upload_concurrency through env var (#26263)
  • Allow quantile to compute multiple quantiles at once (#25516)
  • Allow empty LazyFrame in LazyFrame.group_by(...).map_groups (#26275)
  • Use delta file statistics for batch predicate pushdown (#26242)
  • Add streaming UnorderedUnion (#26240)
  • Implement compression support for sink_ndjson (#26212)
  • Add unstable record batch statistics flags to {sink/scan}_ipc (#26254)
  • Cloud retry/backoff configuration via storage_options (#26204)
  • Use same sort order for expanded paths across local / cloud / directory / glob (#26191)
  • Expose physical plan NodeStyle (#26184)
  • Add streaming merge-join (#25964)
  • Serialize optimization flags for cloud plan (#26168)
  • Add compression support to write_csv and sink_csv (#26111)
  • Add scan_lines (#26112)
  • Support regex in str.split (#26060)
  • Add unstable IPC Statistics read/write to scan_ipc/sink_ipc (#26079)
  • Add nulls support for all rolling_by operations (#26081)
  • ArrowStreamExportable and sink_delta (#25994)
  • Release musl builds (#25894)
  • Implement streaming decompression for CSV COUNT(*) fast path (#25988)
  • Add nulls support for rolling_mean_by (#25917)
  • Add lazy collect_all (#25991)
  • Add streaming decompression for NDJSON schema inference (#25992)
  • Improved handling of unqualified SQL JOIN columns that are ambiguous (#25761)
  • Expose record batch size in {sink,write}_ipc (#25958)
  • Add null_on_oob parameter to expr.get (#25957)
  • Suggest correct timezone if timezone validation fails (#25937)
  • Support streaming IPC scan from S3 object store (#25868)
  • Implement streaming CSV schema inference (#25911)
  • Support hashing of meta expressions (#25916)
  • Improve SQLContext recognition of possible table objects in the Python globals (#25749)
  • Add pl.Expr.(min|max)_by (#25905)
  • Improve MemSlice Debug impl (#25913)
  • Implement or fix json encode/decode for (U)Int128, Categorical, Enum, Decimal (#25896)
  • Expand scatter to more dtypes (#25874)
  • Implement streaming CSV decompression (#25842)
  • Add Series sql method for API consistency (#25792)
  • Mark Polars as safe for free-threading (#25677)
  • Support Binary and Decimal in arg_(min|max) (#25839)
  • Allow Decimal parsing in str.json_decode (#25797)
  • Add shift support for Object data type (#25769)
  • Add node status to NodeMetrics (#25760)
  • Allow scientific notation when parsing Decimals (#25711)
  • Allow creation of Object literal (#25690)
  • Don't collect schema in SQL union processing (#25675)
  • Add bin.slice(), bin.head(), and bin.tail() methods (#25647)
  • Add SQL support for the QUALIFY clause (#25652)
  • New partitioned IO sink pipeline enabled for sink_parquet (#25629)
  • Add SQL syntax support for CROSS JOIN UNNEST(col) (#25623)
  • Add separate env var to log tracked metrics (#25586)
  • Expose fields for generating physical plan visualization data (#25562)
  • Allow pl.Object in pivot value (#25533)
  • Extend SQL UNNEST support to handle multiple array expressions (#25418)
  • Minor improvement for as_struct repr (#25529)
  • Temporal quantile in rolling context (#25479)
  • Add support for Float16 dtype (#25185)
  • Add strict parameter to pl.concat(how='horizontal') (#25452)
  • Add leftmost option to str.replace_many / str.find_many / str.extract_many (#25398)
  • Add quantile for missing temporals (#25464)
  • Expose and document pl.Categories (#25443)
  • Support decimals in search_sorted (#25450)
  • Use reference to Graph pipes when flushing metrics (#25442)
  • Add SQL support for named WINDOW references (#25400)
  • Add Extension types (#25322)
  • Add having to group_by context (#23550)
  • Allow elementwise Expr.over in aggregation context (#25402)
  • Add SQL support for ROW_NUMBER, RANK, and DENSE_RANK functions (#25409)
  • Automatically Parquet dictionary encode floats (#25387)
  • Add empty_as_null and keep_nulls to {Lazy,Data}Frame.explode (#25369)
  • Allow hash for all List dtypes (#25372)
  • Support unique_counts for all datatypes (#25379)
  • Add maintain_order to Expr.mode (#25377)
  • Display function of streaming physical plan map node (#25368)
  • Allow slice on scalar in aggregation context (#25358)
  • Allow implode and aggregation in aggregation context (#25357)
  • Add empty_as_null and keep_nulls flags to Expr.explode (#25289)
  • Add ignore_nulls to first / last (#25105)
  • Move GraphMetrics into StreamingQuery (#25310)
  • Allow Expr.unique on List/Array with non-numeric types (#25285)
  • Allow Expr.rolling in aggregation contexts (#25258)
  • Support additional forms of SQL CREATE TABLE statements (#25191)
  • Add LazyFrame.pivot (#25016)
  • Support column-positional SQL UNION operations (#25183)
  • Allow arbitrary expressions as the Expr.rolling index_column (#25117)
  • Allow arbitrary Expressions in "subset" parameter of unique frame method (#25099)
  • Support arbitrary expressions in SQL JOIN constraints (#25132)
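
To illustrate why computing multiple quantiles at once (#25516) pays off: all requested quantiles can share a single sort of the data rather than sorting once per quantile. A plain-Python sketch using the common linear-interpolation definition (illustrative only, not Polars' kernel):

```python
def quantiles(values, qs):
    """Compute several quantiles from one shared sort, interpolating
    linearly between the two nearest order statistics."""
    s = sorted(values)          # the sort is paid for once, reused per quantile
    n = len(s)
    out = []
    for q in qs:
        pos = q * (n - 1)       # fractional rank of the requested quantile
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(s[lo] * (1 - frac) + s[hi] * frac)
    return out

print(quantiles([4, 1, 3, 2], [0.0, 0.5, 1.0]))  # → [1, 2.5, 4]
```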

🐞 Bug fixes

  • Do not overwrite used names in cluster_with_columns pushdown (#26467)
  • Do not mark output of concat_str on multiple inputs as sorted (#26468)
  • Fix CSV schema inference content line duplication bug (#26452)
  • Fix InvalidOperationError using scan_delta with filter (#26448)
  • Alias giving missing column after streaming GroupBy CSE (#26447)
  • Ensure by_name selector selects only names (#26437)
  • Restore compatibility of strings written to parquet with pyarrow filter (#26436)
  • Update schema in cluster_with_columns optimization (#26430)
  • Fix negative slice in groups slicing (#26442)
  • Don't run CPU check on aarch64 musl (#26439)
  • Remove the POLARS_IDEAL_MORSEL_SIZE monkeypatching in the parametric merge-join test (#26418)
  • Correct off-by-one in RLE row counting for nullable dictionary-encoded columns (#26411)
  • Support very large integers in env var limits (#26399)
  • Fix PlPath panic from incorrect slicing of UTF8 boundaries (#26389)
  • Fix Float dtype for spearman correlation (#26392)
  • Fix optimizer panic in right joins with type coercion (#26365)
  • Don't serialize retry config from local environment vars (#26289)
  • Fix PartitionBy with scalar key expressions and diff() (#26370)
  • Add {Float16, Float32} -> Float32 lossless upcast (#26373)
  • Fix panic using with_columns and collect_all (#26366)
  • Add multi-page support for writing dictionary-encoded Parquet columns (#26360)
  • Ensure slice advancement when skipping non-inlinable values in is_in with inlinable needles (#26361)
  • Pin xlsx2csv version temporarily (#26352)
  • Bugs in ViewArray total_bytes_len (#26328)
  • Overflow in i128::abs in Decimal fits check (#26341)
  • Make Expr.hash on Categorical mapping-independent (#26340)
  • Clone shared GroupBy node before mutation in physical plan creation (#26327)
  • Fix lazy evaluation of replace_strict by making it fallible (#26267)
  • Consider the "current location" of an item when computing rolling_rank_by (#26287)
  • Reset is_count_star flag between queries in collect_all (#26256)
  • Fix incorrect is_between filter on scan_parquet (#26284)
  • Lower AnonymousStreamingAgg in group-by as aggregate (#26258)
  • Avoid overflow in pl.duration scalar arguments case (#26213)
  • Broadcast arr.get on single array with multiple indices (#26219)
  • Fix panic on CSPE with sorts (#26231)
  • Fix UB in DataFrame::transpose_from_dtype (#26203)
  • Eager DataFrame.slice with negative offset and length=None (#26215)
  • Use correct schema side for streaming merge join lowering (#26218)
  • Implement expression keys for merge-join (#26202)
  • Overflow panic in scan_csv with multiple files and skip_rows + n_rows larger than total row count (#26128)
  • Respect allow_object flag after cache (#26196)
  • Raise error on non-elementwise PartitionBy keys (#26194)
  • Allow ordered categorical dictionary in scan_parquet (#26180)
  • Allow excess bytes on IPC bitmap compressed length (#26176)
  • Address buggy quadratic scaling fix in scan_csv (#26175)
  • Address a macOS-specific compile issue (#26172)
  • Fix deadlock on hash_rows() of 0-width DataFrame (#26154)
  • Fix NameError filtering pyarrow dataset (#26166)
  • Fix concat_arr panic when using categoricals/enums (#26146)
  • Fix NDJSON/scan_lines negative slice splitting with extremely long lines (#26132)
  • Incorrect group_by min/max fast path (#26139)
  • Remove a source of non-determinism from lowering (#26137)
  • Error when with_row_index or unpivot create duplicate columns on a LazyFrame (#26107)
  • Panics on shift with head (#26099)
  • Optimize slicing support on compressed IPC (#26071)
  • CPU check for musl builds (#26076)
  • Fix slicing on compressed IPC (#26066)
  • Release GIL on collect_batches (#26033)
  • Missing buffer update in String is_in Parquet pushdown (#26019)
  • Make struct.with_fields data model coherent (#25610)
  • Incorrect output order for order sensitive operations after join_asof (#25990)
  • Use SeriesExport for pyo3-polars FFI (#26000)
  • Don't write Parquet min/max statistics for i128 (#25986)
  • Ensure chunk consistency in in-memory join (#25979)
  • Fix varying block metadata length in IPC reader (#25975)
  • Implement collect_batches properly in Rust (#25918)
  • Fix panic on arithmetic with bools in list (#25898)
  • Convert to index type with strict cast in some places (#25912)
  • Empty dataframe in streaming non-strict hconcat (#25903)
  • Infer large u64 in json as i128 (#25904)
  • Set http client timeouts to 10 minutes (#25902)
  • Prevent panic when comparing Date with Duration types (#25856)
  • Correct lexicographic ordering for Parquet BYTE_ARRAY statistics (#25886)
  • Raise error on duplicate group_by names in upsample() (#25811)
  • Correctly export view buffer sizes nested in Extension types (#25853)
  • Fix DataFrame.estimated_size not handling overlapping chunks correctly (#25775)
  • Ensure Kahan sum does not introduce NaN from infinities (#25850)
  • Trim excess bytes in parquet decode (#25829)
  • Reshape checks size to match exactly (#25571)
  • Fix panic/deadlock sinking parquet with rows larger than 64MB estimated size (#25836)
  • Fix quantile midpoint interpolation (#25824)
  • Don't use cast when converting from physical in list.get (#25831)
  • Invalid null count on int -> categorical cast (#25816)
  • Update groups in list.eval (#25826)
  • Use downcast before FFI conversion in PythonScan (#25815)
  • Double-counting of row metrics (#25810)
  • Cast nulls to expected type in streaming union node (#25802)
  • Incorrect slice pushdown into map_groups (#25809)
  • Fix panic writing parquet with single bool column (#25807)
  • Fix upsample with group_by incorrectly introduced NULLs on group key columns (#25794)
  • Panic in top_k pruning (#25798)
  • Fix documentation for new() (#25791)
  • Fix incorrect collect_schema for unpivot followed by join (#25782)
  • Fix documentation for tail() (#25784)
  • Verify arr namespace is called from array column (#25650)
  • Ensure LazyFrame.serialize() unchanged after collect_schema() (#25780)
  • Function map_(rows|elements) with return_dtype = pl.Object (#25753)
  • Avoid visiting nodes multiple times in PhysicalPlanVisualizationDataGenerator (#25737)
  • Fix incorrect cargo sub-feature (#25738)
  • Fix deadlock on empty scan IR (#25716)
  • Don't invalidate node in cluster-with-columns (#25714)
  • Move boto3 extra from s3fs in dev requirements (#25667)
  • Binary slice methods missing from Series and docs (#25683)
  • Mix-up of variable_name/value_name in unpivot (#25685)
  • Invalid usage of drop_first in to_dummies when nulls present (#25435)
  • Rechunk on nested dtypes in take_unchecked_impl parallel path (#25662)
  • New single file IO sink pipeline enabled for sink_parquet (#25670)
  • Fix streaming SchemaMismatch panic on list.drop_nulls (#25661)
  • Correct overly eager local predicate insertion for unpivot (#25644)
  • Fix "dtype is unknown" panic in cross joins with literals (#25658)
  • Fix panic on Boolean rolling_sum calculation for list or array eval (#25660)
  • Preserve List inner dtype during chunked take operations (#25634)
  • Fix panic edge-case when scanning hive partitioned data (#25656)
  • Fix lifetime for AmortSeries lazy group iterator (#25620)
  • Improve SQL GROUP BY and ORDER BY expression resolution, handling aliasing edge-cases (#25637)
  • Fix empty format handling (#25638)
  • Prevent false positives in is_in for large integers (#25608)
  • Optimize projection pushdown through HConcat (#25371)
  • Differentiate between empty list and no list for unpivot (#25597)
  • Properly resolve HAVING clause during SQL GROUP BY operations (#25615)
  • Fix spearman panicking on nulls (#25619)
  • Increase precision when constructing float Series (#25323)
  • Make sum on strings error in group_by context (#25456)
  • Hang in multi-chunk DataFrame .rows() (#25582)
  • Bug in boolean unique_counts (#25587)
  • Set Float16 parquet schema type to Float16 (#25578)
  • Correct arr_to_any_value for object arrays (#25581)
  • Have PySeries::new_f16 receive pf16s instead of f32s (#25579)
  • Fix occurrence of exact matches of .join_asof(strategy="nearest", allow_exact_matches=False, ...) (#25506)
  • Raise error on out-of-range dates in temporal operations (#25471)
  • Fix incorrect .list.eval after slicing operations (#25540)
  • Reduce HuggingFace API calls (#25521)
  • Strict conversion AnyValue to Struct (#25536)
  • Fix panic in is_between support in streaming Parquet predicate push down (#25476)
  • Always respect return_dtype in map_elements and map_rows (#25504)
  • Rolling mean/median for temporals (#25512)
  • Add .rolling_rank() support for temporal types and pl.Boolean (#25509)
  • Fix dictionary replacement error in write_ipc() (#25497)
  • Fix group lengths check in sort_by with AggregatedScalar (#25503)
  • Fix expr slice pushdown causing shape error on literals (#25485)
  • Allow empty list in sort_by in list.eval context (#25481)
  • Prevent panic when joining sorted LazyFrame with itself (#25453)
  • Apply CSV dict overrides by name only (#25436)
  • Incorrect result in aggregated first/last with ignore_nulls (#25414)
  • Fix off-by-one bug in ColumnPredicates generation for inequalities operating on integer columns (#25412)
  • Fix arr.{eval,agg} in aggregation context (#25390)
  • Support AggregatedList in list.{eval,agg} context (#25385)
  • Improve SQL UNNEST behaviour (#22546)
  • Remove ClosableFile (#25330)
  • Use Cargo.template.toml to prevent git dependencies from using template (#25392)
  • Resolve edge-case with SQL aggregates that have the same name as one of the GROUP BY keys (#25362)
  • Revert pl.format behavior with nulls (#25370)
  • Remove Expr casts in pl.lit invocations (#25373)
  • Nested dtypes in streaming first_non_null/last_non_null (#25375)
  • Correct eq_missing for struct with nulls (#25363)
  • Unique on literal in aggregation context (#25359)
  • Allow implode and aggregation in aggregation context (#25357)
  • Aggregation with drop_nulls on literal (#25356)
  • Address multiple issues with SQL OVER clause behaviour for window functions (#25249)
  • Schema mismatch with list.agg, unique and scalar (#25348)
  • Correct drop_items for scalar input (#25351)
  • SQL NATURAL joins should coalesce the key columns (#25353)
  • Mark {forward,backward}_fill as length_preserving (#25352)
  • Nested dtypes in streaming first/last (#25298)
  • AnyValue::to_physical for categoricals (#25341)
  • Fix link errors reported by markdown-link-check (#25314)
  • Parquet is_in for mixed validity pages (#25313)
  • Fix length preserving check for eval expressions in streaming engine (#25294)
  • Fix building polars-plan with features lazy,concat_str (but no strings) (#25306)
  • Fix building polars-mem-engine with the async feature (#25300)
  • Don't quietly allow unsupported SQL SELECT clauses (#25282)
  • Fix small bug with PyExpr to PyObject conversion (#25265)
  • Reverse on chunked struct (#25281)
  • Panic exception when calling Expr.rolling in .over (#25283)
  • Correct {first,last}_non_null if there are empty chunks (#25279)
  • Incorrect results for aggregated {n_,}unique on bools (#25275)
  • Fix building polars-expr without timezones feature (#25254)
  • Ensure out-of-range integers and other edge case values don't give wrong results for index_of() (#24369)
  • Correctly prune projected columns in hints (#25250)
  • Allow Null dtype values in scatter (#25245)
  • Correctly handle requested stops in streaming shift (#25239)
  • Make str.json_decode output deterministic with lists (#25240)
  • Wide-table join performance regression (#25222)
  • Fix single-column CSV header duplication with leading empty lines (#25186)
  • Enhanced column resolution/tracking through multi-way SQL joins (#25181)
  • Fix serialization of lazyframes containing huge tables (#25190)
  • Use (i64, u64) for VisualizationData (offset, length) slices (#25203)
  • Fix assertion panic on group_by (#25179)
  • Fix format_str in case of multiple chunks (#25162)
  • Fix incorrect drop_nans() result when used in group_by() / over() (#25146)
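
One fix above (#25850) touches Kahan compensated summation, where naively compensating an infinite term produces inf - inf = NaN. A hedged sketch of the technique and such a guard (illustrative only, not the Polars kernel):

```python
import math

def kahan_sum(xs):
    """Compensated summation: track the low-order bits lost at each add.
    Once the running total is infinite, fall back to plain addition so
    that the compensation term cannot poison the result with NaN."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for x in xs:
        if math.isinf(total):
            total += x  # plain add; compensation is meaningless here
            continue
        y = x - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

print(kahan_sum([1e16, 1.0, 1.0, -1e16]))  # → 2.0 (a naive sum gives 0.0)
```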

πŸ“– Documentation

  • Fix typo in max_by docstring (#26404)
  • Remove deprecated cublet_id (#26260)
  • Update for new release (#26255)
  • Update MCP server section with new URL (#26241)
  • Fix unmatched paren and punctuation in pandas migration guide (#26251)
  • Add observatory database_path to docs (#26201)
  • Note plugins in Python user-defined functions (#26138)
  • Clarify min_by/max_by behavior on ties (#26077)
  • Add QUALIFY clause and SUBSTRING function to the SQL docs (#25779)
  • Update mixed-offset datetime parsing example in user guide (#25915)
  • Update bare-metal docs for mounted anonymous results (#25801)
  • Fix credential parameter name in cloud-storage.py (#25788)
  • Configuration options update (#25756)
  • Fix typos in Excel and Pandas migration guides (#25709)
  • Add "right" to how options in join() docstrings (#25678)
  • Document schema parameter in meta methods (#25543)
  • Correct link to datetime_range instead of date_range in resampling page (#25532)
  • Explain aggregation & sorting of lists (#25260)
  • Update LazyFrame.collect_schema() docstring (#25508)
  • Remove lzo from parquet write options (#25522)
  • Update on-premise documentation (#25489)
  • Fix incorrect 'bitwise' in any_horizontal/all_horizontal docstring (#25469)
  • Add Extension and BaseExtension to doc index (#25444)
  • Add polars-on-premise documentation (#25431)
  • Fix link errors reported by markdown-link-check (#25314)
  • Fix LanceDB URL (#25198)

πŸ“¦ Build system

  • Address remaining Python 3.14 issues with make requirements-all (#26195)
  • Address a macOS-specific compile issue (#26172)
  • Fix make fmt and make lint commands (#25200)

πŸ› οΈ Other improvements

  • Move IO source metrics instrumentation to PolarsObjectStore (#26414)
  • More SQL to IR conversion execute_isolated (#26455)
  • Cleanup unused attributes in optimizer (#26464)
  • Use Expr::Display as catch all for IR - DSL asymmetry (#26471)
  • Remove the POLARS_IDEAL_MORSEL_SIZE monkeypatching in the parametric merge-join test (#26418)
  • Move IO metrics struct to polars-io and use new timer (#26397)
  • Reduce blocking on computational executor threads in multiscan init (#26407)
  • Cleanup the parametric merge-join test (#26413)
  • Ensure local doctests skip from_torch if module not installed (#26405)
  • Implement various deprecations (#26314)
  • Refactor MinBy and MaxBy as IRFunctions (#26307)
  • Rename Operator::Divide to RustDivide (#26339)
  • Properly disable the Pyodide tests (#26382)
  • Add LiveTimer (#26384)
  • Use derived serialization on PlRefPath (#26167)
  • Add metadata to ArrowSchema struct (#26318)
  • Remove unused field (#26367)
  • Fix runtime nesting (#26359)
  • Remove xlsx2csv dependency pin (#26355)
  • Allow unchecked IPC reads (#26354)
  • Use outer runtime if exists in to_alp (#26353)
  • Make CategoricalMapping::new pub(crate) to avoid misuse (#26308)
  • Clarify IPC buffer read limit/length parameter (#26334)
  • Improve accuracy of active IO time metric (#26315)
  • Mark VarState as repr(C) (#26309)
  • IO metrics for streaming Parquet / IPC sources (#26300)
  • Replace panicking index access with error handling in dictionaries_to_encode (#26059)
  • Remove unnecessary match and move early return in testing (#26297)
  • Add dtype test coverage for delta predicate filter (#26291)
  • Add property-based tests for Scalar::cast_with_options (#25744)
  • Add AI policy (#26286)
  • Remove MemSlice (#26259)
  • Remove recursion from upsample_impl (#26250)
  • Remove all non CSV fast-count paths (#26233)
  • Replace MemReader with Cursor (#26216)
  • Add serde(default) to new CSV compression fields (#26210)
  • Add a couple of SAFETY comments in merge-join node (#26197)
  • Expose physical plan NodeStyle (#26184)
  • Ensure optimization flag modification happens locally (#26185)
  • Use NullChunked as default for Series (#26181)
  • In merge-sorted node, when buffering, request a stop on *every* unbuffered morsel (#26178)
  • Rename io_sinks2 -> io_sinks (#26159)
  • Lint leftover fixme (#26122)
  • Move Buffer and SharedStorage to polars-buffer crate (#26113)
  • Remove old sink IR (#26130)
  • Use derived serialization on PlRefPath (#26062)
  • Improve backtrace for POLARS_PANIC_ON_ERR (#26125)
  • Fix Python docs build (#26117)
  • Remove old streaming sink implementation (#26102)
  • Disable unused-ignore mypy lint (#26110)
  • Remove unused equality impls for IR / FunctionIR (#26106)
  • Ignore mypy warning (#26105)
  • Preserve order for string concatenation (#26101)
  • Raise error on file://hostname/path (#26061)
  • Disable debug info for docs workflow (#26086)
  • Remove IR / physical plan visualization data generators (#26090)
  • Update docs for next polars cloud release (#26091)
  • Support Python 3.14 in dev environment (#26073)
  • Mark top slow normal tests as slow (#26080)
  • Simplify PlPath (#26053)
  • Update breaking deps (#26055)
  • Fix for upstream url bug and update deps (#26052)
  • Properly pin chrono (#26051)
  • Don't run rust doctests (#26046)
  • Update deps (#26042)
  • Ignore very slow test (#26041)
  • Add Send bound for SharedStorage owner (#26040)
  • Update rust compiler (#26017)
  • Improve csv test coverage (#25980)
  • Use from_any_values_and_dtype in Series::extend_constant (#26006)
  • Pass sync_on_close and num_pipelines via start_file_writer for IO sinks (#25950)
  • Add broadcast_nulls field to RowEncodingVariant and _get_rows_encoded_{ca,arr} (#26001)
  • Ramp up CSV read size (#25997)
  • Rename FileType to FileWriteFormat (#25951)
  • Don't unwrap first sink morsel send (#25981)
  • Update ruff action and simplify version handling (#25940)
  • Cleanup Rust DataFrame interface (#25976)
  • Export PhysNode related struct (#25987)
  • Restructure Sort variant in logical and physical plans visualization data (#25978)
  • Run python lint target as part of pre-commit (#25982)
  • Allow multiple inputs to streaming GroupBy node (#25961)
  • Disable HTTP timeout for receiving response body (#25970)
  • Add AI contribution policy (#25956)
  • Remove unused sink code (#25949)
  • Add detailed Sink info to IRNodeProperties (#25954)
  • Wrap FileScanIR::Csv enum variant in Arc (#25952)
  • Use PlSmallStr for CSV format strings (#25901)
  • Add unsafe bound to MemSlice::from_arc (#25920)
  • Improve MemSlice Debug impl (#25913)
  • Remove manual cmp impls for &[u8] (#25890)
  • Remove and deprecate batched csv reader (#25884)
  • Remove unused AnonymousScan functions (#25872)
  • Use Buffer<T> instead of Arc<[T]> to store stringview buffers (#25870)
  • Add TakeableRowsProvider for IO sinks (#25858)
  • Filter DeprecationWarning from pyparsing indirectly through pyiceberg (#25854)
  • Various small improvements (#25835)
  • Clear venv with appropriate version of Python (#25851)
  • Move CSV write logic to CsvSerializer (#25828)
  • Ensure Polars Object extension type is registered (#25813)
  • Harden Python object process ID (#25812)
  • Skip schema inference if schema provided for scan_csv/ndjson (#25757)
  • Ensure proper async connection cleanup on DB test exit (#25766)
  • Flip has_residual_predicate -> no_residual_predicate (#25755)
  • Track original length before file filtering in scan IR (#25717)
  • Ensure we uninstall other Polars runtimes in CI (#25739)
  • Make 'make requirements' more robust (#25693)
  • Remove duplicate compression level types (#25723)
  • Replace async blocks with named components in new parquet write pipeline (#25695)
  • Move Object lit fix earlier in the function (#25713)
  • Remove unused decimal file (#25701)
  • Move boto3 extra from s3fs in dev requirements (#25667)
  • Upgrade to latest version of sqlparser-rs (#25673)
  • Update slab to version without RUSTSEC (#25686)
  • Fix typo (#25684)
  • Avoid rechunk requirement for Series.iter() (#25603)
  • Use dtype for group_aware evaluation on ApplyExpr (#25639)
  • Make polars-plan constants more consistent (#25645)
  • Add "panic" and "streaming" tagging to issue-labeler workflow (#25657)
  • Add support for multi-column reductions (#25640)
  • Fix rolling kernel dispatch with monotonic group attribute (#25494)
  • Simplify _write_any_value (#25622)
  • Ensure we hash all attributes and visit all children in traverse_and_hash_aexpr (#25627)
  • Ensure literal-only SELECT broadcast conforms to SQL semantics (#25633)
  • Add parquet file write pipeline for new IO sinks (#25618)
  • Rename polars-on-premise to polars-on-premises (#25617)
  • Constrain new issue-labeler workflow to the Issue title (#25614)
  • Add streaming IO sink components (#25594)
  • Help categorise Issues by automatically applying labels (using the same patterns used for labelling PRs) (#25599)
  • Show on streaming engine (#25589)
  • Add arg_sort() and Writeable::as_buffered() (#25583)
  • Take task priority argument in parallelize_first_to_local (#25563)
  • Skip existing files in pypi upload (#25576)
  • Fix template path in release-python workflow (#25565)
  • Skip rust integration tests for coverage in CI (#25558)
  • Add asserts and tests for list.eval on multiple chunks with slicing (#25559)
  • Rename URL_ENCODE_CHARSET to HIVE_ENCODE_CHARSET (#25554)
  • Add assert_sql_matches coverage for SQL DISTINCT and DISTINCT ON syntax (#25440)
  • Use strong hash instead of traversal for CSPE equality (#25537)
  • Update partitioned sink IR (#25524)
  • Print expected DSL schema hashes if mismatched (#25526)
  • Remove verbose prints on file opens (#25523)
  • Add proptest AnyValue strategies (#25510)
  • Fix --uv argument for benchmark-remote (#25513)
  • Add proptest DataFrame strategy (#25446)
  • Run maturin with --uv option (#25490)
  • Remove some dead argminmax impl code (#25501)
  • Fix feature gating TZ_AWARE_RE again (#25493)
  • Take sync parameter in Writeable::close() (#25475)
  • Fix unsoundness in ChunkedArray::{first, last} (#25449)
  • Add some cleanup (#25445)
  • Test for group_by(...).having(...) (#25430)
  • Accept multiple files in pipe_with_schema (#25388)
  • Remove aggregation context Context (#25424)
  • Take &dyn Any instead of Box<dyn Any> in python object converters (#25421)
  • Refactor sink IR (#25308)
  • Remove ClosableFile (#25330)
  • Remove debug file write from test suite (#25393)
  • Add ElementExpr for _eval expressions (#25199)
  • Dispatch Series.set to zip_with_same_dtype (#25327)
  • Better coverage for group_by aggregations (#25290)
  • Add oneshot channel to polars-stream (#25378)
  • Enable more streaming tests (#25364)
  • Remove Column::Partitioned (#25324)
  • Remove incorrect cast in reduce code (#25321)
  • Add toolchain file to runtimes for sdist (#25311)
  • Remove PyPartitioning (#25303)
  • Directly take CloudScheme in parse_cloud_options() (#25304)
  • Refactor dt_range functions (#25225)
  • Fix typo in CI release workflow (#25309)
  • Use dedicated runtime packages from template (#25284)
  • Add proptest strategies for Series nested types (#25220)
  • Simplify sink parameter passing from Python (#25302)
  • Add test for unique with column subset (#25241)
  • Fix Decimal precision annotation (#25227)
  • Add LazyFrame.pivot (#25016)
  • Clean up CSPE callsite (#25215)
  • Avoid relabelling changes-dsl on every commit (#25216)
  • Move ewm variance code to polars-compute (#25188)
  • Upgrade to schemars 0.9.0 (#25158)
  • Update markdown link checker (#25201)
  • Automatically label pull requests that change the DSL (#25177)
  • Add reliable test for pl.format on multiple chunks (#25164)
  • Move supertype determination and casting to IR for date_range and related functions (#24084)
  • Make python docs build again (#25165)
  • Make pipe_with_schema work on Arced schema (#25155)
  • Add functions for scan_lines (#25136)
  • Remove lower_ir conversion from Scan to InMemorySource (#25150)
  • Update versions (#25141)

Thank you to all our contributors for making this release possible! @AndreaBozzo, @Atarust, @DannyStoll1, @EndPositive, @JakubValtar, @Jesse-Bakker, @Kevin-Patyk, @LeeviLindgren, @MarcoGorelli, @Matt711, @MrAttoAttoAtto, @TNieuwdorp, @Voultapher, @WaffleLapkin, @agossard, @alex-gregory-ds, @alexander-beedie, @anosrepenilno, @arlyon, @azimafroozeh, @bayoumi17m, @borchero, @c-peters, @cBournhonesque, @camriddell, @carnarez, @cmdlineluser, @coastalwhite, @cr7pt0gr4ph7, @davanstrien, @davidia, @dependabot[bot], @dsprenkels, @edizeqiri, @eitanf, @etiennebacher, @feliblo, @gab23r, @guilhem-dvr, @hallmason17, @hamdanal, @henryharbeck, @hutch3232, @ion-elgreco, @itamarst, @jamesfricker, @jannickj, @jetuk, @jqnatividad, @kdn36, @lun3x, @m1guelperez, @marinegor, @mcrumiller, @nameexhaustion, @orlp, @pomo-mondreganto, @qxzcode, @r-brink, @ritchie46, @sachinn854, @stijnherfst, @sweb, @tlauli, @vyasr, @wtn and @yonikremer