Latest Results
fix: update code for nightly-2026-01-08 toolchain
Bump rust nightly toolchain from nightly-2025-09-03 to nightly-2026-01-08.
This picks up the ArrayChunks::into_remainder() return type change
(rust-lang/rust#149127) while staying before the cargo --timings=html
removal (rust-lang/cargo#16420), which CI still relies on.
Changes to accommodate the new nightly:
- Fix into_remainder() call in daft-minhash to match new return type
(no longer wrapped in Option). This enables distributions to build
with Rust 1.93 stable and newer when RUSTC_BOOTSTRAP is set.
- Upgrade cargo-llvm-cov from 0.7.1 to 0.8.7 to fix corrupt profraw
files due to LLVM profdata format incompatibility
- Fix clippy use_self lint errors by replacing self-referential type
names with Self in DataType, ArrowSchema, ArrowArray,
ArrowArrayStream, Literal, FakeSchema, FakeArray, Value,
PlanJsonConfig, JoinOrderTree, and Error enum/struct definitions
- Allow clippy::use_self on snafu-derived Error variants that use
Arc<Error> (snafu macro generates code with concrete type names,
incompatible with Self): daft-io CachedError, daft-parquet
RemoteFetchFailed
- Allow clippy::result_large_err in daft-io/src/tos.rs (IO error
types are inherently large)
- Rename clippy::only_used_in_recursion allow attributes to
clippy::self_only_used_in_recursion (lint renamed in newer clippy)
- Fix ref_as_ptr lint in daft-recordbatch test by using
std::ptr::from_ref
- Fix useless_vec lint in daft-io test
- Fix derivable_impls lint on PreviewFormat by using #[derive(Default)]
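
A minimal sketch of three of the lint fixes listed above (derivable_impls, use_self, ref_as_ptr). All type and function names here are hypothetical and are not taken from the Daft codebase; the variants of the example enum are invented for illustration.

```rust
// Illustrative names only; none of these types come from the Daft codebase.

// derivable_impls: derive Default and mark the default variant
// instead of hand-writing `impl Default`.
#[derive(Debug, Default, PartialEq)]
pub enum FormatSketch {
    #[default]
    Fancy,
    Plain,
}

// use_self: refer to the enclosing type as `Self` inside its own
// definition and impl blocks.
#[derive(Debug)]
pub enum TreeSketch {
    Leaf(i64),
    Node(Box<Self>, Box<Self>),
}

impl TreeSketch {
    pub fn sum(&self) -> i64 {
        match self {
            Self::Leaf(v) => *v,
            Self::Node(l, r) => l.sum() + r.sum(),
        }
    }
}

// ref_as_ptr: use std::ptr::from_ref rather than `value as *const T`.
pub fn address_of(value: &i64) -> *const i64 {
    std::ptr::from_ref(value)
}
```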
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mikedep333:info_remainder-new-rust
feat: nearest asof joins (#6953)
---
Nearest ASOF Join
Adds strategy="nearest" to join_asof, which matches each left row to the
right row with the minimum absolute difference in the on-key. Ties
prefer the larger (later/forward) value.
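
The matching rule can be pinned down with a brute-force reference. This is only a sketch, assuming integer on-keys and reading "larger" as the larger right-side key. `nearest_reference` is a hypothetical helper, not part of the PR.

```rust
/// Brute-force reference: for each left key, return the right key with the
/// smallest absolute difference; on ties the larger right key wins.
pub fn nearest_reference(left: &[i64], right: &[i64]) -> Vec<Option<i64>> {
    left.iter()
        .map(|&l| {
            right.iter().copied().fold(None, |best: Option<i64>, r| match best {
                None => Some(r),
                Some(b) => {
                    // abs_diff avoids overflow on extreme key values.
                    let (db, dr) = (b.abs_diff(l), r.abs_diff(l));
                    if dr < db || (dr == db && r > b) {
                        Some(r)
                    } else {
                        Some(b)
                    }
                }
            })
        })
        .collect()
}
```

With left keys `[1, 5, 10]` and right keys `[3, 7]`, the key 5 is equidistant from 3 and 7, so the tie rule selects 7.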
---
Changes: Native execution
**Probe**
Previously, we assigned each right row to exactly one left row and relied
on a single directional fill to propagate matches. Nearest can't do this:
when two left rows are equidistant from a right row, assigning it to only
one of them means the other never gets to compare that candidate.
The fix is search_bucket_nearest_range: for each right row it returns a
Range<usize> covering both the floor (last left ≤ right) and ceil (first
left ≥ right). Every right row is offered to every position in the range
via update_nearest_match, which keeps the closer candidate.
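
A rough sketch of the probe step, assuming sorted `i64` left keys within a bucket. The function names mirror the description but the bodies are illustrative guesses, not the PR's implementation.

```rust
use std::ops::Range;

/// Positions in the sorted `left` keys that should see `right_key`:
/// from the floor (last left <= right) through the ceil (first left >= right).
pub fn nearest_range_sketch(left: &[i64], right_key: i64) -> Range<usize> {
    let lo = left.partition_point(|&l| l < right_key); // first left >= right
    let hi = left.partition_point(|&l| l <= right_key); // first left > right
    if lo < hi {
        // Exact matches exist; floor and ceil both fall inside lo..hi.
        lo..hi
    } else {
        // No exact match: floor is lo - 1 (if any), ceil is lo (if any).
        lo.saturating_sub(1)..(lo + 1).min(left.len())
    }
}

/// Keep whichever candidate is closer to `left_key`; ties go to the larger key.
pub fn update_nearest_sketch(best: &mut Option<i64>, left_key: i64, candidate: i64) {
    let replace = match *best {
        None => true,
        Some(b) => {
            let (db, dc) = (b.abs_diff(left_key), candidate.abs_diff(left_key));
            dc < db || (dc == db && candidate > b)
        }
    };
    if replace {
        *best = Some(candidate);
    }
}

/// Offer every right key to every left position in its range.
pub fn probe_sketch(left: &[i64], right: &[i64]) -> Vec<Option<i64>> {
    let mut best = vec![None; left.len()];
    for &r in right {
        for i in nearest_range_sketch(left, r) {
            update_nearest_sketch(&mut best[i], left[i], r);
        }
    }
    best
}
```

For left keys `[10, 20, 30]`, a right key of 15 yields the range `0..2`, so both 10 and 20 get to consider it.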
**Finalize**
After per-worker probe states are merged, nearest_fill resolves
unmatched left rows by running both a forward and backward fill on
copies of global_best, then picking the closer candidate from the two
directions using is_nearer.
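
The finalize step might look roughly like the following, assuming `global_best` is aligned with the sorted left keys and holds `i64` candidates. `closer` stands in for is_nearer; both function bodies are assumptions made for illustration.

```rust
/// Pick the candidate closer to `pivot`; ties go to the larger key.
fn closer(a: Option<i64>, b: Option<i64>, pivot: i64) -> Option<i64> {
    match (a, b) {
        (Some(x), Some(y)) => {
            let (dx, dy) = (x.abs_diff(pivot), y.abs_diff(pivot));
            if dy < dx || (dy == dx && y > x) {
                Some(y)
            } else {
                Some(x)
            }
        }
        (x, y) => x.or(y),
    }
}

pub fn nearest_fill_sketch(left: &[i64], global_best: &[Option<i64>]) -> Vec<Option<i64>> {
    // Forward fill: carry the last seen candidate into later empty slots.
    let mut forward = global_best.to_vec();
    for i in 1..forward.len() {
        if forward[i].is_none() {
            forward[i] = forward[i - 1];
        }
    }
    // Backward fill: carry the next seen candidate into earlier empty slots.
    let mut backward = global_best.to_vec();
    for i in (0..backward.len().saturating_sub(1)).rev() {
        if backward[i].is_none() {
            backward[i] = backward[i + 1];
        }
    }
    // For each left row, keep the closer of the two directions.
    left.iter()
        .zip(forward.into_iter().zip(backward))
        .map(|(&l, (f, b))| closer(f, b, l))
        .collect()
}
```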
**is_nearer**
For each comparison, it dispatches on the Arrow DataType once, downcasts
to the concrete PrimitiveArray<T>, then extracts three plain Rust scalars
(candidate A, candidate B, pivot) and compares |a - pivot| with
|b - pivot|.
We chose this approach because type-matching and computing distances as
plain Rust scalars was faster than going through Arrow compute kernels or
Daft Series/array primitives.
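
The dispatch-then-compare pattern can be sketched without Arrow. Here `KeyColumn` is a hypothetical stand-in for a typed Arrow array, and the comparison is strict, so tie handling is left to the caller.

```rust
/// Stand-in for a typed Arrow column; real code would match on the Arrow
/// DataType and downcast to PrimitiveArray<T>.
pub enum KeyColumn {
    Int64(Vec<i64>),
    Float64(Vec<f64>),
}

/// True if row `a` of `right` is strictly nearer than row `b` to row `pivot`
/// of `left`. Returns false when the two columns have different types.
pub fn is_nearer_sketch(
    right: &KeyColumn,
    a: usize,
    b: usize,
    left: &KeyColumn,
    pivot: usize,
) -> bool {
    match (right, left) {
        (KeyColumn::Int64(r), KeyColumn::Int64(l)) => {
            // Extract three plain scalars, then compare distances.
            let (a, b, p) = (r[a], r[b], l[pivot]);
            // abs_diff avoids overflow on extreme values.
            a.abs_diff(p) < b.abs_diff(p)
        }
        (KeyColumn::Float64(r), KeyColumn::Float64(l)) => {
            let (a, b, p) = (r[a], r[b], l[pivot]);
            (a - p).abs() < (b - p).abs()
        }
        _ => false,
    }
}
```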
---
Changes: Distributed execution
We refactored the carryover computation into
compute_carryovers(descending: bool), which runs a top_n(limit=1) pass
over the right table:
- descending=true picks the per-partition max and propagates it
left→right, giving each partition the closest right row from behind;
- descending=false picks the per-partition min and propagates
right→left, giving each partition the closest right row from ahead.
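
The two carryover passes above can be sketched on in-memory partitions. This assumes the right table is range-partitioned in ascending key order with `i64` keys; the function below is a simplified, sequential stand-in and does not reproduce the actual top_n or async machinery.

```rust
/// For each partition of right keys, return the boundary key carried in from
/// its neighbours. Partitions are assumed to be in ascending key order.
pub fn compute_carryovers_sketch(
    partitions: &[Vec<i64>],
    descending: bool,
) -> Vec<Option<i64>> {
    // top_n(limit = 1): per-partition max when descending, min otherwise.
    let tops: Vec<Option<i64>> = partitions
        .iter()
        .map(|p| {
            if descending {
                p.iter().copied().max()
            } else {
                p.iter().copied().min()
            }
        })
        .collect();

    let n = partitions.len();
    let mut carry = vec![None; n];
    let mut running = None;
    if descending {
        // Propagate left -> right: partition i sees the latest max before it.
        for i in 0..n {
            carry[i] = running;
            if tops[i].is_some() {
                running = tops[i];
            }
        }
    } else {
        // Propagate right -> left: partition i sees the earliest min after it.
        for i in (0..n).rev() {
            carry[i] = running;
            if tops[i].is_some() {
                running = tops[i];
            }
        }
    }
    carry
}
```

Empty partitions are skipped during propagation, so a partition still receives the closest boundary row from the nearest non-empty neighbour.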
For Nearest, both passes run concurrently via tokio::try_join!. Each
partition's local join task then receives its own right data plus one
boundary row from each direction (two carryovers), and the native
nearest join handles the rest.
fix: update code for nightly-2026-01-08 toolchain
Bump rust nightly toolchain from nightly-2025-09-03 to nightly-2026-01-08.
This picks up the ArrayChunks::into_remainder() return type change
(rust-lang/rust#149127) while staying before the cargo --timings=html
removal (rust-lang/cargo#16420), which CI still relies on.
Changes to accommodate the new nightly:
- Fix into_remainder() call in daft-minhash to match new return type
(no longer wrapped in Option). This enables distributions to build
with Rust 1.93 stable and newer when RUSTC_BOOTSTRAP is set.
- Upgrade cargo-llvm-cov from 0.7.1 to 0.8.7 to fix corrupt profraw
files due to LLVM profdata format incompatibility
- Fix clippy use_self lint errors by replacing self-referential type
names with Self in DataType, ArrowSchema, ArrowArray,
ArrowArrayStream, Literal, FakeSchema, FakeArray, Value,
PlanJsonConfig, and Error enum/struct definitions
- Allow clippy::use_self on snafu-derived Error variants that use
Arc<Error> (snafu macro generates code with concrete type names,
incompatible with Self): daft-io CachedError, daft-parquet
RemoteFetchFailed
- Allow clippy::result_large_err in daft-io/src/tos.rs (IO error
types are inherently large)
- Allow clippy::self_only_used_in_recursion on HuggingFace request
method (lint renamed from only_used_in_recursion in newer clippy)
- Fix useless_vec lint in daft-io test
- Fix derivable_impls lint on PreviewFormat by using #[derive(Default)]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mikedep333:info_remainder-new-rust
Latest Branches
0%
0%
euan/optimize-window-fn-2 -1%
mikedep333:info_remainder-new-rust