Eventual-Inc
Daft

Performance History

Latest Results

ignore corrupt files
chenghuichen:ignore_corrupt
7 minutes ago
feat(functions): add string distance/similarity functions (#7068)

## Changes Made

Add four pairwise string distance/similarity functions as pure Rust scalar UDFs:

- `levenshtein_distance` - minimum edit distance (Int64)
- `jaro_similarity` - similarity score 0.0-1.0 (Float64)
- `jaro_winkler_similarity` - Jaro with prefix bonus (Float64)
- `damerau_levenshtein_distance` - Levenshtein + transpositions (Int64)

Follows the existing `hamming_distance_str` pattern. No external dependencies. Exposed via `daft.functions` API and as Expression methods. Null-safe (returns null when either input is null).

## Related Issues

Fixes #6794

## Test Plan

24 pytest test cases in `tests/functions/test_string_distance.py`:

- **Levenshtein** (6 tests): basic edit distance, empty strings, null handling, identical strings, single-char edits (substitution/insertion/deletion), expression method
- **Jaro** (6 tests): identical strings, completely different strings, known reference values (martha/marhta = 0.944444), null handling, empty vs nonempty, expression method
- **Jaro-Winkler** (6 tests): identical strings, prefix bonus >= Jaro invariant, known reference values (martha/marhta = 0.961111), no common prefix (JW == Jaro), null handling, expression method
- **Damerau-Levenshtein** (6 tests): basic transposition, transposition vs standard Levenshtein (ab/ba = 1 vs 2), empty strings, identical strings, null handling, expression method

```
DAFT_RUNNER=native pytest tests/functions/test_string_distance.py -v
24 passed in 0.06s
```

Rust compilation verified:

```
cargo check --workspace  # zero errors
cargo clippy -p daft-functions-utf8 --no-deps  # zero warnings on new code
```

## AI Disclosure

AI-assisted implementation (Claude Opus 4.6).
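
To make the reference values in the test plan concrete, here is a minimal pure-Python sketch of the four metrics. It is not the Rust implementation from the PR and does not call the Daft API. The function names, the `null_safe` helper, the choice of the optimal-string-alignment variant for Damerau-Levenshtein, and the Jaro-Winkler scaling factor of 0.1 with a 4-character prefix cap are all assumptions made for illustration.

```python
from functools import wraps
from typing import Callable, Optional


def null_safe(fn: Callable) -> Callable:
    """Hypothetical helper: return None when either input is None."""
    @wraps(fn)
    def wrapper(a: Optional[str], b: Optional[str]):
        if a is None or b is None:
            return None
        return fn(a, b)
    return wrapper


@null_safe
def levenshtein(a: str, b: str) -> int:
    # Row-by-row dynamic programming over insert/delete/substitute.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


@null_safe
def damerau_levenshtein(a: str, b: str) -> int:
    # Optimal string alignment variant (assumed): Levenshtein plus
    # a unit-cost swap of two adjacent characters.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = a[i - 1] != b[j - 1]
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


@null_safe
def jaro(a: str, b: str) -> float:
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit, b_hit = [False] * len(a), [False] * len(b)
    m = 0
    for i, ca in enumerate(a):
        for j in range(max(0, i - window), min(len(b), i + window + 1)):
            if not b_hit[j] and b[j] == ca:
                a_hit[i] = b_hit[j] = True
                m += 1
                break
    if m == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i, ca in enumerate(a):
        if a_hit[i]:
            while not b_hit[k]:
                k += 1
            out_of_order += ca != b[k]
            k += 1
    t = out_of_order // 2
    return (m / len(a) + m / len(b) + (m - t) / m) / 3


@null_safe
def jaro_winkler(a: str, b: str) -> float:
    j = jaro(a, b)
    prefix = 0
    for ca, cb in zip(a[:4], b[:4]):  # prefix bonus capped at 4 chars (assumed)
        if ca != cb:
            break
        prefix += 1
    return j + prefix * 0.1 * (1.0 - j)  # scaling factor 0.1 (assumed)


print(levenshtein("ab", "ba"), damerau_levenshtein("ab", "ba"))
print(round(jaro("martha", "marhta"), 6), round(jaro_winkler("martha", "marhta"), 6))
```

Under these assumptions the sketch agrees with the values quoted in the test plan: `ab`/`ba` costs 2 edits under plain Levenshtein and 1 once transpositions are allowed, and `martha`/`marhta` scores about 0.944444 (Jaro) and 0.961111 (Jaro-Winkler).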
main
6 hours ago
Merge branch 'main' into issue-2423
Lucas61000:issue-2423
13 hours ago
test: cover untyped-decorator in pytest fixture type:ignore
XuQianJin-Stars:fix/transformers-classifier-tests-hf-429
16 hours ago
fix(parquet): cast map values to explicit schema
jackylee-ch:codex-fix-parquet-map-schema-cast
18 hours ago

Latest Branches

CodSpeed Performance Gauge
0%
feat: add ignore_corrupt_files option to read_parquet, read_csv and read_iceberg#6520
4 days ago
fc2b3ed
chenghuichen:ignore_corrupt
CodSpeed Performance Gauge
-1%
14 hours ago
a2eb798
Lucas61000:issue-2423
CodSpeed Performance Gauge
0%
16 hours ago
06937b6
XuQianJin-Stars:fix/transformers-classifier-tests-hf-429
© 2026 CodSpeed Technology