cut: fix -s flag ignored when delimiter is newline
When using newline as the delimiter (-d $'\n') with -s (only-delimited),
cut should suppress lines that do not contain the delimiter. However,
cut_fields_newline_char_delim() was not checking the only_delimited flag,
so it always printed such input even when -s was specified.
The fix uses read_until() instead of split() to read segments. Unlike
split(), read_until() includes the delimiter in the buffer when found,
allowing us to detect whether a delimiter was actually present or if
we just hit EOF.
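The difference can be illustrated with a minimal sketch. The helper name read_segment is hypothetical and is not taken from the patch; it relies only on std's BufRead::read_until, which keeps the delimiter in the buffer when one is found.

```rust
use std::io::{self, BufRead, Cursor};

/// Hypothetical helper: reads one segment and reports whether it ended
/// with the delimiter. Returns Ok(None) once the input is exhausted.
fn read_segment<R: BufRead>(reader: &mut R, delim: u8) -> io::Result<Option<(Vec<u8>, bool)>> {
    let mut buf = Vec::new();
    // read_until appends bytes up to and including `delim`, or up to EOF.
    let n = reader.read_until(delim, &mut buf)?;
    if n == 0 {
        return Ok(None);
    }
    // If the last byte is the delimiter, it was really present in the input.
    let has_delim = buf.last() == Some(&delim);
    if has_delim {
        buf.pop();
    }
    Ok(Some((buf, has_delim)))
}

fn main() -> io::Result<()> {
    let mut input = Cursor::new(b"a\nb".to_vec());
    while let Some((bytes, has_delim)) = read_segment(&mut input, b'\n')? {
        println!("{:?} delimiter seen: {}", String::from_utf8_lossy(&bytes), has_delim);
    }
    Ok(())
}
```

With BufRead::split() the two segments would look identical, since the delimiter is stripped before the caller sees the bytes.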
This commit:
- Adds only_delimited parameter to cut_fields_newline_char_delim()
- Uses read_until() to detect delimiter presence while reading
- Returns early if no delimiter was found and only_delimited is true
- Adds test case for newline delimiter with -s flag
Fixes #10012
3e15738
6 days ago
by rynewang
+0.03%
cut: simplify newline delimiter handling
- Use read_to_end + split instead of read_until loop
- Use Vec<&[u8]> instead of Vec<Vec<u8>> (no extra allocations)
- Rename found_delimiter to has_delimiter
- Check has_delimiter before removing trailing empty segment
- Only write trailing newline if we output something
- Remove unused BufRead import
- Add tests for edge cases: just newline, empty input
526b5d8
5 days ago
by rynewang
+4.1%
cut: use streaming DelimReader for newline delimiter
Address PR review comments:
- Replace read_to_end() with streaming DelimReader iterator
- DelimReader uses read_until() and tracks delimiter per segment
- Only collect selected fields, not entire input
- Remove b"".as_slice() check (DelimReader doesn't create trailing empty segments)
- Simplify output with split_first() join pattern
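
A rough sketch of how the points above could fit together. This is not the code from the PR: the DelimReader fields, the function name cut_newline_fields, and the `wanted` parameter (1-based field numbers) are assumptions made for illustration.

```rust
use std::io::{self, BufRead, Cursor, Write};

/// Assumed shape of a streaming reader: yields (segment, delimiter_seen).
struct DelimReader<R> {
    reader: R,
    delim: u8,
}

impl<R: BufRead> Iterator for DelimReader<R> {
    type Item = io::Result<(Vec<u8>, bool)>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut buf = Vec::new();
        match self.reader.read_until(self.delim, &mut buf) {
            Ok(0) => None, // EOF: no trailing empty segment is produced
            Ok(_) => {
                let has_delim = buf.last() == Some(&self.delim);
                if has_delim {
                    buf.pop();
                }
                Some(Ok((buf, has_delim)))
            }
            Err(e) => Some(Err(e)),
        }
    }
}

/// Hypothetical driver: `wanted` holds 1-based field numbers.
fn cut_newline_fields<R: BufRead, W: Write>(
    input: R,
    out: &mut W,
    wanted: &[usize],
    only_delimited: bool,
) -> io::Result<()> {
    let mut selected: Vec<Vec<u8>> = Vec::new();
    let segments = DelimReader { reader: input, delim: b'\n' };
    for (idx, segment) in segments.enumerate() {
        let (bytes, has_delim) = segment?;
        if idx == 0 && !has_delim {
            // The whole input contains no delimiter: suppress it under -s,
            // otherwise pass it through unchanged.
            if !only_delimited {
                out.write_all(&bytes)?;
                out.write_all(b"\n")?;
            }
            return Ok(());
        }
        // Keep only the selected fields rather than the entire input.
        if wanted.contains(&(idx + 1)) {
            selected.push(bytes);
        }
    }
    // Join with the delimiter; write a trailing newline only if there is output.
    if let Some((first, rest)) = selected.split_first() {
        out.write_all(first)?;
        for field in rest {
            out.write_all(b"\n")?;
            out.write_all(field)?;
        }
        out.write_all(b"\n")?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let mut out = Vec::new();
    cut_newline_fields(Cursor::new(b"a\nb\nc\n".to_vec()), &mut out, &[1, 3], true)?;
    io::stdout().write_all(&out)
}
```

Because each segment carries its own delimiter flag, the -s decision can be made on the first segment without buffering the rest of the input.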