Eventual-Inc
Daft

Branches performance

Pull requests

WIP: feat: Add BFloat16 data type support backed by Float32 physical storage #5887
last run
12 hours ago
feat: Add BFloat16 data type support backed by Float32 physical storage

This commit introduces native support for the BFloat16 data type in Daft, addressing performance bottlenecks associated with using Python objects for ML tensors.

Key features:

1. **Logical BFloat16 Type**: Adds `DataType.bfloat16()` which behaves logically as a 16-bit brain floating point number.
2. **Float32 Physical Storage**: Uses 32-bit floats for underlying storage. This ensures zero precision loss (as BF16 is a truncated FP32) while leveraging existing vectorized Float32 kernels for high performance.
3. **Seamless Interop**:
   - Supports ingestion from `torch.bfloat16` tensors and `ml_dtypes.bfloat16` numpy arrays.
   - `to_pylist()` reconstructs `torch.bfloat16` tensors (if torch is available) or returns float32 numpy arrays, preserving type fidelity.
   - Integrates with `jaxtyping` for type inference.
4. **Arrow Compatibility**: Implements Arrow Extension Type (`daft.bfloat16`) for serialization and interoperability.

This implementation eliminates the overhead of `DataType.Python` for BF16 data, significantly improving memory usage and processing speed for ML workloads.
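The "BF16 is a truncated FP32" claim can be checked with a small standalone sketch. This is not Daft's implementation: the helper names below are hypothetical, only the Python standard library is used, and narrowing is done by plain truncation (real converters commonly use round-to-nearest-even instead). It illustrates that widening a bfloat16 bit pattern to float32 just appends 16 zero bits, so every non-NaN bfloat16 value survives a round trip through float32 storage unchanged.

```python
import struct


def float32_to_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_float32(bits: int) -> float:
    """Interpret a 32-bit unsigned int as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def bf16_bits_to_float32(bf16_bits: int) -> float:
    """Widen a 16-bit bfloat16 pattern to float32 by appending 16 zero bits."""
    return bits_to_float32((bf16_bits & 0xFFFF) << 16)


def float32_to_bf16_bits(x: float) -> int:
    """Narrow to bfloat16 by truncation: keep only the top 16 bits of the float32."""
    return float32_to_bits(x) >> 16


def is_nan_pattern(bf16_bits: int) -> bool:
    """True if the pattern has an all-ones exponent and a non-zero mantissa."""
    return (bf16_bits & 0x7F80) == 0x7F80 and (bf16_bits & 0x007F) != 0


def roundtrip_is_lossless() -> bool:
    """Check bf16 -> float32 -> bf16 for every non-NaN 16-bit pattern."""
    for pattern in range(1 << 16):
        if is_nan_pattern(pattern):
            continue  # NaN payloads may be canonicalized by the platform; skip them
        if float32_to_bf16_bits(bf16_bits_to_float32(pattern)) != pattern:
            return False
    return True
```

NaN patterns are skipped because their payload bits are not guaranteed to survive conversion through a Python float on every platform; all other patterns, including infinities, signed zeros, and subnormals, are compared bit for bit.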
3 days ago
5f4047e
huleilei:feature/bf16-unified
CodSpeed Performance Gauge
0%
feat(mcap): support topic_start_time_resolver and raw-bytes non-seekable reader
3 days ago
cb6e0b1
Jay-ju:mcap_support_callback
CodSpeed Performance Gauge
0%
perf: Only Serialize Required Cols in Actor UDFs
Signed-off-by: plotor <zhenchao.wang@hotmail.com>
2 days ago
5cc556a
plotor:zhenchao-perf
CodSpeed Performance Gauge
0%
Fix Utf8Array creation in guess_mime_type implementation
Co-authored-by: everettVT <145285237+everettVT@users.noreply.github.com>
2 days ago
2e8808b
copilot/add-guess-mime-type-scalar
CodSpeed Performance Gauge
0%
© 2025 CodSpeed Technology