jvdd
argminmax

:construction: POC - support NaNs for SSE & AVX2 f32

#18 Closed
Comparing NaNs (df40466) with main (684ade2)
CodSpeed Performance Gauge: -16%
Improvements: 0
Regressions: 5
Untouched: 39
New: 0
Dropped: 0
Ignored: 0

Benchmarks

Failed

avx_random_long_f32 (Regression): -5%, 224.9 µs → 235.6 µs
benches/bench_f32.rs::benches::minmax_f32_random_array_long::avx_random_long_f32
impl_random_long_f32 (Regression): -5%, 225.1 µs → 235.8 µs
benches/bench_f32.rs::benches::minmax_f32_random_array_long::impl_random_long_f32
scalar_random_long_f32 (Regression): -16%, 442.9 µs → 528.2 µs
benches/bench_f32.rs::benches::minmax_f32_random_array_long::scalar_random_long_f32
sse_random_long_f32 (Regression): -12%, 276.3 µs → 315.5 µs
benches/bench_f32.rs::benches::minmax_f32_random_array_long::sse_random_long_f32
scalar_random_long_f64 (Regression): -12%, 629.6 µs → 714.8 µs
benches/bench_f64.rs::benches::minmax_f64_random_array_long::scalar_random_long_f64
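
The gauge percentages appear consistent with a simple ratio of the two timings listed for each benchmark. A minimal sketch, assuming the first timing is the base (main), the second is the head (NaNs), and the gauge is computed as base / head - 1. The function name `gauge_change_percent` is hypothetical and not part of CodSpeed or argminmax:

```rust
/// Relative change in percent, under the ASSUMPTION that the gauge is
/// computed as `base / head - 1` (negative means the head is slower).
/// This formula is inferred from the numbers in this report, not taken
/// from CodSpeed documentation.
fn gauge_change_percent(base_us: f64, head_us: f64) -> f64 {
    (base_us / head_us - 1.0) * 100.0
}
```

Under that assumption, 442.9 µs and 528.2 µs give roughly -16%, which matches the scalar_random_long_f32 entry.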

Passed

sse_random_long_i32: +1%, 290.5 µs → 286.9 µs
benches/bench_i32.rs::benches::minmax_i32_random_array_long::sse_random_long_i32
sse_random_long_u32: +1%, 301.2 µs → 297.6 µs
benches/bench_u32.rs::benches::minmax_u32_random_array_long::sse_random_long_u32
impl_random_long_u64: +1%, 459.6 µs → 456 µs
benches/bench_u64.rs::benches::minmax_u64_random_array_long::impl_random_long_u64
impl_random_long_i64: +1%, 459.6 µs → 456 µs
benches/bench_i64.rs::benches::minmax_i64_random_array_long::impl_random_long_i64
avx2_random_long_u64: +1%, 459.4 µs → 455.8 µs
benches/bench_u64.rs::benches::minmax_u64_random_array_long::avx2_random_long_u64
avx2_random_long_i64: +1%, 459.3 µs → 455.8 µs
benches/bench_i64.rs::benches::minmax_i64_random_array_long::avx2_random_long_i64
impl_random_long_f16: 0%, 119.2 µs → 118.9 µs
benches/bench_f16.rs::benches::minmax_f16_random_array_long::impl_random_long_f16
impl_random_long_u8: 0%, 86.2 µs → 86 µs
benches/bench_u8.rs::benches::minmax_u8_random_array_long::impl_random_long_u8
avx2_random_long_f16: 0%, 119 µs → 118.7 µs
benches/bench_f16.rs::benches::minmax_f16_random_array_long::avx2_random_long_f16
avx2_random_long_i8: 0%, 74.2 µs → 74.1 µs
benches/bench_i8.rs::benches::minmax_i8_random_array_long::avx2_random_long_i8
impl_random_long_i16: 0%, 113.3 µs → 113.1 µs
benches/bench_i16.rs::benches::minmax_i16_random_array_long::impl_random_long_i16
impl_random_long_u16: 0%, 113.3 µs → 113.1 µs
benches/bench_u16.rs::benches::minmax_u16_random_array_long::impl_random_long_u16
impl_random_long_i8: 0%, 85.3 µs → 85.2 µs
benches/bench_i8.rs::benches::minmax_i8_random_array_long::impl_random_long_i8
avx2_random_long_u8: 0%, 73.6 µs → 73.6 µs
benches/bench_u8.rs::benches::minmax_u8_random_array_long::avx2_random_long_u8
sse_random_long_f16: 0%, 158.9 µs → 158.7 µs
benches/bench_f16.rs::benches::minmax_f16_random_array_long::sse_random_long_f16
sse_random_long_u8: 0%, 86 µs → 85.9 µs
benches/bench_u8.rs::benches::minmax_u8_random_array_long::sse_random_long_u8
avx2_random_long_u16: 0%, 113.1 µs → 113 µs
benches/bench_u16.rs::benches::minmax_u16_random_array_long::avx2_random_long_u16
impl_random_long_u32: 0%, 225.1 µs → 225 µs
benches/bench_u32.rs::benches::minmax_u32_random_array_long::impl_random_long_u32
scalar_random_long_i16: 0%, 349.6 µs → 349.5 µs
benches/bench_i16.rs::benches::minmax_i16_random_array_long::scalar_random_long_i16
sse_random_long_i8: 0%, 85.1 µs → 85.1 µs
benches/bench_i8.rs::benches::minmax_i8_random_array_long::sse_random_long_i8
impl_random_long_i32: 0%, 225 µs → 224.9 µs
benches/bench_i32.rs::benches::minmax_i32_random_array_long::impl_random_long_i32
avx2_random_long_i16: 0%, 113 µs → 113 µs
benches/bench_i16.rs::benches::minmax_i16_random_array_long::avx2_random_long_i16
scalar_random_long_u32: 0%, 442.9 µs → 442.8 µs
benches/bench_u32.rs::benches::minmax_u32_random_array_long::scalar_random_long_u32
scalar_random_long_f16: 0%, 463.4 µs → 463.3 µs
benches/bench_f16.rs::benches::minmax_f16_random_array_long::scalar_random_long_f16
avx2_random_long_u32: 0%, 224.9 µs → 224.9 µs
benches/bench_u32.rs::benches::minmax_u32_random_array_long::avx2_random_long_u32
scalar_random_long_u16: 0%, 349.5 µs → 349.5 µs
benches/bench_u16.rs::benches::minmax_u16_random_array_long::scalar_random_long_u16
scalar_random_long_u8: 0%, 359.8 µs → 359.7 µs
benches/bench_u8.rs::benches::minmax_u8_random_array_long::scalar_random_long_u8
scalar_random_long_i32: 0%, 442.9 µs → 442.8 µs
benches/bench_i32.rs::benches::minmax_i32_random_array_long::scalar_random_long_i32
scalar_random_long_i64: 0%, 629.5 µs → 629.5 µs
benches/bench_i64.rs::benches::minmax_i64_random_array_long::scalar_random_long_i64
scalar_random_long_u64: 0%, 629.5 µs → 629.5 µs
benches/bench_u64.rs::benches::minmax_u64_random_array_long::scalar_random_long_u64
scalar_random_long_i8: 0%, 359.7 µs → 359.7 µs
benches/bench_i8.rs::benches::minmax_i8_random_array_long::scalar_random_long_i8
avx2_random_long_i32: 0%, 224.9 µs → 224.9 µs
benches/bench_i32.rs::benches::minmax_i32_random_array_long::avx2_random_long_i32
sse_random_long_u64: 0%, 622.9 µs → 622.9 µs
benches/bench_u64.rs::benches::minmax_u64_random_array_long::sse_random_long_u64
sse_random_long_f64: 0%, 551.8 µs → 551.8 µs
benches/bench_f64.rs::benches::minmax_f64_random_array_long::sse_random_long_f64
impl_random_long_f64: 0%, 449 µs → 449 µs
benches/bench_f64.rs::benches::minmax_f64_random_array_long::impl_random_long_f64
sse_random_long_i64: 0%, 608.6 µs → 608.6 µs
benches/bench_i64.rs::benches::minmax_i64_random_array_long::sse_random_long_i64
avx_random_long_f64: 0%, 448.8 µs → 448.8 µs
benches/bench_f64.rs::benches::minmax_f64_random_array_long::avx_random_long_f64
sse_random_long_i16: 0%, 144.1 µs → 144.1 µs
benches/bench_i16.rs::benches::minmax_i16_random_array_long::sse_random_long_i16
sse_random_long_u16: 0%, 149.4 µs → 149.5 µs
benches/bench_u16.rs::benches::minmax_u16_random_array_long::sse_random_long_u16

Commits

Base: main (684ade2)
-13% :construction: partially supported NaNs for SSE f32 (6ffb4db, 2 years ago, by jvdd)
-3% :broom: formatting (7ac2504, 2 years ago, by jvdd)
0% :mag: enable lto (bf1725b, 2 years ago, by jvdd)
0% :see_no_evil: check for avx2 instead of avx (08588bb, 2 years ago, by jvdd)
0% :bug: check for avx2 instead of avx in benches (2bd0f8e, 2 years ago, by jvdd)
0% :see_no_evil: remove avx from runtime feature detection checks (faa3f56, 2 years ago, by jvdd)
0% :see_no_evil: (50ee420, 2 years ago, by jvdd)
0% :pen: undo rename so that codspeed can compare (4f9b767, 2 years ago, by jvdd)
-34% :bug: load unaligned (bae6139, 2 years ago, by jvdd)
+34% :zap: use faster SIMD intrinsics (df40466, 2 years ago, by jvdd)
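
The commit messages mention runtime feature detection for AVX2 and a switch to unaligned loads. The sketch below illustrates both ideas in isolation; it is not the code from this PR. The names `detect_backend` and `max_f32` are hypothetical, and the sketch makes no attempt at the NaN handling that the PR explores.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::{_mm_loadu_ps, _mm_max_ps, _mm_set1_ps, _mm_storeu_ps};

/// Hypothetical helper: report which backend a dispatcher could pick at runtime.
/// Checks AVX2 first, then SSE4.1, and falls back to scalar code.
fn detect_backend() -> &'static str {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("avx2") {
            return "avx2";
        }
        if is_x86_feature_detected!("sse4.1") {
            return "sse4.1";
        }
    }
    "scalar"
}

/// Hypothetical helper: maximum of a slice using 4-lane SSE vectors.
/// Unaligned loads are used because a `&[f32]` is only guaranteed to be
/// 4-byte aligned, while the aligned load `_mm_load_ps` requires 16 bytes.
/// NaNs are NOT handled specially in this sketch.
#[cfg(target_arch = "x86_64")]
fn max_f32(data: &[f32]) -> Option<f32> {
    if data.is_empty() {
        return None;
    }
    let chunks = data.chunks_exact(4);
    let tail = chunks.remainder();
    let mut lanes = [f32::NEG_INFINITY; 4];
    // SAFETY: SSE is part of the x86_64 baseline, every chunk holds exactly
    // four f32 values, and `lanes` has room for four f32 values.
    unsafe {
        let mut acc = _mm_set1_ps(f32::NEG_INFINITY);
        for chunk in chunks {
            let v = _mm_loadu_ps(chunk.as_ptr()); // unaligned load
            acc = _mm_max_ps(acc, v);
        }
        _mm_storeu_ps(lanes.as_mut_ptr(), acc); // unaligned store
    }
    // Reduce the four lanes and the leftover elements with scalar code.
    Some(lanes.iter().chain(tail).copied().fold(f32::NEG_INFINITY, f32::max))
}

/// Scalar fallback with the same signature for other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn max_f32(data: &[f32]) -> Option<f32> {
    data.iter().copied().reduce(f32::max)
}
```

Calling `max_f32(&[1.0, 5.0, 3.0, 2.0, 9.0])` processes the first four values in one vector and the last value in the scalar tail. A real dispatcher would likely pair a check like `detect_backend` with functions marked `#[target_feature(enable = "avx2")]`.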