liskajiri
needle

perf: Use split op to improve LSTM performance

#108 (Merged)
Comparing lstm_perf (2ebd111) with main (4c8c343)
CodSpeed Performance Gauge: -16%
Improvements: 4
Regressions: 1
Untouched: 7
New: 0
Dropped: 0
Ignored: 0

Benchmarks

Improved

| Benchmark | Change | Base (main) | Head (lstm_perf) |
| --- | --- | --- | --- |
| benchmarks/test_sequence_cells.py::test_sequence_cell[cpu-backward-lstm] | ×250 | 7,045.5 ms | 28.3 ms |
| benchmarks/test_sequence_cells.py::test_sequence_cell[cpu-forward-lstm] | ×10 | 102.9 ms | 10.3 ms |
| benchmarks/test_sequence_cells.py::test_sequence_cell[cpu-forward-rnn] | +16% | 3.3 ms | 2.8 ms |
| benchmarks/test_sequence_cells.py::test_sequence_cell[cpu-backward-rnn] | +12% | 6.7 ms | 6 ms |
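The report does not include the diff, so the exact change behind the LSTM numbers is not visible here. One plausible reading of "split op" is that the fused gate pre-activations (a vector of length 4 × hidden size) are divided into the four gates with a single split rather than several separate indexing operations. The sketch below illustrates that idea in plain Python; the function names `split` and `lstm_step` and the gate order (input, forget, cell candidate, output) are assumptions made for illustration, not needle's actual API.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def split(values: list[float], parts: int) -> list[list[float]]:
    """Split a flat list into `parts` equal, contiguous chunks (hypothetical helper)."""
    if len(values) % parts != 0:
        raise ValueError("length must be divisible by the number of parts")
    size = len(values) // parts
    return [values[k * size:(k + 1) * size] for k in range(parts)]


def lstm_step(fused: list[float], c_prev: list[float]):
    """One LSTM cell update from fused gate pre-activations of length 4 * hidden.

    Gate order (i, f, g, o) is an assumption; frameworks differ.
    """
    i, f, g, o = split(fused, 4)  # one split instead of four separate slices
    c = [sigmoid(fk) * ck + sigmoid(ik) * math.tanh(gk)
         for ik, fk, gk, ck in zip(i, f, g, c_prev)]
    h = [sigmoid(ok) * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Hidden size 2, all pre-activations zero: every gate evaluates to 0.5.
h, c = lstm_step([0.0] * 8, [1.0, 2.0])
```

In an autodiff framework each indexing operation typically adds its own node to the computation graph, so a single split could reduce both forward and backward work. Whether that accounts for the measured gap is not something this report confirms.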

Passed

| Benchmark | Change | Base (main) | Head (lstm_perf) |
| --- | --- | --- | --- |
| benchmarks/test_matmul.py::test_matmul[cpu-128-128-128] | +2% | 15.1 ms | 14.8 ms |
| benchmarks/test_matmul.py::test_matmul[cpu-512-512-512] | 0% | 512.2 ms | 511.5 ms |
| benchmarks/test_matmul.py::test_matmul[cpu-8-8-8] | 0% | 66.5 µs | 66.5 µs |
| benchmarks/test_matmul.py::test_matmul[cpu-256-256-256] | 0% | 80.1 ms | 80.2 ms |
| benchmarks/test_conv.py::test_conv[cpu-backward-image-like] | -1% | 9.7 s | 9.8 s |
| benchmarks/test_import_time.py::test_needle_import | -1% | 131.6 µs | 133.5 µs |
| benchmarks/test_conv.py::test_conv[cpu-forward-image-like] | -5% | 1.3 s | 1.3 s |
| benchmarks/test_matmul.py::test_matmul[cpu-64-64-64] | -16% (Regression) | 1.6 ms | 1.9 ms |
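The percentages in this report appear consistent with a change measured relative to the head time, that is (base − head) / head, with large gains shown as a ratio instead. This formula is inferred from the numbers listed here, not taken from CodSpeed documentation, and the displayed times are rounded, so recomputed values can differ by a point or two. A minimal sketch, with hypothetical function names:

```python
def relative_change(base: float, head: float) -> float:
    """Signed change of head relative to base; positive means head is faster.

    Assumed formula: (base - head) / head.
    """
    return (base - head) / head


def speedup(base: float, head: float) -> float:
    """How many times faster head is than base."""
    return base / head


# Times in milliseconds, copied from the report.
print(round(relative_change(1.6, 1.9) * 100))  # -16, matches the reported regression
print(round(relative_change(6.7, 6.0) * 100))  # 12, matches cpu-backward-rnn
print(round(speedup(7045.5, 28.3)))            # 249, reported as ×250
```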

Commits

Base: main (4c8c343)
2ebd111 perf: Use split op to improve LSTM performance (-16%), 5 days ago by liskajiri