Bottlenecks Bench
Philip Bedoukian edited this page Apr 9, 2021
·
3 revisions
Bottlenecks per benchmark.
Gemm/2mm/3mm
- Vector pipeline stalls.
Fdtd2d
- Router stalls and frame stalls. Trying to increase the fetch width from 8 to 16.
Gesummv
- Frame stalls. Try increasing the fetch width from 8 to 16.
Conv2d
- The scalar core has too much work. Longlines mostly resolves this, but it still seems to be the main bottleneck.
Syrk/Syr2k
- Not many stalls, but performance is quite sensitive to network width. Not sure why SIMD throughput is irrelevant (+12% perf) on this one.