2) Overheads & Assumptions
The overhead factor covers kernels, framework overhead, scheduling, etc. (1.5–3× is typical).
Real workloads often reach only 30–70% of peak throughput because of memory and bandwidth limits.
We’ll show core results in TFLOPS; the TOPS figure is only a rough INT8 equivalent.
Multiply the requirement by the number of inferences you need to run simultaneously.
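
The arithmetic behind these assumptions can be sketched in a few lines of Python. This is a rough illustration, not a definitive sizing method: the function names, parameter names, and the example numbers (10 TFLOPS ideal, 200 TFLOPS peak) are hypothetical, and the defaults are simply midpoints of the ranges above. The overhead factor and the utilization fraction may describe overlapping effects, so whether to apply both at once is a judgment call.

```python
def required_tflops(base_tflops: float, overhead: float = 2.0, concurrent: int = 1) -> float:
    """Scale an ideal compute estimate by overhead and concurrency.

    overhead:   assumed multiplier for kernels, framework, scheduling (1.5-3x typical)
    concurrent: number of simultaneous inferences
    """
    if base_tflops < 0 or overhead < 1.0 or concurrent < 1:
        raise ValueError("expected base_tflops >= 0, overhead >= 1, concurrent >= 1")
    return base_tflops * overhead * concurrent


def sustained_tflops(peak_tflops: float, utilization: float = 0.5) -> float:
    """Estimate achievable throughput from a datasheet peak (30-70% is typical)."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    return peak_tflops * utilization


# Hypothetical example: 10 TFLOPS ideal per inference, 4 simultaneous inferences,
# on hardware with a 200 TFLOPS datasheet peak.
need = required_tflops(10.0, overhead=2.0, concurrent=4)
have = sustained_tflops(200.0, utilization=0.5)
print(f"need {need:.1f} TFLOPS, sustained {have:.1f} TFLOPS, fits: {have >= need}")
```

Trying the low and high ends of each range (for example overhead 1.5 versus 3.0) gives a quick sense of how sensitive the result is to these assumptions.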