First, adding a `--size` flag means the buffer sizes are no longer comptime-known, which prevents the compiler from applying certain unrolling optimizations that depend on a fixed length. Second, `noinline` creates a more reproducible environment for benchmarking by isolating the body of each benchmarked function from its caller.
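A minimal sketch of the two ideas together, not the actual benchmark code. The `sum` kernel, the default size of 4096, and the argument loop are all hypothetical, and it assumes a Zig version whose standard library provides `std.process.argsAlloc` and the two-argument `@memset` (names may differ in other releases).

```zig
const std = @import("std");

// Hypothetical kernel under test. `noinline` keeps it as a separate
// function, so its body is not merged into the caller's code.
noinline fn sum(buf: []const u8) u64 {
    var total: u64 = 0;
    for (buf) |b| total += b;
    return total;
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;

    const args = try std.process.argsAlloc(allocator);
    defer std.process.argsFree(allocator, args);

    // Assumed default; `--size N` overrides it at runtime, so the
    // buffer length is not known at comptime.
    var size: usize = 4096;
    var i: usize = 1;
    while (i + 1 < args.len) : (i += 1) {
        if (std.mem.eql(u8, args[i], "--size")) {
            size = try std.fmt.parseInt(usize, args[i + 1], 10);
        }
    }

    const buf = try allocator.alloc(u8, size);
    defer allocator.free(buf);
    @memset(buf, 1);

    std.debug.print("{d}\n", .{sum(buf)});
}
```

Because `size` only exists at runtime, the loop in `sum` has to handle an arbitrary length, which is the case the benchmark is meant to measure.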