Commit
remove slide and handle jump in avx512
This removes the extra overhead: on non-avx512 machines the code is now cycle for cycle tied with the version from before the avx512 decision was added. On avx512, the PCALIGN improves throughput from 28.5GiB/s to 30GiB/s on 4K inputs, and from 25GiB/s to 28GiB/s on 10M inputs. This also removes avo, because it was getting in my way (I couldn't get it to jump to another function) and cespare expressed reservations about using it.