sha3: add SIMD implementation with ARMv8.2 features #182
base: master
This PR (HEAD: e1651cc) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Message from Go Bot: Patch Set 1: Congratulations on opening your first change. Thank you for your contribution! Next steps: Most changes in the Go project go through a few rounds of revision. This can be … During May-July and Nov-Jan the Go project is in a code freeze, during which … Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Force-pushed from e1651cc to 417caef.
This PR (HEAD: 417caef) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from 417caef to f4b87ef.
This PR (HEAD: f4b87ef) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from f4b87ef to e8fb822.
This PR (HEAD: e8fb822) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from e8fb822 to 4383406.
This PR (HEAD: 4383406) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from 4383406 to 3b1cf3e.
This PR (HEAD: 3b1cf3e) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from 3b1cf3e to f993927.
This PR (HEAD: f993927) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Message from Ian Lance Taylor: Patch Set 7: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Force-pushed from f993927 to a223041.
This PR (HEAD: a223041) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from 6a87407 to 7a572dc.
This PR (HEAD: 7a572dc) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from 7a572dc to 7a6265e.
This PR (HEAD: 7a6265e) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from 7a6265e to 080b167.
This PR (HEAD: 080b167) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Message from Hau Yang: Patch Set 11: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Force-pushed from 080b167 to 48dd072.
This PR (HEAD: 48dd072) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Force-pushed from 48dd072 to 02cef39.
Message from Meng Zhuo: Patch Set 20: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 20: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 11: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 14: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 20: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 20: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Meng Zhuo: Patch Set 20: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 20: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
The ARMv8.2 SHA3 extension adds four SIMD instructions (EOR3, RAX1, XAR, and BCAX) that accelerate SHA-3 operations. This change adds a SIMD implementation of sha3 for ARMv8 that uses them.

Compared to the pure Go implementation (keccakf.go), the timings are as follows:

benchmark                         old ns/op    new ns/op    delta
BenchmarkPermutationFunction-8    227.0        153.6        -32.33%
BenchmarkSha3_512_MTU-8           4954         3296         -33.47%
BenchmarkSha3_384_MTU-8           3586         2441         -31.93%
BenchmarkSha3_256_MTU-8           2909         1982         -31.87%
BenchmarkSha3_224_MTU-8           2779         1905         -31.45%
BenchmarkShake128_MTU-8           2326         1588         -31.73%
BenchmarkShake256_MTU-8           2485         1670         -32.80%
BenchmarkShake256_16x-8           37052        26715        -27.90%
BenchmarkShake256_1MiB-8          1911863      1293014      -32.37%
BenchmarkSha3_512_1MiB-8          3496335      2317853      -33.71%

benchmark                         old MB/s     new MB/s     speedup
BenchmarkPermutationFunction-8    881.22       1302.48      1.48x
BenchmarkSha3_512_MTU-8           272.50       409.64       1.50x
BenchmarkSha3_384_MTU-8           376.47       553.06       1.47x
BenchmarkSha3_256_MTU-8           464.11       681.27       1.47x
BenchmarkSha3_224_MTU-8           485.75       708.83       1.46x
BenchmarkShake128_MTU-8           580.32       849.97       1.46x
BenchmarkShake256_MTU-8           543.34       808.53       1.49x
BenchmarkShake256_16x-8           442.19       613.29       1.39x
BenchmarkShake256_1MiB-8          548.46       810.95       1.48x
BenchmarkSha3_512_1MiB-8          299.91       452.39       1.51x
Force-pushed from d9aa0f1 to acc175a.
Message from Hau Yang: Patch Set 20: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 20: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Ian Lance Taylor: Patch Set 20: (8 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
This PR (HEAD: acc175a) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/crypto/+/318869 to see it.
Message from Hau Yang: Patch Set 21: (6 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 22: (2 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
Message from Hau Yang: Patch Set 22: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/318869.
sha3: add SIMD implementation with ARMv8.2 features
The ARMv8.2 SHA3 extension adds four SIMD instructions (EOR3, RAX1,
XAR, and BCAX) that accelerate SHA-3 operations. This change adds a
SIMD implementation of sha3 for ARMv8 that uses them.
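
As a rough illustration of what the four instructions compute, the sketch below models each one on a single 64-bit lane in plain Go. The helper names (eor3, rax1, xar, bcax) are hypothetical and only mirror the mnemonics; they are not part of this change or of any Go package, and the real instructions operate on 128-bit SIMD registers rather than scalar values.

```go
package main

import (
	"fmt"
	"math/bits"
)

// eor3 models EOR3: a three-way exclusive OR.
func eor3(a, b, c uint64) uint64 { return a ^ b ^ c }

// rax1 models RAX1: XOR a with b rotated left by one bit.
func rax1(a, b uint64) uint64 { return a ^ bits.RotateLeft64(b, 1) }

// xar models XAR: XOR a and b, then rotate the result right by imm bits.
func xar(a, b uint64, imm int) uint64 { return bits.RotateLeft64(a^b, -imm) }

// bcax models BCAX: XOR a with (b AND NOT c).
func bcax(a, b, c uint64) uint64 { return a ^ (b &^ c) }

func main() {
	// Theta: the parity of a five-lane column takes two EOR3 operations.
	col := [5]uint64{1, 2, 4, 8, 16}
	parity := eor3(eor3(col[0], col[1], col[2]), col[3], col[4])
	fmt.Printf("column parity: %#x\n", parity)

	// Theta: D = C[x-1] ^ rotl(C[x+1], 1) is one RAX1.
	fmt.Printf("rax1: %#x\n", rax1(0, 1<<63))

	// Rho: rotl(a^d, r) is one XAR with a right-rotate amount of 64-r.
	fmt.Printf("xar: %#x\n", xar(0xF0, 0x0F, 64-4))

	// Chi: a ^ (^b & c) equals bcax(a, c, b).
	fmt.Printf("bcax: %#x\n", bcax(0b1100, 0b1010, 0b0110))
}
```

Each Keccak round step maps onto one of these primitives, which is why fusing two or three bitwise operations into a single instruction can shorten the permutation's critical path.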
Compared to the pure Go implementation (keccakf.go) on an Apple M1
chip, the timings are as follows:

benchmark                         old ns/op    new ns/op    delta
BenchmarkPermutationFunction-8    227.0        153.6        -32.33%
BenchmarkSha3_512_MTU-8           4954         3296         -33.47%
BenchmarkSha3_384_MTU-8           3586         2441         -31.93%
BenchmarkSha3_256_MTU-8           2909         1982         -31.87%
BenchmarkSha3_224_MTU-8           2779         1905         -31.45%
BenchmarkShake128_MTU-8           2326         1588         -31.73%
BenchmarkShake256_MTU-8           2485         1670         -32.80%
BenchmarkShake256_16x-8           37052        26715        -27.90%
BenchmarkShake256_1MiB-8          1911863      1293014      -32.37%
BenchmarkSha3_512_1MiB-8          3496335      2317853      -33.71%

benchmark                         old MB/s     new MB/s     speedup
BenchmarkPermutationFunction-8    881.22       1302.48      1.48x
BenchmarkSha3_512_MTU-8           272.50       409.64       1.50x
BenchmarkSha3_384_MTU-8           376.47       553.06       1.47x
BenchmarkSha3_256_MTU-8           464.11       681.27       1.47x
BenchmarkSha3_224_MTU-8           485.75       708.83       1.46x
BenchmarkShake128_MTU-8           580.32       849.97       1.46x
BenchmarkShake256_MTU-8           543.34       808.53       1.49x
BenchmarkShake256_16x-8           442.19       613.29       1.39x
BenchmarkShake256_1MiB-8          548.46       810.95       1.48x
BenchmarkSha3_512_1MiB-8          299.91       452.39       1.51x
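
For readers checking the figures, the delta and speedup columns can be reproduced from the raw numbers. The sketch below assumes delta is the relative change in ns/op and speedup is the ratio of new to old MB/s; the function names are hypothetical, and the inputs are the BenchmarkPermutationFunction-8 rows above.

```go
package main

import "fmt"

// deltaPercent returns the relative change in time per operation, in percent.
// A negative value means the new implementation is faster.
func deltaPercent(oldNs, newNs float64) float64 {
	return (newNs - oldNs) / oldNs * 100
}

// speedup returns the ratio of new throughput to old throughput.
func speedup(oldMBs, newMBs float64) float64 {
	return newMBs / oldMBs
}

func main() {
	// Values taken from the BenchmarkPermutationFunction-8 rows.
	fmt.Printf("delta:   %.2f%%\n", deltaPercent(227.0, 153.6))
	fmt.Printf("speedup: %.2fx\n", speedup(881.22, 1302.48))
}
```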

Result from 'benchstat':

name                   old time/op    new time/op    delta
PermutationFunction-8  306ns ± 2%     153ns ± 0%     -49.98%  (p=0.000 n=10+10)

name                   old speed      new speed      delta
PermutationFunction-8  654MB/s ± 2%   1308MB/s ± 0%  +99.91%  (p=0.000 n=10+10)

name                   old alloc/op   new alloc/op   delta
PermutationFunction-8  0.00B          0.00B          ~        (all equal)

name                   old allocs/op  new allocs/op  delta
PermutationFunction-8  0.00           0.00           ~        (all equal)

Change-Id: I7e24f5c44ef96e1301190db6a21add407ce13af0
GitHub-Last-Rev: acc175a
GitHub-Pull-Request: #182