Commit c985c18

Update README to reflect AVX2 support for crypto/sha256
1 parent 43ed500 commit c985c18

1 file changed: 2 additions, 18 deletions

README.md

Lines changed: 2 additions & 18 deletions
@@ -2,6 +2,8 @@
 
 Accelerate SHA256 computations in pure Go for both Intel (AVX2, AVX, SSE) as well as ARM (arm64) platforms.
 
+Update: As of Go 1.8, `crypto/sha256` offers similar performance for AVX2.
+
 ## Introduction
 
 This package is designed as a drop-in replacement for `crypto/sha256`. For Intel CPUs it has three flavors for AVX2, AVX and SSE whereby the fastest method is automatically chosen depending on CPU capabilities. For ARM CPUs with the Cryptography Extensions advantage is taken of the SHA2 instructions resulting in a massive performance improvement.
@@ -46,7 +48,6 @@ Below is the speed in MB/s for a single core (ranked fast to slow) as well as th
 | 2.4 GHz Intel Xeon CPU E5-2620 v3 | minio/sha256-simd (AVX2) | 355.0 MB/s | 1.88x |
 | 2.4 GHz Intel Xeon CPU E5-2620 v3 | minio/sha256-simd (AVX) | 306.0 MB/s | 1.62x |
 | 2.4 GHz Intel Xeon CPU E5-2620 v3 | minio/sha256-simd (SSE) | 298.7 MB/s | 1.58x |
-| 2.4 GHz Intel Xeon CPU E5-2620 v3 | crypto/sha256 | 189.2 MB/s | |
 | 1.2 GHz ARM Cortex-A53 | crypto/sha256 | 6.1 MB/s | |
 
 Note that the AVX2 version is measured with the "unrolled"/"demacro-ed" version. Due to some Golang assembly restrictions the AVX2 version that uses `defines` loses about 15% performance (you can see the macrofied version, which is a little bit easier to read, [here](https://github.com/minio/sha256-simd/blob/e1b0a493b71bb31e3f1bf82d3b8cbd0d6960dfa6/sha256blockAvx2_amd64.s)).
@@ -119,23 +120,6 @@ BenchmarkHash1M-4 6.05 638.23 105.49x
 
 Example performance metrics were generated on Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz - 6 physical cores, 12 logical cores running Ubuntu GNU/Linux with kernel version 4.4.0-24-generic (vanilla with no optimizations).
 
-### AVX2
-
-```
-$ benchcmp go.txt avx2.txt
-benchmark old ns/op new ns/op delta
-BenchmarkHash8Bytes-12 446 364 -18.39%
-BenchmarkHash1K-12 5919 3279 -44.60%
-BenchmarkHash8K-12 43791 23655 -45.98%
-BenchmarkHash1M-12 5544989 2969305 -46.45%
-
-benchmark old MB/s new MB/s speedup
-BenchmarkHash8Bytes-12 17.93 21.96 1.22x
-BenchmarkHash1K-12 172.98 312.27 1.81x
-BenchmarkHash8K-12 187.07 346.31 1.85x
-BenchmarkHash1M-12 189.10 353.14 1.87x
-```
-
 ### AVX
 
 ```
