From 7ba752d826a90d0a0f95b38af033c0dc4d6d9356 Mon Sep 17 00:00:00 2001
From: David Testé
Date: Wed, 19 Feb 2025 16:48:26 +0100
Subject: [PATCH] docs: refactor and update benchmarks pages

Benchmark tables are rendered as descriptive SVG images.
Results are sorted by backend to give a clearer view in the table of contents.
PBS benchmarks now display results for various p-fail values and several precisions.
---
 tfhe/docs/SUMMARY.md | 7 +-
 ...er_benchmark_tuniform_2m128_ciphertext.svg | 230 ++++++++++++++++++
 ...ger_benchmark_tuniform_2m128_plaintext.svg | 196 +++++++++++++++
 ...ger_benchmark_tuniform_2m64_ciphertext.svg | 230 ++++++++++++++++++
 ...eger_benchmark_tuniform_2m64_plaintext.svg | 196 +++++++++++++++
 .../cpu_integer_operations.md} | 34 ++-
 .../cpu/cpu_pbs_benchmark_tuniform_2m128.svg | 42 ++++
 .../cpu/cpu_pbs_benchmark_tuniform_2m40.svg | 64 +++++
 .../cpu/cpu_pbs_benchmark_tuniform_2m64.svg | 64 +++++
 .../cpu/cpu_programmable_bootstraping.md | 42 ++++
 .../getting_started/benchmarks/cpu/summary.md | 12 +
 ...ark_fheuint64_tuniform_2m64_ciphertext.svg | 100 ++++++++
 ...0x1_multi_bit_tuniform_2m64_ciphertext.svg | 230 ++++++++++++++++++
 ...00x1_multi_bit_tuniform_2m64_plaintext.svg | 162 ++++++++++++
 .../gpu_integer_operations.md} | 25 +-
 .../getting_started/benchmarks/gpu/summary.md | 11 +
 .../getting_started/benchmarks/summary.md | 2 +-
 17 files changed, 1616 insertions(+), 31 deletions(-)
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m128_ciphertext.svg
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m128_plaintext.svg
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m64_ciphertext.svg
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m64_plaintext.svg
 rename tfhe/docs/getting_started/benchmarks/{cpu_benchmarks.md => cpu/cpu_integer_operations.md} (65%)
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m128.svg
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m40.svg
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m64.svg
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/cpu_programmable_bootstraping.md
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu/summary.md
 create mode 100644 tfhe/docs/getting_started/benchmarks/cpu_gpu_integer_benchmark_fheuint64_tuniform_2m64_ciphertext.svg
 create mode 100644 tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_ciphertext.svg
 create mode 100644 tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_plaintext.svg
 rename tfhe/docs/getting_started/benchmarks/{gpu_benchmarks.md => gpu/gpu_integer_operations.md} (52%)
 create mode 100644 tfhe/docs/getting_started/benchmarks/gpu/summary.md

diff --git a/tfhe/docs/SUMMARY.md b/tfhe/docs/SUMMARY.md
index 92a92ddb08..32e0b1f037 100644
--- a/tfhe/docs/SUMMARY.md
+++ b/tfhe/docs/SUMMARY.md
@@ -9,8 +9,11 @@
 * [Quick start](getting\_started/quick\_start.md)
 * [Types & Operations](getting\_started/operations.md)
 * [Benchmarks](getting\_started/benchmarks/summary.md)
-  * [CPU Benchmarks](getting\_started/benchmarks/cpu\_benchmarks.md)
-  * [GPU Benchmarks](getting\_started/benchmarks/gpu\_benchmarks.md)
+  * [CPU Benchmarks](getting\_started/benchmarks/cpu/summary.md)
+    * [Integer](getting\_started/benchmarks/cpu/cpu\_integer\_operations.md)
+    * [Programmable bootstrapping](getting\_started/benchmarks/cpu/cpu\_programmable\_bootstraping.md)
+  * [GPU Benchmarks](getting\_started/benchmarks/gpu/summary.md)
+    * [Integer](getting\_started/benchmarks/gpu/gpu\_integer\_operations.md)
 * [Zero-knowledge proof benchmarks](getting_started/benchmarks/zk_proof_benchmarks.md)
 * [Security and cryptography](getting\_started/security\_and\_cryptography.md)
 
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m128_ciphertext.svg b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m128_ciphertext.svg
new file mode 100644
index 0000000000..eb889cc316
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m128_ciphertext.svg
@@ -0,0 +1,230 @@
+| Operation \ Size | FheUint4 | FheUint8 | FheUint16 | FheUint32 | FheUint64 | FheUint128 | FheUint256 |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Negation (`-`) | 33.4 ms | 48.9 ms | 57.4 ms | 79.7 ms | 105 ms | 159 ms | 183 ms |
+| Add / Sub (`+`,`-`) | 33.5 ms | 53.5 ms | 59.8 ms | 82.1 ms | 109 ms | 165 ms | 187 ms |
+| Mul (`x`) | 39.7 ms | 97.5 ms | 141 ms | 213 ms | 400 ms | 1.14 s | 3.79 s |
+| Equal / Not Equal (`eq`, `ne`) | 34.3 ms | 36.1 ms | 56.3 ms | 56.9 ms | 81.4 ms | 82.0 ms | 104 ms |
+| Comparisons (`ge`, `gt`, `le`, `lt`) | 37.4 ms | 37.1 ms | 54.8 ms | 76.7 ms | 99.0 ms | 145 ms | 175 ms |
+| Max / Min (`max`,`min`) | 75.6 ms | 76.9 ms | 97.6 ms | 121 ms | 148 ms | 194 ms | 244 ms |
+| Bitwise operations (`&`, `|`, `^`) | 20.2 ms | 18.7 ms | 19.7 ms | 20.6 ms | 22.9 ms | 23.8 ms | 26.3 ms |
+| Div / Rem (`/`, `%`) | 295 ms | 644 ms | 1.49 s | 3.44 s | 8.49 s | 20.9 s | 54.6 s |
+| Left / Right Shifts (`<<`, `>>`) | 34.7 ms | 58.8 ms | 81.9 ms | 107 ms | 142 ms | 178 ms | 248 ms |
+| Left / Right Rotations (`left_rotate`, `right_rotate`) | 39.7 ms | 59.7 ms | 81.4 ms | 107 ms | 142 ms | 186 ms | 249 ms |
+| Leading / Trailing zeros/ones | 77.1 ms | 95.7 ms | 159 ms | 182 ms | 255 ms | 304 ms | 345 ms |
+| Log2 | 90.4 ms | 114 ms | 173 ms | 199 ms | 280 ms | 327 ms | 369 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m128_plaintext.svg b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m128_plaintext.svg
new file mode 100644
index 0000000000..5f47350680
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m128_plaintext.svg
@@ -0,0 +1,196 @@
+| Operation \ Size | FheUint4 | FheUint8 | FheUint16 | FheUint32 | FheUint64 | FheUint128 | FheUint256 |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Add / Sub (`+`,`-`) | 33.5 ms | 52.5 ms | 60.6 ms | 64.2 ms | 89.3 ms | 111 ms | 181 ms |
+| Mul (`x`) | 36.2 ms | 74.8 ms | 125 ms | 175 ms | 242 ms | 453 ms | 1.11 s |
+| Equal / Not Equal (`eq`, `ne`) | 18.8 ms | 32.2 ms | 35.2 ms | 55.2 ms | 58.2 ms | 79.0 ms | 81.2 ms |
+| Comparisons (`ge`, `gt`, `le`, `lt`) | 15.1 ms | 37.2 ms | 36.6 ms | 56.0 ms | 78.9 ms | 101 ms | 145 ms |
+| Max / Min (`max`,`min`) | 32.9 ms | 52.6 ms | 57.0 ms | 78.1 ms | 103 ms | 123 ms | 171 ms |
+| Bitwise operations (`&`, `|`, `^`) | 18.0 ms | 18.9 ms | 19.6 ms | 21.4 ms | 23.1 ms | 24.1 ms | 26.7 ms |
+| Div (`/`) | 81.1 ms | 139 ms | 202 ms | 280 ms | 456 ms | 912 ms | 2.33 s |
+| Rem (`%`) | 154 ms | 275 ms | 366 ms | 536 ms | 778 ms | 1.37 s | 3.13 s |
+| Left / Right Shifts (`<<`, `>>`) | 22.3 ms | 20.0 ms | 20.3 ms | 21.2 ms | 23.2 ms | 24.2 ms | 26.5 ms |
+| Left / Right Rotations (`left_rotate`, `right_rotate`) | 20.2 ms | 19.5 ms | 20.6 ms | 21.1 ms | 23.6 ms | 24.1 ms | 26.2 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m64_ciphertext.svg b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m64_ciphertext.svg
new file mode 100644
index 0000000000..dbacb6bd2b
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m64_ciphertext.svg
@@ -0,0 +1,230 @@
+| Operation \ Size | FheUint4 | FheUint8 | FheUint16 | FheUint32 | FheUint64 | FheUint128 | FheUint256 |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Negation (`-`) | 33.1 ms | 48.7 ms | 57.6 ms | 81.2 ms | 106 ms | 168 ms | 189 ms |
+| Add / Sub (`+`,`-`) | 38.1 ms | 59.9 ms | 60.2 ms | 82.2 ms | 105 ms | 168 ms | 182 ms |
+| Mul (`x`) | 40.6 ms | 103 ms | 143 ms | 219 ms | 401 ms | 1.15 s | 3.84 s |
+| Equal / Not Equal (`eq`, `ne`) | 36.5 ms | 37.1 ms | 58.3 ms | 59.0 ms | 81.2 ms | 82.3 ms | 106 ms |
+| Comparisons (`ge`, `gt`, `le`, `lt`) | 36.3 ms | 37.4 ms | 57.2 ms | 80.1 ms | 102 ms | 145 ms | 175 ms |
+| Max / Min (`max`,`min`) | 79.4 ms | 79.8 ms | 99.8 ms | 122 ms | 145 ms | 192 ms | 246 ms |
+| Bitwise operations (`&`, `|`, `^`) | 19.8 ms | 19.4 ms | 19.6 ms | 20.5 ms | 20.7 ms | 23.3 ms | 26.0 ms |
+| Div / Rem (`/`, `%`) | 291 ms | 693 ms | 1.56 s | 3.52 s | 8.22 s | 21.1 s | 55.2 s |
+| Left / Right Shifts (`<<`, `>>`) | 38.5 ms | 61.2 ms | 84.3 ms | 109 ms | 134 ms | 174 ms | 250 ms |
+| Left / Right Rotations (`left_rotate`, `right_rotate`) | 40.4 ms | 61.4 ms | 82.6 ms | 105 ms | 133 ms | 184 ms | 259 ms |
+| Leading / Trailing zeros/ones | 80.5 ms | 100 ms | 156 ms | 183 ms | 247 ms | 298 ms | 347 ms |
+| Log2 | 100 ms | 121 ms | 182 ms | 205 ms | 267 ms | 323 ms | 369 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m64_plaintext.svg b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m64_plaintext.svg
new file mode 100644
index 0000000000..71466ed7cc
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_benchmark_tuniform_2m64_plaintext.svg
@@ -0,0 +1,196 @@
+| Operation \ Size | FheUint4 | FheUint8 | FheUint16 | FheUint32 | FheUint64 | FheUint128 | FheUint256 |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Add / Sub (`+`,`-`) | 39.8 ms | 56.3 ms | 61.5 ms | 63.8 ms | 88.4 ms | 111 ms | 178 ms |
+| Mul (`x`) | 40.9 ms | 80.3 ms | 128 ms | 173 ms | 231 ms | 452 ms | 1.11 s |
+| Equal / Not Equal (`eq`, `ne`) | 19.0 ms | 38.6 ms | 37.8 ms | 58.5 ms | 58.8 ms | 81.7 ms | 84.2 ms |
+| Comparisons (`ge`, `gt`, `le`, `lt`) | 15.3 ms | 40.9 ms | 39.9 ms | 57.6 ms | 81.0 ms | 103 ms | 149 ms |
+| Max / Min (`max`,`min`) | 32.9 ms | 59.1 ms | 60.0 ms | 81.6 ms | 103 ms | 127 ms | 175 ms |
+| Bitwise operations (`&`, `|`, `^`) | 19.0 ms | 19.5 ms | 20.5 ms | 21.0 ms | 22.4 ms | 23.9 ms | 26.3 ms |
+| Div (`/`) | 81.7 ms | 149 ms | 188 ms | 281 ms | 453 ms | 844 ms | 2.45 s |
+| Rem (`%`) | 165 ms | 278 ms | 360 ms | 503 ms | 806 ms | 1.32 s | 2.98 s |
+| Left / Right Shifts (`<<`, `>>`) | 18.8 ms | 20.4 ms | 20.4 ms | 20.9 ms | 21.8 ms | 23.1 ms | 26.2 ms |
+| Left / Right Rotations (`left_rotate`, `right_rotate`) | 21.0 ms | 20.2 ms | 20.5 ms | 21.0 ms | 21.7 ms | 23.0 ms | 26.0 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_operations.md
similarity index 65%
rename from tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md
rename to tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_operations.md
index 0b675c28a9..814ba1ccd0 100644
--- a/tfhe/docs/getting_started/benchmarks/cpu_benchmarks.md
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_integer_operations.md
@@ -1,36 +1,38 @@
-# CPU Benchmarks
+# Integer Operations over CPU
 
-This document details the CPU performance benchmarks of homomorphic operations using **TFHE-rs**.
+This document details the CPU performance benchmarks of homomorphic operations on integers using **TFHE-rs**.
 
-By their nature, homomorphic operations run slower than their cleartext equivalents. The following are the timings for basic operations, including benchmarks from other libraries for comparison.
+By their nature, homomorphic operations run slower than their cleartext equivalents.
 
 {% hint style="info" %}
 All CPU benchmarks were launched on an `AWS hpc7a.96xlarge` instance equipped with an `AMD EPYC 9R14 CPU @ 2.60GHz` and 740GB of RAM.
 {% endhint %}
 
-## Integer operations
-
 The following tables benchmark the execution time of some operation sets using `FheUint` (unsigned integers). The `FheInt` (signed integers) performs similarly.
 
+## P-fail: $2^{-64}$
+
 The next table shows the operation timings on CPU when all inputs are encrypted
 
-{% embed url="https://docs.google.com/spreadsheets/d/1b_-72ArnSdaqfr-gJOnMmVdcBokYZohnylO4LUj2PMw/edit?usp=sharing" %}
+![Sweet table](./cpu_integer_benchmark_tuniform_2m64_ciphertext.svg)
 
 The next table shows the operation timings on CPU when the left input is encrypted and the right is a clear scalar of the same size:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1m3tjCi_2GSIHop2zZLAtVbhdDn5wqTGd2lOA3CcJe-U/edit?usp=sharing" %}
+![Sweet table](./cpu_integer_benchmark_tuniform_2m64_plaintext.svg)
 
-All timings are based on parallelized Radix-based integer operations where each block is encrypted using the default parameters `PARAM_MESSAGE_2_CARRY_2_KS_PBS`. To ensure predictable timings, we perform operations in the `default` mode, which ensures that the input and output encoding are similar (i.e., the carries are always emptied).
+## P-fail: $2^{-128}$
 
-You can minimize operational costs by selecting from 'unchecked', 'checked', or 'smart' modes from [the fine-grained APIs](../../references/fine-grained-apis/quick\_start.md), each balancing performance and correctness differently. For more details about parameters, see [here](../../references/fine-grained-apis/shortint/parameters.md). You can find the benchmark results on GPU for all these operations [here](../../guides/run\_on\_gpu.md#benchmarks).
+The next table shows the operation timings on CPU when all inputs are encrypted
 
-## Programmable bootstrapping
+![Sweet table](./cpu_integer_benchmark_tuniform_2m128_ciphertext.svg)
 
-The next table shows the execution time of a keyswitch followed by a programmable bootstrapping depending on the precision of the input message. The associated parameter set is given. The configuration is Concrete FFT + AVX-512.
+The next table shows the operation timings on CPU when the left input is encrypted and the right is a clear scalar of the same size:
 
-Note that these benchmarks use Gaussian parameters.
+![Sweet table](./cpu_integer_benchmark_tuniform_2m128_plaintext.svg)
 
-{% embed url="https://docs.google.com/spreadsheets/d/1o6MWpbzbYhDs3Pnoq-2hlNEgO9G8wGR5niW-OOZ6c_4/edit?usp=sharing" %}
+All timings are based on parallelized Radix-based integer operations where each block is encrypted using the default parameters `PARAM_MESSAGE_2_CARRY_2_KS_PBS`. To ensure predictable timings, we perform operations in the `default` mode, which ensures that the input and output encoding are similar (i.e., the carries are always emptied).
+
+You can minimize operational costs by selecting from 'unchecked', 'checked', or 'smart' modes from [the fine-grained APIs](../../references/fine-grained-apis/quick\_start.md), each balancing performance and correctness differently. For more details about parameters, see [here](../../references/fine-grained-apis/shortint/parameters.md). You can find the benchmark results on GPU for all these operations [here](../../guides/run\_on\_gpu.md#benchmarks).
 
 ## Reproducing TFHE-rs benchmarks
 
@@ -43,12 +45,6 @@ AVX512 is now enabled by default for benchmarks when available
 The following example shows how to reproduce **TFHE-rs** benchmarks:
 
 ```shell
-#Boolean benchmarks:
-make bench_boolean
-
 #Integer benchmarks:
 make bench_integer
-
-#Shortint benchmarks:
-make bench_shortint
 ```
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m128.svg b/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m128.svg
new file mode 100644
index 0000000000..8168935ecb
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m128.svg
@@ -0,0 +1,42 @@
+| Operation \ Precision (bits) | 2 | 4 | 6 | 8 |
+| --- | --- | --- | --- | --- |
+| PBS | 9.61 ms | 12.4 ms | 111 ms | 1.5 s |
+| KS-PBS | 10.9 ms | 14.6 ms | 129 ms | 1.44 s |
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m40.svg b/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m40.svg
new file mode 100644
index 0000000000..3d3e7952b6
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m40.svg
@@ -0,0 +1,64 @@
+| Operation \ Precision (bits) | 2 | 4 | 6 | 8 |
+| --- | --- | --- | --- | --- |
+| PBS | 6.04 ms | 11.3 ms | 98.7 ms | 458 ms |
+| MB-PBS | 3.41 ms | 4.58 ms | 37.1 ms | 156 ms |
+| KS-PBS | 7.79 ms | 14.2 ms | 114 ms | 536 ms |
+| KS-MB-PBS | 5.53 ms | 7.51 ms | 47.8 ms | 244 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m64.svg b/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m64.svg
new file mode 100644
index 0000000000..c5a84f0d66
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_pbs_benchmark_tuniform_2m64.svg
@@ -0,0 +1,64 @@
+| Operation \ Precision (bits) | 2 | 4 | 6 | 8 |
+| --- | --- | --- | --- | --- |
+| PBS | 8.9 ms | 11.8 ms | 102 ms | 649 ms |
+| MB-PBS | 4.4 ms | 4.43 ms | 27.5 ms | 248 ms |
+| KS-PBS | 11.0 ms | 14.9 ms | 118 ms | 873 ms |
+| KS-MB-PBS | 5.97 ms | 7.55 ms | 44.0 ms | 453 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/cpu_programmable_bootstraping.md b/tfhe/docs/getting_started/benchmarks/cpu/cpu_programmable_bootstraping.md
new file mode 100644
index 0000000000..3db44a3aed
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/cpu_programmable_bootstraping.md
@@ -0,0 +1,42 @@
+# Programmable bootstrapping over CPU
+
+This document details the CPU performance benchmarks of programmable bootstrapping and keyswitch operations using **TFHE-rs**.
+
+{% hint style="info" %}
+All CPU benchmarks were launched on an `AWS hpc7a.96xlarge` instance equipped with an `AMD EPYC 9R14 CPU @ 2.60GHz` and 740GB of RAM.
+{% endhint %}
+
+The next tables show the execution time of a single programmable bootstrapping, as well as of a keyswitch followed by a programmable bootstrapping, depending on the precision of the input message. The associated parameter sets are given. The configuration is tfhe-fft + AVX-512.
+
+Note that these benchmarks use Gaussian parameters. `MB-PBS` stands for multi-bit programmable bootstrapping.
+
+
+## P-fail: $2^{-40}$
+
+![Sweet table](./cpu_pbs_benchmark_tuniform_2m40.svg)
+
+## P-fail: $2^{-64}$
+
+![Sweet table](./cpu_pbs_benchmark_tuniform_2m64.svg)
+
+## P-fail: $2^{-128}$
+
+![Sweet table](./cpu_pbs_benchmark_tuniform_2m128.svg)
+
+## Reproducing TFHE-rs benchmarks
+
+**TFHE-rs** benchmarks can be easily reproduced from the [source](https://github.com/zama-ai/tfhe-rs).
+
+{% hint style="info" %}
+AVX512 is now enabled by default for benchmarks when available
+{% endhint %}
+
+The following example shows how to reproduce **TFHE-rs** benchmarks:
+
+```shell
+#PBS benchmarks:
+make bench_pbs
+
+#KS-PBS benchmarks:
+make bench_ks_pbs
+```
diff --git a/tfhe/docs/getting_started/benchmarks/cpu/summary.md b/tfhe/docs/getting_started/benchmarks/cpu/summary.md
new file mode 100644
index 0000000000..065339600c
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu/summary.md
@@ -0,0 +1,12 @@
+# Benchmarks over CPU
+
+This document details the CPU performance benchmarks of homomorphic operations using **TFHE-rs**.
+
+By their nature, homomorphic operations run slower than their cleartext equivalents.
+
+{% hint style="info" %}
+All CPU benchmarks were launched on an `AWS hpc7a.96xlarge` instance equipped with an `AMD EPYC 9R14 CPU @ 2.60GHz` and 740GB of RAM.
+{% endhint %}
+
+* [Integer operations](cpu\_integer\_operations.md)
+* [Programmable bootstrapping](cpu\_programmable\_bootstraping.md)
diff --git a/tfhe/docs/getting_started/benchmarks/cpu_gpu_integer_benchmark_fheuint64_tuniform_2m64_ciphertext.svg b/tfhe/docs/getting_started/benchmarks/cpu_gpu_integer_benchmark_fheuint64_tuniform_2m64_ciphertext.svg
new file mode 100644
index 0000000000..32eca94168
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/cpu_gpu_integer_benchmark_fheuint64_tuniform_2m64_ciphertext.svg
@@ -0,0 +1,100 @@
+| Operation \ Size | CPU | GPU |
+| --- | --- | --- |
+| Negation (`-`) | 106 ms | 25.2 ms |
+| Add / Sub (`+`,`-`) | 105 ms | 25.2 ms |
+| Mul (`x`) | 401 ms | 237 ms |
+| Equal / Not Equal (`eq`, `ne`) | 81.2 ms | 17.7 ms |
+| Comparisons (`ge`, `gt`, `le`, `lt`) | 102 ms | 26.2 ms |
+| Max / Min (`max`,`min`) | 145 ms | 43.6 ms |
+| Bitwise operations (`&`, `|`, `^`) | 20.7 ms | 5.97 ms |
+| Div / Rem (`/`, `%`) | 8.22 s | 2.05 s |
+| Left / Right Shifts (`<<`, `>>`) | 134 ms | 86.7 ms |
+| Left / Right Rotations (`left_rotate`, `right_rotate`) | 133 ms | 86.8 ms |
+| Leading / Trailing zeros/ones | 247 ms | 62.3 ms |
+| Log2 | 267 ms | 73.9 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_ciphertext.svg b/tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_ciphertext.svg
new file mode 100644
index 0000000000..6b7831bb41
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_ciphertext.svg
@@ -0,0 +1,230 @@
+| Operation \ Size | FheUint4 | FheUint8 | FheUint16 | FheUint32 | FheUint64 | FheUint128 | FheUint256 |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Negation (`-`) | 10.9 ms | 11.2 ms | 12.5 ms | 17.7 ms | 25.2 ms | 51.1 ms | 82.8 ms |
+| Add / Sub (`+`,`-`) | 11.0 ms | 11.3 ms | 12.5 ms | 17.7 ms | 25.2 ms | 51.2 ms | 82.8 ms |
+| Mul (`x`) | 18.0 ms | 23.1 ms | 37.2 ms | 76.4 ms | 237 ms | 830 ms | 3.24 s |
+| Equal / Not Equal (`eq`, `ne`) | 7.53 ms | 7.65 ms | 11.5 ms | 12.4 ms | 17.7 ms | 24.1 ms | 37.7 ms |
+| Comparisons (`ge`, `gt`, `le`, `lt`) | 11.1 ms | 11.4 ms | 15.3 ms | 20.1 ms | 26.2 ms | 38.0 ms | 58.3 ms |
+| Max / Min (`max`,`min`) | 18.3 ms | 18.9 ms | 24.0 ms | 30.6 ms | 43.6 ms | 68.5 ms | 107 ms |
+| Bitwise operations (`&`, `|`, `^`) | 3.45 ms | 3.6 ms | 4.01 ms | 4.58 ms | 5.97 ms | 11.2 ms | 18.9 ms |
+| Div / Rem (`/`, `%`) | 78.1 ms | 154 ms | 318 ms | 763 ms | 2.05 s | 6.35 s | 22.8 s |
+| Left / Right Shifts (`<<`, `>>`) | 17.7 ms | 22.9 ms | 30.4 ms | 43.4 ms | 86.7 ms | 162 ms | 280 ms |
+| Left / Right Rotations (`left_rotate`, `right_rotate`) | 17.7 ms | 22.9 ms | 30.3 ms | 43.4 ms | 86.8 ms | 162 ms | 280 ms |
+| Leading / Trailing zeros/ones | 29.0 ms | 25.1 ms | 33.8 ms | 44.4 ms | 62.3 ms | 105 ms | 195 ms |
+| Log2 | 31.9 ms | 35.5 ms | 48.2 ms | 55.2 ms | 73.9 ms | 113 ms | 210 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_plaintext.svg b/tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_plaintext.svg
new file mode 100644
index 0000000000..635574cd99
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_plaintext.svg
@@ -0,0 +1,162 @@
+| Operation \ Size | FheUint4 | FheUint8 | FheUint16 | FheUint32 | FheUint64 | FheUint128 | FheUint256 |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| Add / Sub (`+`,`-`) | 11.1 ms | 11.4 ms | 12.6 ms | 17.8 ms | 25.4 ms | 51.4 ms | 83.1 ms |
+| Mul (`x`) | 11.4 ms | 18.1 ms | 26.1 ms | 46.6 ms | 109 ms | 330 ms | 1.17 s |
+| Equal / Not Equal (`eq`, `ne`) | 7.82 ms | 7.81 ms | 8.08 ms | 12.0 ms | 13.0 ms | 18.8 ms | 25.8 ms |
+| Comparisons (`ge`, `gt`, `le`, `lt`) | 9.31 ms | 9.61 ms | 13.3 ms | 17.4 ms | 22.3 ms | 29.0 ms | 41.8 ms |
+| Max / Min (`max`,`min`) | 16.5 ms | 17.3 ms | 21.9 ms | 28.1 ms | 39.7 ms | 59.4 ms | 90.7 ms |
+| Bitwise operations (`&`, `|`, `^`) | 3.3 ms | 3.63 ms | 4.11 ms | 4.65 ms | 6.03 ms | 11.2 ms | 19.0 ms |
+| Left / Right Shifts (`<<`, `>>`) | 3.49 ms | 3.63 ms | 4.1 ms | 4.63 ms | 6.03 ms | 11.2 ms | 19.0 ms |
+| Left / Right Rotations (`left_rotate`, `right_rotate`) | 3.5 ms | 3.63 ms | 4.11 ms | 4.64 ms | 6.03 ms | 11.3 ms | 19.0 ms |
diff --git a/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md b/tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_operations.md
similarity index 52%
rename from tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md
rename to tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_operations.md
index 4674a47f3e..96e7d23eaf 100644
--- a/tfhe/docs/getting_started/benchmarks/gpu_benchmarks.md
+++ b/tfhe/docs/getting_started/benchmarks/gpu/gpu_integer_operations.md
@@ -1,18 +1,22 @@
-# GPU Benchmarks
+# Integer Operations over GPU
 
-This document details the GPU performance benchmarks of homomorphic operations using **TFHE-rs**.
+This document details the GPU performance benchmarks of homomorphic operations on integers using **TFHE-rs**.
 
-All GPU benchmarks presented here were obtained on H100 GPUs, and rely on the multithreaded PBS algorithm. The cryptographic parameters `PARAM_GPU_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS` were used.
+{% hint style="info" %}
+All GPU benchmarks were launched on H100 GPUs, and rely on the multithreaded PBS algorithm.
+{% endhint %}
+
+The cryptographic parameters `PARAM_GPU_MULTI_BIT_MESSAGE_2_CARRY_2_GROUP_3_KS_PBS` were used.
 
 ## 1xH100
 
 Below come the results for the execution on a single H100. The following table shows the performance when the inputs of the benchmarked operation are encrypted:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1xGWykMa8fZ7RWUjkCl-52FJ-BNge8cB-5CSHrVZ6XRo/edit?usp=sharing" %}
+![Sweet table](./gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_ciphertext.svg)
 
 The following table shows the performance when the left input of the benchmarked operation is encrypted and the other is a clear scalar of the same size:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1MZfE9c-cQw3yAP55tu0i8uLl4lTAiH9zW3gRFp0ve7s/edit?usp=sharing" %}
+![Sweet table](./gpu_integer_benchmark_h100x1_multi_bit_tuniform_2m64_plaintext.svg)
 
 ## 2xH100
 
@@ -26,10 +30,13 @@ The following table shows the performance when the left input of the benchmarked
 
 {% embed url="https://docs.google.com/spreadsheets/d/1_8VIoStixns22lQq_RBSjVm-0iFHjJpntQTrvEHZpSg/edit?usp=sharing" %}
 
-## Programmable bootstrapping
+## Reproducing TFHE-rs benchmarks
 
-The next table shows the execution time of a keyswitch followed by a programmable bootstrapping depending on the precision of the input message. The associated parameter set is given.
+**TFHE-rs** benchmarks can be easily reproduced from the [source](https://github.com/zama-ai/tfhe-rs).
 
-Note that these benchmarks use Gaussian parameters.
+The following example shows how to reproduce **TFHE-rs** benchmarks:
 
-{% embed url="https://docs.google.com/spreadsheets/d/1KhElQ7sIsShUSVQw5bKFoP-x5BgMaWh1pZtrVAdC3T4/edit?usp=sharing" %}
+```shell
+#Integer benchmarks:
+make bench_integer_gpu
+```
diff --git a/tfhe/docs/getting_started/benchmarks/gpu/summary.md b/tfhe/docs/getting_started/benchmarks/gpu/summary.md
new file mode 100644
index 0000000000..2c2e4e319b
--- /dev/null
+++ b/tfhe/docs/getting_started/benchmarks/gpu/summary.md
@@ -0,0 +1,11 @@
+# Benchmarks over GPU
+
+This document details the GPU performance benchmarks of homomorphic operations using **TFHE-rs**.
+
+By their nature, homomorphic operations run slower than their cleartext equivalents.
+
+{% hint style="info" %}
+All GPU benchmarks were launched on H100 GPUs, and rely on the multithreaded PBS algorithm.
+{% endhint %}
+
+* [Integer operations](gpu\_integer\_operations.md)
diff --git a/tfhe/docs/getting_started/benchmarks/summary.md b/tfhe/docs/getting_started/benchmarks/summary.md
index 5652084e9f..dd73d0e308 100644
--- a/tfhe/docs/getting_started/benchmarks/summary.md
+++ b/tfhe/docs/getting_started/benchmarks/summary.md
@@ -12,4 +12,4 @@ make print_doc_bench_parameters
 
 ### Operation time (ms) over FheUint 64
 
-{% embed url="https://docs.google.com/spreadsheets/d/1OMdGSakEUbIFSEQKhAinTolJjvmPBbafi3DEe3UfzsQ/edit?usp=sharing" %}
+![Sweet table](./cpu_gpu_integer_benchmark_fheuint64_tuniform_2m64_ciphertext.svg)