PS-9715 [DOCS] - Update encryption functions for readability - 8.0 #488

Draft
wants to merge 1 commit into
base: 8.0
133 changes: 101 additions & 32 deletions docs/encryption-functions.md
@@ -40,51 +40,102 @@ Percona Server for MySQL 8.0.41 adds the following:

Percona Server for MySQL 8.0.28-20 adds encryption functions and variables to manage the encryption range.

## Charset Awareness
## Character sets and component encryption UDFs

All component_encryption_udf functions now handle character sets intelligently:
The `component_encryption_udf` functions handle character sets automatically, making them easy to use.

• Algorithms, digest names, padding schemes, keys, and parameters in PEM format: Automatically converted to the ASCII charset at the MySQL level before passing to the functions.
* Input handling:

    * Algorithms, digest names, padding schemes, keys, and parameters in PEM format are converted to the ASCII character set.

    * Messages, data blocks, and signatures are converted to the binary character set.

* Output handling:

    * Results in PEM format are returned in the ASCII character set.

    * Digest, encryption, decryption, and signing results are returned in the binary character set.
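
The output rules above can be checked with the built-in `CHARSET()` function. The following is a minimal sketch, assuming the component is installed; the digest type `'SHA256'`, the sample message, and the 2048-bit key length are illustrative values only.

```sql
-- Digest results are expected to use the binary character set.
SELECT CHARSET(create_digest('SHA256', 'sample message')) AS digest_charset;

-- PEM format results, such as a generated private key,
-- are expected to use the ASCII character set.
SELECT CHARSET(create_asymmetric_priv_key('RSA', 2048)) AS pem_charset;
```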

• Messages, data blocks, and signatures used for digest calculation, encryption, decryption, signing, or verification: Automatically converted to the binary charset at the MySQL level before passing to the functions.
## Using external PEM keys

• Function return values in PEM format: Assigned the ASCII charset.
You can also use these functions with PEM format keys generated externally by the OpenSSL utility.

• Function return values for operations like digest calculation, encryption, decryption, and signing: Assigned the binary charset.
## Digests vs. encryption

## Use user-defined functions
* Digests (hashes):

    * Used to verify data integrity.

    * Can be signed for authenticity.

    * Cannot be used to recover the original data.

* Encryption:

    * Used to make data unreadable without the key.

    * Can be decrypted to recover the original data.

You can also use the user-defined functions with the PEM format keys generated externally by the OpenSSL utility.
## Key length considerations

A digest uses plaintext and generates a hash value. This hash value can verify if the plaintext is unmodified. You can also sign or verify on digests to ensure that the original plaintext was not modified. You cannot decrypt the original text from the hash value.
Choose key lengths by balancing security and performance:

When choosing key lengths, consider the following:
* Longer keys increase security but slow down key generation.

* Encryption strength increases with the key size and also the key generation time.
* Symmetric encryption is faster than asymmetric.

* If performance is important and the functions are frequently used, use symmetric encryption. Symmetric encryption functions are faster than asymmetric encryption functions. Moreover, asymmetric encryption restricts the maximum length of a message being encrypted. For example, the algorithm's maximum message size for RSA is the key length in bytes (key length in bits / 8) minus 11.
* Asymmetric encryption (such as RSA) limits the message size. For example, RSA with the `pkcs1` padding scheme accepts at most the key length in bytes (key length in bits / 8) minus 11.

Choose the right key length based on your application's needs.
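
The RSA size limit can be worked out directly in SQL. This sketch assumes the `pkcs1` padding scheme, which has an 11-byte overhead; the three key lengths are sample values.

```sql
-- Maximum message size for RSA with pkcs1 padding:
-- key length in bytes (bits DIV 8) minus 11 bytes of padding overhead.
SELECT key_bits,
       key_bits DIV 8      AS key_bytes,
       key_bits DIV 8 - 11 AS max_message_bytes
FROM (SELECT 1024 AS key_bits
      UNION ALL SELECT 2048
      UNION ALL SELECT 4096) AS k;
```

The query returns 117, 245, and 501 bytes for 1024-bit, 2048-bit, and 4096-bit keys.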


## Functions

The following table and sections describe the functions. For examples, see function examples.

| Function Name |
|----------------------------------------------------------------------------------------------------------------------------------|
| [asymmetric_decrypt(algorithm, crypt_str, key_str)](#asymmetric_decryptalgorithm-crypt_str-key_str) |
| [asymmetric_derive(pub_key_str, priv_key_str)](#asymmetric_derivepub_key_str-priv_key_str) |
| [asymmetric_encrypt(algorithm, str, key_str)](#asymmetric_encryptalgorithm-str-key_str) |
| [asymmetric_sign(algorithm, digest_str, priv_key_str, digest_type)](#asymmetric_signalgorithm-digest_str-priv_key_str-digest_type) |
| [asymmetric_verify(algorithm, digest_str, sig_str, pub_key_str, digest_type)](#asymmetric_verifyalgorithm-digest_str-sig_str-pub_key_str-digest_type) |
| [create_asymmetric_priv_key(algorithm, (key_len | dh_parameters))](#create_asymmetric_priv_keyalgorithm-key_len--dh_parameters) |
| [create_asymmetric_pub_key(algorithm, priv_key_str)](#create_asymmetric_pub_keyalgorithm-priv_key_str) |
| [create_dh_parameters(key_len)](#create_dh_parameterskey_len) |
| [create_digest(digest_type, str)](#create_digestdigest_type-str) |

The following table describes the encryption threshold variables, which can be used to set the maximum value for a key length based on the type of encryption used.

| Variable Name |
|-----------------------------------|
| [encryption_udf.dh_bits_threshold](#encryption_udfdh_bits_threshold) |
| [encryption_udf.dsa_bits_threshold](#encryption_udfdsa_bits_threshold) |
| [encryption_udf.rsa_bits_threshold](#encryption_udfrsa_bits_threshold) |
### Asymmetric encryption functions

| Function Name | Description |
| --- | --- |
| [asymmetric_encrypt](#asymmetric_encrypt) | Encrypts a string using an algorithm and a key string. |
| [asymmetric_decrypt](#asymmetric_decrypt) | Decrypts an encrypted string using an algorithm and a key string. |
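
A minimal round-trip sketch for the two functions above, assuming the component is installed. The user variable names, the 2048-bit key length, and the message are illustrative; the message must fit within the RSA size limit for the padding scheme in use.

```sql
SET @algorithm = 'RSA';

-- Generate a key pair for the example.
SET @private_key = create_asymmetric_priv_key(@algorithm, 2048);
SET @public_key  = create_asymmetric_pub_key(@algorithm, @private_key);

-- Encrypt with the public key, decrypt with the private key.
SET @message    = 'secret message';
SET @ciphertext = asymmetric_encrypt(@algorithm, @message, @public_key);
SET @plaintext  = asymmetric_decrypt(@algorithm, @ciphertext, @private_key);

-- The decrypted text should match the original message.
SELECT @plaintext = @message AS round_trip_ok;
```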

### Asymmetric key management functions

| Function Name | Description |
| --- | --- |
| [create_asymmetric_priv_key](#create_asymmetric_priv_key) | Generates a private key using a given algorithm and key length. |
| [create_asymmetric_pub_key](#create_asymmetric_pub_key) | Derives a public key from a given private key using a given algorithm. |

### Digital signature functions

| Function Name | Description |
| --- | --- |
| [asymmetric_sign](#asymmetric_sign) | Signs a digest string using a private key string. |
| [asymmetric_verify](#asymmetric_verify) | Verifies whether a signature string matches a digest string. |
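
A sketch of signing and verifying a digest, assuming the component is installed. The variable names, the `'SHA256'` digest type, and the 2048-bit key length are illustrative choices.

```sql
SET @algorithm   = 'RSA';
SET @digest_type = 'SHA256';

-- Generate a key pair for the example.
SET @private_key = create_asymmetric_priv_key(@algorithm, 2048);
SET @public_key  = create_asymmetric_pub_key(@algorithm, @private_key);

-- Create the digest, then sign it with the private key.
SET @message_digest = create_digest(@digest_type, 'message to sign');
SET @signature = asymmetric_sign(@algorithm, @message_digest, @private_key, @digest_type);

-- Verify the signature with the public key.
SELECT asymmetric_verify(@algorithm, @message_digest, @signature, @public_key, @digest_type) AS signature_ok;
```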

### Diffie-Hellman functions

| Function Name | Description |
| --- | --- |
| [create_dh_parameters](#create_dh_parameters) | Creates parameters for generating a Diffie-Hellman private/public key pair. |
| [asymmetric_derive](#asymmetric_derive) | Derives a symmetric key using a public key and a private key. |
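
A sketch of a Diffie-Hellman exchange between two parties, assuming the component is installed. The variable names and the 2048-bit parameter length are illustrative, and generating the parameters can take a noticeable amount of time.

```sql
-- Both parties must use the same Diffie-Hellman parameters.
SET @dh_parameters = create_dh_parameters(2048);

-- Each party generates its own key pair from the shared parameters.
SET @private_a = create_asymmetric_priv_key('DH', @dh_parameters);
SET @public_a  = create_asymmetric_pub_key('DH', @private_a);
SET @private_b = create_asymmetric_priv_key('DH', @dh_parameters);
SET @public_b  = create_asymmetric_pub_key('DH', @private_b);

-- Each party combines the other party's public key with its own private key.
SET @secret_a = asymmetric_derive(@public_b, @private_a);
SET @secret_b = asymmetric_derive(@public_a, @private_b);

-- Both parties should derive the same symmetric key.
SELECT @secret_a = @secret_b AS secrets_match;
```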

### Digest functions

| Function Name | Description |
| --- | --- |
| [create_digest](#create_digest) | Creates a digest from a given string using a given digest type. |

### Encryption threshold variables

The encryption threshold variables set the maximum key length allowed for each encryption algorithm.

| Variable Name | Description | Default Value | Range |
| --- | --- | --- | --- |
| [encryption_udf.dh_bits_threshold](#encryption_udfdh_bits_threshold) | Sets the maximum limit for Diffie-Hellman key length | 10000 | 1024-10000 |
| [encryption_udf.dsa_bits_threshold](#encryption_udfdsa_bits_threshold) | Sets the maximum limit for DSA key length | 9984 | 1024-9984 |
| [encryption_udf.rsa_bits_threshold](#encryption_udfrsa_bits_threshold) | Sets the maximum limit for RSA key length | 16384 | 1024-16384 |
| [encryption_udf.legacy_padding_scheme](#encryption_udflegacy_padding_scheme) | Enables or disables the legacy padding scheme for certain encryption operations | OFF | Boolean |
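
A sketch of inspecting and changing these variables, assuming the component is installed. The `SET GLOBAL` statement assumes that the variable can be changed at runtime and that your account has the required privilege; the value 4096 is an example within the documented range.

```sql
-- List the current values of all component variables.
SHOW GLOBAL VARIABLES LIKE 'encryption_udf.%';

-- Assumption: the threshold is dynamic and can be set at runtime.
-- Limit RSA keys to a maximum of 4096 bits.
SET GLOBAL encryption_udf.rsa_bits_threshold = 4096;
```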


## Install component_encryption_udf

@@ -104,7 +155,9 @@ mysql> INSTALL COMPONENT 'file://component_encryption_udf';

## User-defined functions described

## asymmetric_decrypt(*algorithm, crypt_str, key_str*)
<!--## asymmetric_decrypt(*algorithm, crypt_str, key_str*)-->

## asymmetric_decrypt(*algorithm, crypt_str, key_str*) { #asymmetric_decrypt }

Decrypts an encrypted string using the algorithm and a key string.

@@ -408,6 +461,22 @@ The padding schemes have the following limitations:
| `no` | The message length must exactly match your RSA key size in bytes. For example, if your key is 1024 bits (128 bytes), the message must also be 128 bytes. If it doesn’t match, it will cause an error. |
| `pkcs1` | Your message can be equal to or smaller than the RSA key size - 11 bytes. For instance, with a 1024-bit RSA key, your message can’t be longer than 117 bytes.|

<details>
<summary>Length Limitation</summary>
With the PKCS#1 padding scheme, the size of the RSA key limits the length of the message that you can encrypt. The padding has a fixed overhead of 11 bytes, so the maximum message length is the key size in bytes minus 11.

To calculate the maximum length of the message, use the following formula:
**max_message_length = RSA_key_size_in_bytes - 11**

For example, a 1024-bit RSA key has the following size in bytes:
**1024 bits / 8 bits per byte = 128 bytes**

Using the formula above, the maximum message length is:
**max_message_length = 128 bytes - 11 bytes = 117 bytes**

A message encrypted with a 1024-bit RSA key and the PKCS#1 padding scheme therefore cannot exceed 117 bytes. For a longer message, use a larger RSA key or a different padding scheme.
</details>

The `asymmetric_sign()` and `asymmetric_verify()` functions also have an optional `padding` parameter, which accepts either `pkcs1` or `pkcs1_pss`. If you do not set the parameter, the default depends on [`encryption_udf.legacy_padding_scheme`](#encryption_udflegacy_padding_scheme). You can use the `padding` parameter only with RSA algorithms.

#### Additional resources
35 changes: 35 additions & 0 deletions docs/glossary.md
@@ -4,6 +4,10 @@

Set of properties that guarantee database transactions are processed reliably. Stands for [Atomicity](#atomicity), [Consistency](#consistency), [Isolation](#isolation), [Durability](#durability).

## Asymmetric key

A pair of keys used for cryptographic purposes, consisting of a private key and a corresponding public key. The private key is used for decrypting or signing, while the public key is used for encrypting or verifying.

## Atomicity

Atomicity means that database operations are applied following an “all or nothing” rule. A transaction is either fully applied or not at all.
@@ -12,6 +16,21 @@ Atomicity means that database operations are applied following a “all or nothi

Consistency means that each transaction that modifies the database takes it from one consistent state to another.

## Digest

A digital fingerprint of a piece of data, such as a string or a file, produced by a hash function. Digests are used to verify the integrity of data and ensure it has not been tampered with or altered.

## Digest string

The string representation of a digest, often in hexadecimal format.

## Digital signature

A cryptographic mechanism used to verify the authenticity and integrity of a message, software, or document. It ensures that the data comes from the claimed source and has not been altered during transmission.

## Diffie-Hellman key exchange

A cryptographic protocol that allows two parties to establish a shared secret key over an insecure communication channel without actually exchanging the key.

## Durability

Once a transaction is committed, it will remain so.
@@ -28,6 +47,10 @@ A referential constraint between two tables. Example: A purchase order in the pu

A finalized version of the product which is made available to the general public. It is the final stage in the software release cycle.

## Hash function

A one-way mathematical function that takes input data of any size and produces a fixed-size string of characters, known as a digest or hash value. Hash functions are used to create digital fingerprints of data.

## Isolation

The Isolation requirement means that no transaction can interfere with another.
@@ -86,10 +109,22 @@ Non-Uniform Memory Access ([NUMA](https://en.wikipedia.org/wiki/Non-Uniform_Memo

The Percona branch of [MySQL](#mysql) with performance and management improvements.

## Private key

A secret key used in asymmetric cryptography for decrypting or signing data. It is typically kept secure and not shared with others.

## Public key

A publicly available key used in asymmetric cryptography for encrypting or verifying data. It is typically shared with others and used in conjunction with a private key.

## Storage Engine

A storage engine is a piece of software that implements the details of data storage and retrieval for a database system. This term is primarily used within the [MySQL](#mysql) ecosystem due to it being the first widely used relational database to have an abstraction layer around storage. It is analogous to a Virtual File System layer in an Operating System. A VFS layer allows an operating system to read and write multiple file systems (e.g. FAT, NTFS, XFS, ext3) and a Storage Engine layer allows a database server to access tables stored in different engines (for example, [MyISAM](#myisam) or InnoDB).

## Symmetric key

A single key used for both encrypting and decrypting data in symmetric cryptography. Symmetric keys are typically kept secret and shared between parties.

## Tech Preview

A tech preview item can be a feature, a variable, or a value within a variable. Before using this feature in production, we recommend that you test restoring production from physical backups in your environment and also use an alternative backup method for redundancy. A tech preview item is included in a release for users to provide feedback. The item is either updated, released as [general availability (GA)](#general-availability-ga), or removed if not useful. The functionality can change from tech preview to GA.