Design documentation for adding a raw-FFI thread manager #31

Open
wants to merge 42 commits into
base: main
Changes from 9 commits
38791cf
Add raw-ffi design
acarbonetto Nov 1, 2023
29e2244
Update to add Shachars changes
acarbonetto Nov 1, 2023
32b751e
Update design documentation
acarbonetto Nov 20, 2023
799b248
Update docs
acarbonetto Nov 20, 2023
a86e424
Add API design doc
jonathanl-bq Nov 21, 2023
265daa0
Update section on supported commands in API design doc
jonathanl-bq Nov 21, 2023
8d43095
Update API design doc with more details
jonathanl-bq Nov 22, 2023
11ea2c7
Update type handling policy in API design doc
jonathanl-bq Nov 22, 2023
d7325a1
Push update
acarbonetto Nov 22, 2023
fe1690a
Update API design doc with Routing info
jonathanl-bq Nov 24, 2023
744e92f
Add example showing how executeRaw would work to API design doc
jonathanl-bq Nov 24, 2023
56ebe90
Add Redis to Java and Go encoding
acarbonetto Nov 26, 2023
64e034a
Change to supporting RESP2 instead of RESP3 for now
jonathanl-bq Nov 27, 2023
2d688c1
Add go and java-specific language
acarbonetto Nov 27, 2023
f6b702b
Clean up section on supported commands in API design doc
jonathanl-bq Nov 29, 2023
af4e2a4
Fix typo in API design doc
jonathanl-bq Nov 29, 2023
df00c54
Update docs/design-api.md
jonathanl-bq Nov 30, 2023
ce3eb6c
Add some more details to API design
jonathanl-bq Nov 30, 2023
28c672b
Add java design documentation
acarbonetto Dec 13, 2023
91490bc
Add use-cases as examples of using the API
acarbonetto Dec 15, 2023
7058b02
Add more examples; return Type directly
acarbonetto Dec 20, 2023
f339390
Update customCommand use case
acarbonetto Dec 20, 2023
c4a13da
Update transactional use-cases
acarbonetto Dec 20, 2023
8bb74ba
Add Go API documentation
aaron-congo Jan 23, 2024
eece5f9
add missing period
aaron-congo Jan 23, 2024
73ba55b
Address PR feedback
aaron-congo Jan 23, 2024
b88cbbd
Update struct diagram
aaron-congo Jan 24, 2024
30406bf
PR suggestions
aaron-congo Jan 24, 2024
041bd39
Add documentation for the Go API design
aaron-congo Jan 24, 2024
a25467e
Add documentation for the Go FFI design
aaron-congo Jan 26, 2024
0062ed9
Update diagrams so that maps and arrays of Redis values include an en…
aaron-congo Jan 26, 2024
ae090e6
Fix mistakes in the FFI request success struct diagram
aaron-congo Jan 26, 2024
a852c04
Scale up diagrams to be more readable
aaron-congo Jan 27, 2024
50bc303
Address PR feedback
aaron-congo Jan 30, 2024
fdd659a
Increase size of API struct diagram to make it more readable
aaron-congo Jan 30, 2024
c3b7ae6
Update request success struct diagram
aaron-congo Jan 30, 2024
d5edd9d
Update connection sequence diagram
aaron-congo Jan 30, 2024
3bd84ba
Update connection sequence diagram
aaron-congo Jan 30, 2024
26bafc1
Add glide-core to connection sequence diagram
aaron-congo Jan 31, 2024
765d220
Add documentation for the Go FFI design
aaron-congo Jan 31, 2024
969a896
Update Go use cases with current configuration implementation
aaron-congo Feb 24, 2024
3896446
Update Go use cases to use config without pointer fields
aaron-congo Feb 27, 2024
76 changes: 76 additions & 0 deletions docs/design-api.md
@@ -0,0 +1,76 @@
API Design

# Client Wrapper API design doc

## API requirements:
- The API will be thread-safe.

Maybe use the present tense instead of the future?

- The API will accept as inputs all [RESP3 types](https://redis.io/docs/reference/protocol-spec/).
- The API will attempt authentication, topology refreshes, reconnections, etc., automatically. On failure, concrete errors will be returned to the user.

## Command Interface

### Unix Domain Socket solution
For clients based on Unix Domain Sockets (UDS), we will simply use the existing protobuf messages for creating a connection, sending requests, and receiving responses. Supported commands are enumerated in the [protobuf definition for requests](../babushka-core/src/protobuf/redis_request.proto) and we may add more in the future, although the `CustomCommand` request type is also adequate for all commands. As defined in the [protobuf definition for responses](../babushka-core/src/protobuf/response.proto), client wrappers will receive data as a pointer, which can be passed to Rust to marshal the data back into the wrapper language’s native data types.

Transactions will be handled by adding a list of `Command`s to the protobuf request. The response will be a `redis::Value::Bulk`, which should be handled in the same Rust function that marshals the data back into the wrapper language's native data types. This is handled by storing the results in a collection type native to the wrapper language.

When running Redis in Cluster Mode, several routing options will be provided. These are all specified in the protobuf request.

### Raw FFI solution
For clients using a raw FFI solution, we will expose a general command function in Rust that takes any command and its arguments as strings.
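
As a minimal sketch of what passing a command and its arguments as strings could look like for a command with several optional arguments, the hypothetical Python helper below flattens a sorted-set `ZADD` call into one flat string list. The name `zadd_args` and its parameters are assumptions made for illustration and are not part of this design; only the `ZADD` options `NX`, `XX`, and `CH` and the score-before-member ordering come from the Redis command reference.

```python
from typing import Dict, List


def zadd_args(key: str, members: Dict[str, float], *, nx: bool = False,
              xx: bool = False, ch: bool = False) -> List[str]:
    """Flatten a ZADD call into the string array handed to the core.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if nx and xx:
        raise ValueError("NX and XX are mutually exclusive")
    args = ["ZADD", key]
    if nx:
        args.append("NX")
    if xx:
        args.append("XX")
    if ch:
        args.append("CH")
    for member, score in members.items():
        # Redis expects each score before its member.
        args += [repr(float(score)), member]
    return args
```

Under this sketch, even a command with many options reaches the core as a plain list such as `["ZADD", "board", "CH", "1.0", "alice"]`, so the FFI boundary only has to carry strings.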

We have two options for passing the command, arguments, and any additional configuration to the Rust core from the wrapper language:

#### Protobuf
The wrapper language will pass the commands, arguments, and configuration as protobuf messages using the same definitions as in the UDS solution.

Transactions will be handled by adding a list of `Command`s to the protobuf request. The response will be a `redis::Value::Bulk`, which can be marshalled into a C array of values before being passed from Rust to the wrapper language. The wrapper language is responsible for converting the array of results to its own native collection type.

Pros:
- We get to reuse the protobuf definitions, meaning fewer files to update if we make changes to the protobuf definitions
Collaborator

All Redis commands can be presented as a simple string array, so passing protobuf messages from the wrapper to the core adds unnecessary complication when we're talking about a raw FFI solution

Collaborator

e.g. we'll have a generic execute_command function in Rust that accepts a string array, and all FFI functions from the wrapper will call it

- May be simpler to implement compared to the C data types solution, since we do not need to define our own C data types

Cons:
- There is additional overhead from marshalling data to and from protobuf, which could impact performance significantly

#### C Data Types
The wrapper language will pass commands, arguments, and configuration as C data types.
Collaborator

Can you please work through a few examples of complex commands with many arguments, like sorted set or list commands? For example: https://lettuce.io/core/release/api/io/lettuce/core/api/sync/RedisSortedSetCommands.html

Let's try to better understand it.

Collaborator

@asafpamzn in the UDS solution we also convert all passed arguments into a string array before passing it to the core, so there's no issue when passing complex commands.

Collaborator

We need to verify that all RESP3 value types can be returned using C data types

I think this comment is addressed now, since we will be starting with RESP2 for now.


Transactions will be handled by passing a C array of argument arrays (one inner array per command) to Rust from the wrapper language. The response will be a `redis::Value::Bulk`, which can be marshalled in the same way as explained in the protobuf solution.

Pros:
- No additional overhead from marshalling to and from protobuf, so this should perform better
- May be simpler to implement compared to protobuf solution, since it can be tricky to construct protobuf messages in a performant way and we have to add a varint to the messages as well

Cons:
- Would add an additional file to maintain containing the C definitions (only one file though, since we could share between all raw FFI solutions), which we would need to update every time we want to update the existing protobuf definitions
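
The varint mentioned in the trade-offs above is presumably a length prefix: in length-delimited protobuf framing, each serialized message is preceded by its byte length encoded as a base-128 varint. The sketch below shows that framing in Python under this assumption. The helper names (`encode_varint`, `frame`, `unframe`) are hypothetical, and the payload is treated as opaque bytes standing in for a serialized request.

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if n < 0:
        raise ValueError("varint length must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, position just after the varint)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def frame(payload: bytes) -> bytes:
    """Prefix an already-serialized message with its varint length."""
    return encode_varint(len(payload)) + payload


def unframe(buf: bytes, pos: int = 0) -> Tuple[bytes, int]:
    """Read one framed message; return (payload, position of the next frame)."""
    length, start = decode_varint(buf, pos)
    end = start + length
    if end > len(buf):
        raise ValueError("truncated frame")
    return buf[start:end], end
```

The C data types option avoids this step entirely, which is part of why it may be simpler to implement.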

We will test both approaches, comparing ease of implementation and performance impact, before deciding on a solution.

To marshal Redis data types back into the corresponding types for the wrapper language, we will convert them into appropriate C types, which can then be translated by the wrapper language into its native data types.
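
A minimal sketch of that last step, seen from the wrapper side: a tagged value tree, standing in for the C representation, is converted recursively into native types. The `CValue` layout and the `to_native` function are hypothetical. The tag names mirror the `redis::Value` variants (`Nil`, `Int`, `Data`, `Bulk`, `Status`, `Okay`) of the redis-rs versions in which arrays are called `Bulk`, and decoding bulk data as UTF-8 is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Union


class Tag(Enum):
    NIL = 0
    INT = 1
    DATA = 2    # bulk string bytes
    BULK = 3    # array of nested values, e.g. a transaction result
    STATUS = 4  # simple status string
    OKAY = 5


@dataclass
class CValue:
    """Hypothetical stand-in for a C struct describing one Redis value."""
    tag: Tag
    int_value: int = 0
    data: bytes = b""
    items: List["CValue"] = field(default_factory=list)


Native = Union[None, int, str, List["Native"]]


def to_native(value: CValue) -> Native:
    """Recursively convert the tagged tree into native Python types."""
    if value.tag is Tag.NIL:
        return None
    if value.tag is Tag.INT:
        return value.int_value
    if value.tag in (Tag.DATA, Tag.STATUS):
        return value.data.decode("utf-8")
    if value.tag is Tag.OKAY:
        return "OK"
    if value.tag is Tag.BULK:
        return [to_native(item) for item in value.items]
    raise ValueError(f"unknown tag: {value.tag}")
```

A transaction response would arrive as a single `BULK` value whose items are the per-command results, so the same function yields a native list for it.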

## Supported Commands
We will be supporting all Redis commands. Commands with higher usage will be prioritized, as determined by usage numbers from AWS ElastiCache usage logs.

## Command Input and Output Types
Two different methods of sending requests will be supported.
Author

Suggested change
Two different methods of sending requests will be supported.
Two different methods of sending requests will be considered and compared, and we will go with the more performant solution.


### No Redis Type Validation
We will expose an `executeRaw` method that does no validation of the input types or command on the client side, leaving it up to Redis to reject the request should it be malformed. This gives the user the flexibility to send any type of request they want, including ones not officially supported yet.
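
As a rough illustration of the idea, the Python sketch below forwards whatever it is given without inspecting it. The `RawClient` class, the injected `send` callable (standing in for the call into the Rust core), and the snake-case name `execute_raw` are all assumptions made for this example, not the final API.

```python
from typing import Callable, List

# Stand-in for the call into the Rust core; hypothetical signature.
Send = Callable[[List[str]], object]


class RawClient:
    """Hypothetical wrapper client exposing only the unvalidated entry point."""

    def __init__(self, send: Send) -> None:
        self._send = send

    def execute_raw(self, args: List[str]) -> object:
        # No client-side checks: unknown or malformed commands are
        # forwarded unchanged, and it is up to Redis to reject them.
        return self._send(list(args))
```

For example, `client.execute_raw(["ZADD", "board", "1", "alice"])` and `client.execute_raw(["SOME.NEW.COMMAND", "x"])` would take exactly the same path; any error would come back from Redis rather than from the wrapper.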
Collaborator

Can you please add an example

Collaborator

Can you please explain what was the decision for type validation? There are two options

Author

We decided to first determine how much extra overhead protobuf adds to the product, then compare it with a custom payload in C. We can compare the two and share the chosen solution across all raw-FFI solutions.


### With Redis Type Validation
We will expose separate methods for each supported command, and will attempt to validate the inputs for each of these methods. We may leverage the compiler for the wrapper language to validate the types of the command arguments, or, for non-statically typed languages, we may try to implement this using explicit checks.
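
For a non-statically typed wrapper language, the explicit checks could look roughly like the Python sketch below. The `ValidatingClient` class, its `set` method, and the `ex_seconds` parameter are hypothetical; the injected `send` callable stands in for the call into the Rust core. Only the `SET key value EX seconds` form comes from the Redis command reference.

```python
from typing import Callable, List, Optional

# Stand-in for the call into the Rust core; hypothetical signature.
Send = Callable[[List[str]], object]


class ValidatingClient:
    """Hypothetical wrapper client with one typed method per command."""

    def __init__(self, send: Send) -> None:
        self._send = send

    def set(self, key: str, value: str,
            ex_seconds: Optional[int] = None) -> object:
        # Explicit runtime checks replace the compiler's type checks.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("key and value must be strings")
        args = ["SET", key, value]
        if ex_seconds is not None:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(ex_seconds, bool) or not isinstance(ex_seconds, int):
                raise TypeError("ex_seconds must be an integer")
            if ex_seconds <= 0:
                raise ValueError("ex_seconds must be positive")
            args += ["EX", str(ex_seconds)]
        return self._send(args)
```

In this sketch a malformed call fails in the wrapper before anything is sent, which is the main difference from the unvalidated path.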

## Errors

Reference where these errors came from

ClosingError: Errors that report that the client has closed and is no longer usable.

Suggested change
ClosingError: Errors that report that the client has closed and is no longer usable.
`ClosingError`: Errors that report that the client has closed and is no longer usable.

And below


RedisError: Errors that were reported during a request.

TimeoutError: Errors that are thrown when a request times out.

ExecAbortError: Errors that are thrown when a transaction is aborted.

ConnectionError: Errors that are thrown when a connection disconnects. These errors can be temporary, as the client will attempt to reconnect.

Errors returned are subject to change as we update the protobuf definitions.
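
One way a wrapper could surface these errors is as exception classes keyed by the error name. The Python sketch below assumes a flat hierarchy under a common base class; the base class `BabushkaError` and the helpers `to_exception` and `client_still_usable` are hypothetical and not part of this design. Note that two of the names shadow Python built-ins inside this module, so a real wrapper would likely namespace them.

```python
class BabushkaError(Exception):
    """Hypothetical common base class; not named in the design."""


class ClosingError(BabushkaError):
    """The client has closed and is no longer usable."""


class RedisError(BabushkaError):
    """An error was reported during a request."""


class TimeoutError(BabushkaError):  # shadows the built-in in this module
    """A request timed out."""


class ExecAbortError(BabushkaError):
    """A transaction was aborted."""


class ConnectionError(BabushkaError):  # shadows the built-in in this module
    """A connection disconnected; may be temporary while reconnecting."""


_BY_NAME = {cls.__name__: cls for cls in (
    ClosingError, RedisError, TimeoutError, ExecAbortError, ConnectionError)}


def to_exception(kind: str, message: str) -> BabushkaError:
    """Map an error name from a response to an exception instance."""
    # Falling back to the base class for unknown names is an assumption.
    return _BY_NAME.get(kind, BabushkaError)(message)


def client_still_usable(err: BabushkaError) -> bool:
    """Only a ClosingError means the client cannot be used again."""
    return not isinstance(err, ClosingError)
```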

223 changes: 223 additions & 0 deletions docs/design-raw-ffi.md

Rename file - it shows both raw FFI and UDS approaches

@@ -0,0 +1,223 @@
# Babushka Core Wrappers

Please rename all references

Suggested change
# Babushka Core Wrappers
# Glide Core Wrappers


## Summary

The Babushka client allows Redis users to connect to Redis using a variety of commands through a thin-client optimized for
various languages. The client uses a performant core to establish and manage connections and communicate with Redis. The thin-client
wrapper talks to the core using an FFI (foreign function interface) to Rust.

The following document discusses two primary communication protocol architectures for wrapping the Babushka clients. Specifically,
it details how Java-Babushka and Go-Babushka each use a different protocol and describes the advantages of each language-specific approach.

# Unix Domain Socket Manager Connector

## High-Level Design

**Summary**: The Babushka "UDS" solution uses a socket listener to manage Rust-to-wrapper worker threads, and Unix domain sockets
to deliver command requests between the wrapper and Redis-client threads. This works well because the OS handles message passing and thread
scheduling through the Unix sockets, which are very performant, so communication is simple and fast. The risk to avoid is that
Unix sockets can become a bottleneck for data-intensive commands, and the library can spend too much time waiting on blocking
I/O operations.

```mermaid
stateDiagram-v2
direction LR

Wrapper: Wrapper
UnixDomainSocket: Unix Domain Socket
RustCore: Rust-Core

[*] --> Wrapper: User
Wrapper --> UnixDomainSocket
UnixDomainSocket --> Wrapper
RustCore --> UnixDomainSocket
UnixDomainSocket --> RustCore
RustCore --> Redis
Redis --> RustCore
```

## Decision to use UDS Sockets for a Java-Babushka Wrapper

The decision to use Unix Domain Sockets (UDS) to manage communication between the Java wrapper and the Babushka Redis client was based on the following:
1. Java has an efficient networking library ([netty.io](https://netty.io/)) that provides a highly configurable environment to manage sockets.
2. Java object serialization/deserialization is an expensive operation, and performing multiple I/O operations between raw-FFI calls would be inefficient.
3. Async FFI requests with callbacks require that we manage multiple runtimes (Rust and Java thread management), and JNI does not provide an out-of-the-box solution for this.

### Decision Log

| Protocol | Details | Pros | Cons |
|----------------------------------------------|-------------------------------------------------------------|-----------------------------|----------------------------------------------------|
| Unix Domain Sockets (JNI/netty) | JNI to submit commands; netty.io for message passing; async | netty.io standard lib | complex configuration; limited by socket interface |
| Raw-FFI (JNA, uniffi-rs, j4rs, interoptopus) | FFI to submit commands; Rust for message processing | reusable in other languages | slow performance and uses JNI under the hood |
| Panama/jextract | Performance similar to a raw-ffi using JNI | modern | requires JDK 18+ (no support for earlier Java versions); prototype |

### Sequence Diagram

```mermaid
sequenceDiagram

participant Wrapper as Client-Wrapper
participant ffi as FFI
participant manager as Rust-Core
participant worker as Tokio Worker
participant SocketListener as Socket Listener
participant Socket as Unix Domain Socket
participant Client as Redis

activate Wrapper
activate Client
Wrapper -)+ ffi: connect_to_redis
ffi -)+ manager: start_socket_listener(init_callback)

%% Review comment, suggested change for the line above:
%% ffi -)+ manager: start_socket_listener

manager -) worker: Create Tokio::Runtime (count: CPUs)
activate worker
worker ->> SocketListener: listen_on_socket(init_callback)
SocketListener ->> SocketListener: loop: listen_on_client_stream
activate SocketListener
SocketListener -->> manager:
manager -->> ffi: socket_path
ffi -->>- Wrapper: socket_path
SocketListener -->> Socket: UnixStreamListener::new
activate Socket
SocketListener -->> Client: BabushkaClient::new
Wrapper ->> Socket: connect
Socket -->> Wrapper:
loop single_request
Wrapper ->> Socket: netty.writeandflush (protobuf.redis_request)
Socket -->> Wrapper:
Wrapper ->> Wrapper: wait
SocketListener ->> SocketListener: handle_request
SocketListener ->> Socket: read_values_loop(client_listener, client)
Socket -->> SocketListener:
SocketListener ->> Client: send(request)
Client -->> SocketListener: ClientUsageResult
SocketListener ->> Socket: write_result
Socket -->> SocketListener:
Wrapper ->> Socket: netty.read (protobuf.response)

%% Review comment: The line above is not correct. Maybe we need to split the diagram into 2 or 3: Java to UDS, UDS to Rust, and Java-UDS-Rust in zoomed-out mode.

Socket -->> Wrapper:
Wrapper ->> Wrapper: Result
end
Wrapper ->> Socket: close()
Wrapper ->> SocketListener: shutdown
SocketListener ->> Socket: close()
deactivate Socket
SocketListener ->> Client: close()
SocketListener -->> Wrapper:
deactivate SocketListener
deactivate worker
deactivate Wrapper
deactivate Client
```

### Elements
* **Wrapper**: Our Babushka wrapper that exposes a client API (Java, Python, Node, etc.)
* **Babushka FFI**: Foreign Function Interface definitions from our wrapper to our Rust Babushka-Core
* **Babushka impl**: public interface layer and thread manager
* **Tokio Worker**: Tokio worker threads (number of CPUs)
* **SocketListener**: listens for work from the Socket, and handles commands
* **Unix Domain Socket**: Unix Domain Socket to handle incoming requests and response payloads between Rust-Core and Wrapper
* **Redis**: Our data store
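
The request loop in the sequence diagram above can be approximated with a connected socket pair and a listener thread. This Python sketch is a simplification under stated assumptions: `socket.socketpair()` stands in for the wrapper connecting to the socket path, a 4-byte length prefix stands in for the protobuf framing, the `handle` callable stands in for sending the request to Redis, and all function names are hypothetical.

```python
import socket
import struct
import threading
from typing import Callable, List


def send_msg(sock: socket.socket, payload: bytes) -> None:
    # 4-byte big-endian length prefix (a stand-in for the real framing).
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed")
        buf += chunk
    return bytes(buf)


def recv_msg(sock: socket.socket) -> bytes:
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def socket_listener(sock: socket.socket,
                    handle: Callable[[bytes], bytes]) -> None:
    """Core side: read requests until the peer closes; write one result each."""
    with sock:
        while True:
            try:
                request = recv_msg(sock)
            except ConnectionError:
                return  # wrapper closed its end: shut down
            send_msg(sock, handle(request))


def roundtrip(requests: List[bytes],
              handle: Callable[[bytes], bytes]) -> List[bytes]:
    """Wrapper side: write each request, then block reading its response."""
    wrapper_end, core_end = socket.socketpair()  # AF_UNIX pair on POSIX
    worker = threading.Thread(target=socket_listener, args=(core_end, handle))
    worker.start()
    try:
        results = []
        for request in requests:
            send_msg(wrapper_end, request)
            results.append(recv_msg(wrapper_end))
        return results
    finally:
        wrapper_end.close()
        worker.join()
```

The blocking `recv` calls on both sides are where the I/O wait described in the summary would show up for large payloads.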

## Wrapper-to-Core Connector with raw-FFI calls

**Summary**: Foreign Function Interface (FFI) calls are a simple way to make cross-language calls. The setup between Golang and the Rust-core
is fairly simple using the well-supported cgo tooling. While making cross-language calls is easy, making them asynchronous
requires that we handle async callbacks. Golang has a simple, lightweight solution for that: goroutines and channels,
which pass callbacks and execution between the languages.

```mermaid
stateDiagram-v2
direction LR

Wrapper: Golang Wrapper
FFI: Foreign Function Interface
RustCore: Rust-Core

[*] --> Wrapper: User
Wrapper --> FFI
FFI --> Wrapper
RustCore --> FFI
FFI --> RustCore
RustCore --> Redis
```

## Decision to use Raw-FFI calls directly to Rust-Core for Golang Wrapper

Move go to another doc too?

Author

I'd rather hold off on splitting until we have an idea of where these documents will be stored.


### Decision Log

The decision to use raw FFI requests from Golang to Rust-core was straightforward:
1. Golang provides goroutines, a lightweight and performant concurrency mechanism that is an obvious way to pass requests, even at scale.

Because goroutines give us lightweight thread management, we chose a solution that scales quickly and requires less configuration to achieve performance
on par with existing industry standards ([go-redis](https://github.com/redis/go-redis)).

| Protocol | Details | Pros | Cons |
|--------------------------|---------|--------------------------------------------------------|--------------------------------------|
| Unix Domain Sockets | | UDS performance; consistent protocol between languages | complex configuration |
| Raw-FFI (CGO/goroutines) | | simplified and light-weight interface | separate management for each request |

## Sequence Diagram - Raw-FFI Client

**Summary**: If we make direct calls through FFI from our Wrapper to Rust, we can initiate commands to Redis. This allows us
to make on-demand calls directly to the Rust-core. Since the calls are async, we need to manage and populate a callback
object with the response and a payload.

We will need to avoid busy waits while waiting on the async response. The wrapper and the Rust-core track threads
independently. The Rust side uses a Tokio runtime to manage threads. When the Rust-core completes a request and returns a Response,
we can use the callback object to reawaken the wrapper's thread manager and continue work.

Go has a performant solution for this using lightweight goroutines and channels. Instead of busy-waiting, the waiting goroutine blocks on
the result channel and is awakened once the Tokio threads invoke the callback, which pushes the response onto that channel.
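
The wait-on-a-channel pattern can be sketched as follows. This is a Python analogy rather than the Go implementation: a `queue.Queue` stands in for the result channel, a plain thread stands in for a Tokio worker, and `core_execute` is a hypothetical stand-in for the async FFI call that returns immediately and later invokes the callback.

```python
import queue
import threading
from typing import Callable, List


def core_execute(args: List[str], callback: Callable[[str], None]) -> None:
    """Stand-in for the async FFI call into the core.

    Returns immediately; a worker thread invokes the callback later.
    """
    def work() -> None:
        # Pretend Redis replied; a real core would send the command here.
        callback("echo:" + " ".join(args))

    threading.Thread(target=work).start()


def single_command(args: List[str], timeout: float = 5.0) -> str:
    """Wrapper side: create a result channel, submit, then block on it."""
    result_channel: "queue.Queue[str]" = queue.Queue(maxsize=1)
    core_execute(args, result_channel.put)
    # Blocks without spinning; wakes when the callback pushes the response.
    return result_channel.get(timeout=timeout)
```

Each request gets its own channel, which matches the "separate management for each request" trade-off listed in the decision log.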

### Sequence Diagram


```mermaid
sequenceDiagram

participant Wrapper as Client-Wrapper
participant channel as Result Channel
participant ffi as Babushka FFI
participant manager as Babushka impl
participant worker as Tokio Worker
participant Client as Redis

activate Wrapper
activate Client
Wrapper -)+ ffi: create_connection(connection_settings)
ffi ->>+ manager: start_thread_manager(init_callback)
manager ->> worker: Create Tokio::Runtime (count: CPUs)
activate worker
manager -->> Wrapper: Ok(BabushkaClient)
worker ->> Client: BabushkaClient::new
worker ->> worker: wait_for_work(init_callback)

loop single_request
Wrapper ->> channel: make channel
activate channel
Wrapper -) ffi: command: single_command(protobuf.redis_request, &channel)
Wrapper ->> channel: wait
ffi ->> manager: cmd(protobuf.redis_request)
manager ->> worker: command: cmd(protobuf.redis_request)
worker ->> Client: send(command, args)
Client -->> worker: Result
worker -->> ffi: Ok(protobuf.response)
ffi -->> channel: Ok(protobuf.response)
channel ->> Wrapper: protobuf.response
Wrapper ->> channel: close
deactivate channel
end

Wrapper -) worker: close_connection
worker -->> Wrapper:
deactivate worker
deactivate Wrapper
deactivate Client
```

### Elements
* **Client-Wrapper**: Our Babushka wrapper that exposes a client API (Go, etc.)
* **Result Channel**: Goroutine channel on the Babushka Wrapper
* **Babushka FFI**: Foreign Function Interface definitions from our wrapper to our Rust Babushka-Core
* **Babushka impl**: public interface layer and thread manager
* **Tokio Worker**: Tokio worker threads (number of CPUs)
* **Redis**: Our data store