feat(meta-client): add cache crate for databend-meta service #17766

Merged
merged 1 commit into from
Apr 14, 2025
17 changes: 17 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 2 additions & 0 deletions Cargo.toml
@@ -92,6 +92,7 @@ members = [
"src/meta/app",
"src/meta/app-types",
"src/meta/binaries",
"src/meta/cache",
"src/meta/client",
"src/meta/control",
"src/meta/ee",
@@ -141,6 +142,7 @@ databend-common-management = { path = "src/query/management" }
databend-common-meta-api = { path = "src/meta/api" }
databend-common-meta-app = { path = "src/meta/app" }
databend-common-meta-app-types = { path = "src/meta/app-types" }
databend-common-meta-cache = { path = "src/meta/cache" }
databend-common-meta-client = { path = "src/meta/client" }
databend-common-meta-control = { path = "src/meta/control" }
databend-common-meta-embedded = { path = "src/meta/embedded" }
38 changes: 38 additions & 0 deletions src/meta/cache/Cargo.toml
@@ -0,0 +1,38 @@
[package]
name = "databend-common-meta-cache"
description = """
A distributed cache implementation that:
- Maintains a local view of data stored in the meta-service
- Automatically synchronizes with the meta-service via watch API
- Provides safe concurrent access with two-level locking
- Handles connection failures with automatic recovery
- Ensures data consistency through sequence number tracking
"""
version = { workspace = true }
authors = { workspace = true }
license = { workspace = true }
publish = { workspace = true }
edition = { workspace = true }

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[lib]
doctest = false
test = true

[dependencies]
databend-common-base = { workspace = true }
databend-common-meta-client = { workspace = true }
databend-common-meta-types = { workspace = true }
futures = { workspace = true }
log = { workspace = true }
thiserror = { workspace = true }
tokio = { workspace = true }
tonic = { workspace = true }

[dev-dependencies]
anyhow = { workspace = true }
pretty_assertions = { workspace = true }

[lints]
workspace = true
94 changes: 94 additions & 0 deletions src/meta/cache/README.md
@@ -0,0 +1,94 @@
# Databend Common Meta Cache

A distributed cache backed by meta-service. Each cache instance maintains a local view of the data stored under a key prefix and keeps that view synchronized with meta-service through the watch API.


## Features

- **Automatic Synchronization**: Background watcher task keeps local cache in sync with meta-service
- **Concurrency Control**: Two-level concurrency control mechanism for safe access
- **Event-based Updates**: Real-time updates through meta-service watch API
- **Safe Reconnection**: Automatic recovery from connection failures with state consistency

## Key Components

### Cache Structure

```text
<prefix>/foo
<prefix>/..
<prefix>/..
```

- `<prefix>`: User-defined string to identify a cache instance
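The key layout above can be illustrated with a minimal sketch. It assumes keys are plain strings joined to the prefix with `/` and held in an ordered map; `full_key` and `list_dir` are hypothetical helpers written for this illustration, not functions of this crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: builds `<prefix>/<name>` for an entry.
pub fn full_key(prefix: &str, name: &str) -> String {
    format!("{}/{}", prefix.trim_end_matches('/'), name)
}

/// Hypothetical helper: returns all entries whose key starts with `<dir>/`.
/// The ordered map lets us seek to the first candidate and stop at the
/// first key that no longer matches.
pub fn list_dir<'a>(data: &'a BTreeMap<String, String>, dir: &str) -> Vec<(&'a str, &'a str)> {
    let start = format!("{}/", dir.trim_end_matches('/'));
    data.range(start.clone()..)
        .take_while(|(k, _)| k.starts_with(&start))
        .map(|(k, v)| (k.as_str(), v.as_str()))
        .collect()
}
```

Appending `/` before matching keeps a cache with prefix `app` from picking up keys that belong to a different prefix such as `apple`.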

### Main Types

- `Cache`: The main entry point for cache operations
  - Provides safe access to cached data
- `CacheData`: Internal data structure holding the cached values
- `EventWatcher`: Background task that watches for changes in meta-service
  - Handles synchronization with meta-service

## Usage

```rust
let client = MetaGrpcClient::try_create(/*..*/)?;

// `mut` is needed because the access methods take `&mut self`.
let mut cache = Cache::new(
    client,
    "your/cache/key/space/in/meta/service",
    "your-app-name-for-logging",
).await;

// Access cached data
cache.try_access(|c: &CacheData| {
    println!("last-seq:{}", c.last_seq);
    println!("all data: {:?}", c.data);
}).await?;

// Get a specific value
let value = cache.try_get("key").await?;

// List all entries under a prefix
let entries = cache.try_list_dir("prefix").await?;
```

## Concurrency Control

The cache employs a two-level concurrency control mechanism:

1. **Internal Lock (Mutex)**: Protects concurrent access between user operations and the background cache updater. This lock is held briefly during each operation.

2. **External Lock (Method Design)**: Public methods require `&mut self` even for read-only operations. This prevents concurrent access to the cache instance from multiple call sites. External synchronization should be implemented by the caller if needed.

This design intentionally separates concerns:
- The internal lock handles short-term, fine-grained synchronization with the updater
- The external lock requirement (`&mut self`) enables longer-duration access patterns without blocking the background updater unnecessarily

Note that despite requiring `&mut self`, all operations are logically read-only with respect to the cache's public API.
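The two levels described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's implementation: `SketchCache`, `SketchData`, `updater_handle`, and `apply_update` are hypothetical names, and a blocking `std::sync::Mutex` stands in for whatever lock the crate uses.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the cached state.
#[derive(Debug, Default)]
pub struct SketchData {
    pub last_seq: u64,
    pub data: BTreeMap<String, String>,
}

/// Hypothetical stand-in for the cache handle.
pub struct SketchCache {
    // Internal lock: shared with the background updater, held briefly.
    inner: Arc<Mutex<SketchData>>,
}

impl SketchCache {
    pub fn new() -> Self {
        Self { inner: Arc::new(Mutex::new(SketchData::default())) }
    }

    /// Handle that a background updater would own.
    pub fn updater_handle(&self) -> Arc<Mutex<SketchData>> {
        Arc::clone(&self.inner)
    }

    /// External level: `&mut self` on a read-only method means the borrow
    /// checker admits only one caller at a time for this instance.
    pub fn get(&mut self, key: &str) -> Option<String> {
        // Internal level: lock only long enough to copy the value out.
        let guard = self.inner.lock().unwrap();
        guard.data.get(key).cloned()
    }
}

/// Simulates one update applied by the background task.
pub fn apply_update(handle: &Arc<Mutex<SketchData>>, seq: u64, key: &str, value: &str) {
    let mut guard = handle.lock().unwrap();
    guard.last_seq = seq;
    guard.data.insert(key.to_string(), value.to_string());
}
```

A caller that needs to share one instance between several tasks would wrap it in its own lock; the updater is unaffected because it only ever takes the internal mutex.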

## Initialization Process

When a `Cache` is created, it goes through the following steps:

1. Creates a new instance with specified prefix and context
2. Spawns a background task to watch for key-value changes
3. Establishes a watch stream to meta-service
4. Fetches and processes initial data
5. Waits for the cache to be fully initialized before returning
6. Keeps the cache synchronized in the background after creation returns

The initialization is complete only when the cache has received a full copy of the data from meta-service, ensuring users see a consistent view of the data.
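The "return only after the first full copy is loaded" behavior can be sketched as follows. This is a simplified, synchronous illustration using a thread and a channel; the crate itself is async, and `create_initialized` and `Store` are hypothetical names. The `snapshot` argument stands in for the initial data received from meta-service.

```rust
use std::collections::BTreeMap;
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical shared store type for this sketch.
pub type Store = Arc<Mutex<BTreeMap<String, String>>>;

/// Creates the store, spawns the watcher thread, and returns only after
/// the watcher has loaded the initial snapshot.
pub fn create_initialized(snapshot: Vec<(String, String)>) -> Store {
    // Step 1: create the (still empty) instance.
    let store: Store = Arc::new(Mutex::new(BTreeMap::new()));
    let (ready_tx, ready_rx) = mpsc::channel::<()>();

    // Step 2: spawn the background watcher.
    let watcher_store = Arc::clone(&store);
    thread::spawn(move || {
        // Steps 3-4: load the initial data (simulated by `snapshot`).
        {
            let mut guard = watcher_store.lock().unwrap();
            for (k, v) in snapshot {
                guard.insert(k, v);
            }
        }
        // Tell the creator that the full copy is in place.
        let _ = ready_tx.send(());
        // Step 6: a real watcher would keep applying change events here.
    });

    // Step 5: block until the watcher reports that initialization is done.
    ready_rx
        .recv()
        .expect("watcher exited before initialization finished");
    store
}
```

Because the creator waits on the ready signal, the first read after creation cannot observe a partially loaded cache.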

## Error Handling

The cache implements robust error handling:

- Connection failures are automatically retried in the background
- Background watcher task automatically recovers from errors
- Users are shielded from transient errors through the abstraction
- The cache ensures data consistency by tracking sequence numbers
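One way sequence-number tracking can keep a cache consistent across reconnects is to ignore any event that is not newer than the last one applied. The sketch below assumes every change carries a monotonically increasing sequence number; `SeqTracked` and `Change` are hypothetical types, and the crate's actual rules may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical change event for this sketch.
pub enum Change {
    Put { key: String, value: String },
    Delete { key: String },
}

/// Hypothetical cache state that remembers the last applied sequence number.
#[derive(Debug, Default)]
pub struct SeqTracked {
    pub last_seq: u64,
    pub data: BTreeMap<String, String>,
}

impl SeqTracked {
    /// Applies `change` only if `seq` is newer than anything seen so far.
    /// Returns `true` when the change was applied, `false` when it was
    /// skipped as stale or duplicated (for example, replayed after a reconnect).
    pub fn apply(&mut self, seq: u64, change: Change) -> bool {
        if seq <= self.last_seq {
            return false;
        }
        match change {
            Change::Put { key, value } => {
                self.data.insert(key, value);
            }
            Change::Delete { key } => {
                self.data.remove(&key);
            }
        }
        self.last_seq = seq;
        true
    }
}
```

With this rule, replaying an overlapping range of events after the watch stream is re-established leaves the cache unchanged, so a retry in the background does not roll data back.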

## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.