Documenting memory usage/management by BatchLogProcessor #2469
Changes from all commits
6c84650
1b5f52b
76973f4
67a1b32
ba36c8f
403a412
1282bfc
24797bb
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -263,7 +263,94 @@ type LogsData = Box<(LogRecord, InstrumentationScope)>; | |
/// let provider = LoggerProvider::builder() | ||
/// .with_log_processor(processor) | ||
/// .build(); | ||
/// ``` | ||
/// | ||
/// **Memory Management in BatchLogProcessor** | ||
/// | ||
/// The `BatchLogProcessor` manages memory through the following stages of log processing: | ||
/// | ||
/// 1. **Record Ingestion**: | ||
/// - Each `LogRecord` is **cloned** upon entering the processor. | ||
/// - `LogRecordAttributes` utilize a hybrid memory model: | ||
/// - First 5 attributes are **stack-allocated**. | ||
/// - Adding additional attributes triggers **heap allocation** in a dynamically growing vector. | ||
/// - The `LogRecord` and its associated `InstrumentationScope` are **boxed together** | ||
/// to allocate them on the heap before entering the queue. This means: | ||
/// - The `LogRecord`'s inline attributes (if any) are moved to the heap as part of the boxed structure. | ||
/// - Any dynamically allocated data already on the heap (e.g., strings, overflow attributes) remains unaffected. | ||
/// - Ownership of the boxed data is transferred to the queue, ensuring it can be processed independently of the original objects. | ||
Not sure I follow this. The data inside the box is already a clone, so it is already independent of the originals, right?
||
/// | ||
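The clone-then-box step described in stage 1 can be sketched as follows. `Record` and `Scope` are hypothetical stand-ins for the SDK's `LogRecord` and `InstrumentationScope` (the real types have more fields), and the fixed array of five inline attributes is an assumption that mirrors the description above rather than the actual layout.

```rust
// Hypothetical stand-ins for the SDK types; the real `LogRecord` and
// `InstrumentationScope` have more fields and a different layout.
#[derive(Clone, Debug)]
struct Record {
    inline_attrs: [u64; 5], // stored inline, inside the struct itself
    body: String,           // the string bytes already live on the heap
}

#[derive(Clone, Debug)]
struct Scope {
    name: String,
}

/// Clones the record and scope, then boxes the pair so the queue only
/// has to hold a single pointer per entry.
fn box_pair(record: &Record, scope: &Scope) -> Box<(Record, Scope)> {
    Box::new((record.clone(), scope.clone()))
}

/// Returns true if moving the pair into a `Box` left the string's heap
/// buffer where it was (only the inline part of the struct is relocated).
fn string_buffer_unmoved_by_boxing(record: Record, scope: Scope) -> bool {
    let before = record.body.as_ptr();
    let boxed = Box::new((record, scope));
    before == boxed.0.body.as_ptr()
}

fn main() {
    let record = Record {
        inline_attrs: [1, 2, 3, 4, 5],
        body: "hello".to_string(),
    };
    let scope = Scope { name: "my-crate".to_string() };

    let boxed = box_pair(&record, &scope);
    println!("scope: {}, attrs: {:?}", boxed.1.name, boxed.0.inline_attrs);

    // The clone owns its own string buffer, so it is independent of the original.
    println!(
        "clone has its own buffer: {}",
        record.body.as_ptr() != boxed.0.body.as_ptr()
    );

    // A queue slot is one pointer wide, whatever the size of the pair.
    println!(
        "slot: {} bytes, pair: {} bytes",
        std::mem::size_of::<Box<(Record, Scope)>>(),
        std::mem::size_of::<(Record, Scope)>()
    );

    println!(
        "string buffer unmoved by boxing: {}",
        string_buffer_unmoved_by_boxing(record, scope)
    );
}
```

In this sketch the independence from the caller's data comes from the `clone()`, not from the `Box`; boxing only changes where the inline part of the pair is stored.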
/// 2. **Queue Management**: | ||
/// - Uses **two bounded synchronous channels** (`sync_channel`): | ||
/// - One for **log records** (`logs_sender` and `logs_receiver`). | ||
/// - Another for **control messages** (`message_sender` and `message_receiver`). | ||
/// | ||
/// - **Log Record Queue**: | ||
/// - Stores log records as **heap-allocated** `Box<(LogRecord, InstrumentationScope)>`. | ||
/// - The queue size is configurable and defined by `max_queue_size`. | ||
/// - If the queue is full: | ||
/// - New log records are **dropped**, and a warning is logged the first time this happens. | ||
/// - Dropped records are counted for reporting during shutdown. | ||
/// | ||
/// - **Control Message Queue**: | ||
/// - Stores control messages (`BatchMessage`) to manage operations like exporting, force flushing, setting resources, and shutting down. | ||
/// - The control message queue has a fixed size (e.g., 64 messages). | ||
/// - Control messages are processed with higher priority, ensuring operational commands are handled promptly. | ||
Is there anything we do to process them with higher priority? This does not match the implementation.
||
/// - The use of a separate control queue ensures that critical commands, such as `Shutdown`, are not lost even if the log record queue is full. | ||
/// - Messages supported include: | ||
/// - `ExportLog`: Triggers an immediate export of log records. | ||
/// - `ForceFlush`: Flushes all buffered log records to the exporter. | ||
/// - `SetResource`: Updates the exporter with a new resource. | ||
This is misleading. There is no ability to update a resource. This could be misinterpreted to mean that the Resource can be changed and the processor will be informed of the change, which is not true.
||
/// - `Shutdown`: Cleans up and flushes logs before terminating the processor. | ||
/// | ||
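The drop-on-full behavior of the log record queue can be illustrated with the standard library's bounded channel. This is a minimal sketch, not the processor's actual code: `BoundedQueue`, `emit`, and `dropped_count` are hypothetical names, and `LogsData` here is a simplified alias over two strings instead of the real record and scope types.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

// Simplified stand-in for `Box<(LogRecord, InstrumentationScope)>`.
type LogsData = Box<(String, String)>;

// Hypothetical wrapper around the bounded log record channel.
struct BoundedQueue {
    sender: SyncSender<LogsData>,
    dropped: AtomicUsize,
}

impl BoundedQueue {
    fn new(max_queue_size: usize) -> (Self, Receiver<LogsData>) {
        let (sender, receiver) = sync_channel(max_queue_size);
        (Self { sender, dropped: AtomicUsize::new(0) }, receiver)
    }

    /// Tries to enqueue without blocking. Returns `true` if the record was
    /// queued and `false` if it was dropped.
    fn emit(&self, record: String, scope: String) -> bool {
        match self.sender.try_send(Box::new((record, scope))) {
            Ok(()) => true,
            Err(TrySendError::Full(_)) => {
                // `fetch_add` returns the previous value, so 0 marks the first drop.
                if self.dropped.fetch_add(1, Ordering::Relaxed) == 0 {
                    eprintln!("warning: log queue is full, dropping records");
                }
                false
            }
            // The receiving side is gone; nothing is counted in this sketch.
            Err(TrySendError::Disconnected(_)) => false,
        }
    }

    /// Number of records dropped so far because the queue was full.
    fn dropped_count(&self) -> usize {
        self.dropped.load(Ordering::Relaxed)
    }
}

fn main() {
    // Keep the receiver alive so the channel stays connected.
    let (queue, _receiver) = BoundedQueue::new(2);
    for i in 0..3 {
        let queued = queue.emit(format!("record {}", i), "scope".to_string());
        println!("record {} queued: {}", i, queued);
    }
    println!("dropped: {}", queue.dropped_count());
}
```

Because `try_send` never blocks, the emitting thread pays at most the cost of one allocation and one failed send when the queue is saturated.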
/// 3. **Worker Thread Storage**: | ||
/// - The worker thread maintains a pre-allocated `Vec` of boxed record pairs: | ||
/// - The vector’s capacity is fixed at `max_export_batch_size`. | ||
/// - Records are **moved** (not cloned) from the log record queue to the vector for processing. | ||
/// | ||
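A worker-side batch buffer along the lines of stage 3 might look like the sketch below. `fill_batch` is a hypothetical helper and `LogsData` is a simplified alias; the point is only that each `Box` is moved out of the channel and into a `Vec` whose allocation is reused between batches.

```rust
use std::sync::mpsc::{sync_channel, Receiver};

// Simplified stand-in for `Box<(LogRecord, InstrumentationScope)>`.
type LogsData = Box<(String, String)>;

/// Moves items from the queue into `batch` until the batch holds
/// `max_export_batch_size` items or the queue is empty.
/// Returns the number of items moved.
fn fill_batch(
    receiver: &Receiver<LogsData>,
    batch: &mut Vec<LogsData>,
    max_export_batch_size: usize,
) -> usize {
    let before = batch.len();
    while batch.len() < max_export_batch_size {
        match receiver.try_recv() {
            // Pushing moves the Box (a pointer); the record is not cloned.
            Ok(item) => batch.push(item),
            // Queue is empty or the senders are gone.
            Err(_) => break,
        }
    }
    batch.len() - before
}

fn main() {
    let max_export_batch_size = 3;
    let (sender, receiver) = sync_channel::<LogsData>(8);
    for i in 0..5 {
        sender
            .try_send(Box::new((format!("record {}", i), "scope".to_string())))
            .expect("queue has room");
    }

    // Allocated once, sized for a full batch.
    let mut batch: Vec<LogsData> = Vec::with_capacity(max_export_batch_size);
    let capacity = batch.capacity();

    let moved = fill_batch(&receiver, &mut batch, max_export_batch_size);
    println!("first batch: {} records", moved);

    // `clear` drops the boxed records but keeps the vector's allocation.
    batch.clear();

    let moved = fill_batch(&receiver, &mut batch, max_export_batch_size);
    println!("second batch: {} records", moved);
    println!("capacity unchanged: {}", batch.capacity() == capacity);
}
```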
/// 4. **Export Process**: | ||
This looks like too much detail for end users to care about!
||
/// - During the export process: | ||
/// - The worker thread retrieves records from the log record queue until `max_export_batch_size` is reached or the queue is empty. | ||
/// - The retrieved records are processed in batches and passed to the exporter. | ||
/// - The exporter's `export()` method receives references to the log records and `InstrumentationScope`. | ||
/// - If the exporter requires retaining the log records (e.g., for retries or asynchronous operations), it must **clone** the records inside the `export()` implementation. | ||
/// - After successful export: | ||
/// - The original boxed records are dropped, releasing heap memory. | ||
/// - Export is triggered in the following scenarios: | ||
/// - When the batch size reaches `max_export_batch_size`, resulting in an `ExportLog` control message being sent to the worker thread. | ||
/// - When the scheduled delay timer expires. | ||
/// - When `force_flush` is called by the application, resulting in a `ForceFlush` control message being sent to the worker thread. | ||
/// - During processor shutdown initiated by the application, resulting in a `Shutdown` control message being sent to the worker thread. | ||
/// - Generation of `ExportLog` control message: | ||
/// - The `ExportLog` control message is generated by the application thread when a new record is added to the log record queue, and the current batch size reaches `max_export_batch_size`. | ||
/// - To prevent redundant messages, the `ExportLog` message is only sent if the previous one has been processed by the worker thread. | ||
/// - Upon receiving this message, the worker thread immediately processes and exports the current batch, overriding any scheduled delay. | ||
/// | ||
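One way to suppress redundant `ExportLog` messages, as described in stage 4, is an atomic "pending" flag shared between the emitting threads and the worker. The sketch below is an assumption about how such a guard could be built, not a description of the SDK's internals; `ExportSignal`, `request_export`, and `mark_processed` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{sync_channel, SyncSender};

#[derive(Debug, PartialEq)]
enum ControlMessage {
    ExportLog,
}

// Hypothetical guard that allows at most one outstanding ExportLog message.
struct ExportSignal {
    control: SyncSender<ControlMessage>,
    pending: AtomicBool,
}

impl ExportSignal {
    /// Called by an emitting thread once the batch threshold is reached.
    /// Returns `true` only if a new ExportLog message was actually sent.
    fn request_export(&self) -> bool {
        // `swap` returns the previous value: `true` means a message is
        // already waiting for the worker, so nothing more is sent.
        if self.pending.swap(true, Ordering::AcqRel) {
            return false;
        }
        if self.control.try_send(ControlMessage::ExportLog).is_err() {
            // Control queue full or closed: clear the flag so a later call can retry.
            self.pending.store(false, Ordering::Release);
            return false;
        }
        true
    }

    /// Called by the worker after it has handled an ExportLog message.
    fn mark_processed(&self) {
        self.pending.store(false, Ordering::Release);
    }
}

fn main() {
    // 64 matches the control queue size mentioned in the doc comment.
    let (control, control_rx) = sync_channel(64);
    let signal = ExportSignal { control, pending: AtomicBool::new(false) };

    println!("first request sent: {}", signal.request_export());
    println!("second request sent: {}", signal.request_export());

    // The worker picks up the message, exports, then clears the flag.
    let message = control_rx.recv().expect("one message is queued");
    println!("worker received: {:?}", message);
    signal.mark_processed();

    println!("request after processing sent: {}", signal.request_export());
}
```

With a guard of this kind the control queue holds at most one `ExportLog` at a time, however many records arrive while the worker is busy.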
/// 5. **Memory Limits**: | ||
/// - **Worst-Case Memory Usage**: | ||
/// - **Log Record Queue Memory** = `max_queue_size * size of boxed (LogRecord + InstrumentationScope)`. | ||
/// - **Batch Memory** = `max_export_batch_size * size of boxed (LogRecord + InstrumentationScope)`. | ||
/// - **Control Message Queue Memory**: | ||
/// - Fixed at 64 messages, with negligible memory overhead. | ||
/// - **Total Maximum Memory:** | ||
/// - When both the log record queue and batch vector are full: | ||
/// ``` | ||
/// (max_queue_size + max_export_batch_size) * size of boxed (LogRecord + InstrumentationScope) | ||
/// ``` | ||
/// - The average size of a `LogRecord` is ~300 bytes (assuming 4 attributes), and the `InstrumentationScope` is ~50 bytes (assuming no attributes). | ||
/// - For `max_queue_size = 2048` and `max_export_batch_size = 512`, the total memory usage is ~900 KB: | ||
/// Calculation: `(2048 + 512) * (300 + 50) = 2560 * 350 = 896000 bytes = 896 KB`. | ||
/// | ||
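The worst-case estimate from stage 5 can be reproduced with a few lines of arithmetic. The per-item sizes (300 and 50 bytes) are the estimates quoted in the doc comment, not measured values, and `worst_case_bytes` is a hypothetical helper. The figure leaves out data the records point to on the heap (strings, overflow attributes).

```rust
/// Upper bound on bytes held by the log record queue plus the batch vector,
/// given per-item size estimates for the record and its scope.
fn worst_case_bytes(
    max_queue_size: usize,
    max_export_batch_size: usize,
    record_bytes: usize,
    scope_bytes: usize,
) -> usize {
    (max_queue_size + max_export_batch_size) * (record_bytes + scope_bytes)
}

fn main() {
    // Queue and batch sizes from the example; byte sizes are the doc's estimates.
    let total = worst_case_bytes(2048, 512, 300, 50);
    println!("{} bytes (~{} KB)", total, total / 1000); // prints "896000 bytes (~896 KB)"
}
```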
/// 6. **Key Notes on Memory Behavior**: | ||
/// - Boxing a `LogRecord` and `InstrumentationScope` moves the record to the heap, | ||
/// including stack-allocated attributes. | ||
/// - During the export process, records are moved from the log record queue to the worker thread’s vector. | ||
/// - No additional cloning or copying occurs during the export process, minimizing memory overhead while ensuring efficient handling of log records. | ||
/// | ||
/// 7. **Control Queue Prioritization**: | ||
/// - Control messages take precedence over log record processing to ensure timely execution of critical operations. | ||
This does not look like how BatchProcessor works.
||
/// - For instance, a `Shutdown` message is processed before continuing with log exports, guaranteeing graceful cleanup. | ||
/// - The use of a separate control queue ensures responsiveness to operational commands without the risk of losing critical messages, even if the log record queue is full. | ||
pub struct BatchLogProcessor { | ||
logs_sender: SyncSender<LogsData>, // Data channel to store log records and instrumentation scopes | ||
message_sender: SyncSender<BatchMessage>, // Control channel to store control messages for the worker thread | ||
|
How attributes are stored inside LogRecord is not a concern of BatchLogProcessor. It can be documented (and should be!), but not in processor doc.