Add import log classes and utils #2591

inv-jishnu · 2025-04-11T03:48:45Z

Description

In this PR, I have added the classes and utility files that write logs for the import process. The generated log files are summary logs, success logs, and failure logs.

Related issues and/or PRs

Please review this PR once the following PR has been reviewed and merged, and master has then been merged into this branch.

Add the import process implementation for data loader #2462

Changes made

Added log classes and utils for the import module. There are two main log implementations: single log mode and split log mode. Single log mode generates one summary, one failure, and one success log file for the whole import, while split log mode generates a summary, failure, and success log for each data chunk.

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

I have commented my code, particularly in hard-to-understand areas.
I have updated the documentation to reflect the changes.
Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
Tests (unit, integration, etc.) have been added for the changes.
My changes generate no new warnings.
Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

Road map for merging the remaining data loader core files. Current status:

General
- ScalarDB Dao and related files: Add ScalarDB Dao and related files #2417
- TableMetadataService(partially replaced by ConsensusUtils): Add table metadata service #2434
Export
- Export options and validations: Add export options validator #2435
- ProducerTasks: 1 PR incoming
Import
- Dto classes and utilities: Add data chunk and task result enums and dtos #2442
- Import processor and task code: 2-3 PRs incoming
  - Add dtos and other classes for task #2446
  - Add the import process implementation for data loader #2462
- Code for Import transaction batch and data chunk import: 1 PR Incoming
- ControlFile related Dtos: Add Control file module files and validation #2445
- Import logger: Add import log classes and utils #2591

Release notes

NA

komamitsu · 2025-04-17T02:14:43Z

...er/core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/AbstractImportLogger.java

+              .rowNumber(taskResult.getRowNumber());
+
+      // Only add the raw record if the configuration is set to log raw source data
+      if (config.isLogRawSourceRecords()) {


config.isLogRawSourceRecords()

Would config.logRawSourceRecordsEnabled(), config.isLogRawSourceRecordsEnabled(), or config.shouldLogRawSourceRecords() be a bit better?

@inv-jishnu How about this?

@brfrn169 san,
I have renamed it again in 0f34395, from isLogRawSourceRecords to isLogRawSourceRecordsEnabled. Sorry for the confusion.
Thank you.

komamitsu · 2025-04-17T02:31:21Z

.../core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/SingleFileImportLogger.java

+  private LogWriter successLogWriter;
+  private LogWriter failureLogWriter;


Looks like these 2 log writers can be final.

komamitsu · 2025-04-17T02:33:44Z

.../core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/SingleFileImportLogger.java

+   * @throws IOException if an I/O error occurs while writing to the log
+   */
+  private void logDataChunkSummary(ImportDataChunkStatus dataChunkStatus) throws IOException {
+    if (summaryLogWriter == null) {


Is this method called by multiple threads? If so, this initialization for summaryLogWriter requires something like synchronized block.
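
A minimal sketch of the guard this comment asks about, assuming the summary writer is created lazily on first use. The class, field, and method names below are hypothetical stand-ins (a StringBuilder takes the place of the real LogWriter) and are not the PR's code.

```java
// Hypothetical stand-in for a logger that creates its summary writer lazily.
class LazySummaryLog {
  private final Object lock = new Object();
  private StringBuilder summaryWriter; // guarded by lock; null until first use
  private int createCount; // guarded by lock; counts writer creations

  // The null check and the creation happen under the same lock, so two
  // threads cannot both observe null and create two separate writers.
  void logSummary(String line) {
    synchronized (lock) {
      if (summaryWriter == null) {
        summaryWriter = new StringBuilder();
        createCount++;
      }
      summaryWriter.append(line).append('\n');
    }
  }

  int createCount() {
    synchronized (lock) {
      return createCount;
    }
  }

  String contents() {
    synchronized (lock) {
      return summaryWriter == null ? "" : summaryWriter.toString();
    }
  }
}
```

Without the lock, the check-then-create sequence is a race: two threads could each see null and each open a writer, and one of them would be lost.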

komamitsu · 2025-04-17T02:34:43Z

.../core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/SingleFileImportLogger.java

+   * been completed.
+   */
+  private void closeAllLogWriters() {
+    closeLogWriter(summaryLogWriter);


Null check isn't needed?

@komamitsu san,
The null check is done inside the called closeLogWriter method in the AbstractImportLogger class. Should I move it from there to here?

Oh, I see. It sounds good 👍 . (closeLogWriterIfNeeded or something might be clearer in this case, though.)
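
A rough sketch of what a null-tolerant close helper with the suggested name could look like. This is an assumption for illustration only: it works on java.io.Closeable rather than the PR's LogWriter type, and the boolean return value is added here just to make the behavior observable.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper; the real method lives in AbstractImportLogger and may differ.
class LogWriterCloser {
  // Closes the writer only if one was actually created.
  // Returns true when close() was called and completed without an I/O error.
  static boolean closeLogWriterIfNeeded(Closeable writer) {
    if (writer == null) {
      return false; // nothing was opened, so there is nothing to close
    }
    try {
      writer.close();
      return true;
    } catch (IOException e) {
      System.err.println("Failed to close log writer: " + e.getMessage());
      return false;
    }
  }
}
```

With the null check inside the helper, callers such as closeAllLogWriters() can pass each field unconditionally, and the "IfNeeded" suffix makes that contract visible at the call site.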

komamitsu · 2025-04-17T02:36:59Z

.../core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/SingleFileImportLogger.java

+      throws IOException {
+    String logFileName = batchResult.isSuccess() ? SUCCESS_LOG_FILE_NAME : FAILURE_LOG_FILE_NAME;
+    LogWriter logWriter = batchResult.isSuccess() ? successLogWriter : failureLogWriter;
+    if (logWriter == null) {


Both log writers are not null since they are created in the constructor, right?

komamitsu · 2025-04-17T02:47:27Z

...src/main/java/com/scalar/db/dataloader/core/dataimport/log/SplitByDataChunkImportLogger.java

+      case SUMMARY:
+        logWriterMap = summaryLogWriters;
+        break;
+    }


Can you add default: throw new AssertionError() just in case?
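
For illustration, a switch with the requested default branch might look like the sketch below. The enum constants mirror the three log kinds named in this PR, but the class, the map value type (String instead of LogWriter), and the field names are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the per-type writer lookup.
class LogWriterSelector {
  enum LogFileType { SUCCESS, FAILURE, SUMMARY }

  private final Map<Integer, String> successLogWriters = new HashMap<>();
  private final Map<Integer, String> failureLogWriters = new HashMap<>();
  private final Map<Integer, String> summaryLogWriters = new HashMap<>();

  Map<Integer, String> getLogWriters(LogFileType logFileType) {
    switch (logFileType) {
      case SUCCESS:
        return successLogWriters;
      case FAILURE:
        return failureLogWriters;
      case SUMMARY:
        return summaryLogWriters;
      default:
        // Unreachable today; fails loudly if a new enum constant is added
        // without updating this switch, instead of returning null.
        throw new AssertionError("Unexpected log file type: " + logFileType);
    }
  }
}
```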

komamitsu · 2025-04-17T02:49:53Z

...src/main/java/com/scalar/db/dataloader/core/dataimport/log/SplitByDataChunkImportLogger.java

+  private LogWriter initializeLogWriterIfNeeded(LogFileType logFileType, int dataChunkId)
+      throws IOException {
+    Map<Integer, LogWriter> logWriters = getLogWriters(logFileType);
+    if (!logWriters.containsKey(dataChunkId)) {


Is this method called by multiple threads? If so, it needs to be thread safe.
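
One possible way to make a containsKey-then-put initialization thread safe is ConcurrentHashMap.computeIfAbsent, sketched below. This is an assumption about a workable approach, not what the PR does: the class name is hypothetical and StringBuilder stands in for LogWriter.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a per-data-chunk writer registry.
class PerChunkWriters {
  private final ConcurrentMap<Integer, StringBuilder> writers = new ConcurrentHashMap<>();
  private final AtomicInteger created = new AtomicInteger();

  // computeIfAbsent on ConcurrentHashMap is atomic: the mapping function is
  // applied at most once per key, even when many threads ask for the same chunk.
  StringBuilder writerFor(int dataChunkId) {
    return writers.computeIfAbsent(
        dataChunkId,
        id -> {
          created.incrementAndGet();
          return new StringBuilder();
        });
  }

  int createdCount() {
    return created.get();
  }
}
```

One caveat: the mapping function cannot throw a checked IOException, so a real writer factory that does would need to wrap it, for example in UncheckedIOException.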

komamitsu · 2025-04-17T02:53:07Z

...src/main/java/com/scalar/db/dataloader/core/dataimport/log/SplitByDataChunkImportLogger.java

+    try {
+      writeImportTaskResultDetailToLogs(taskResult);
+    } catch (IOException e) {
+      LOGGER.error("Failed to write success/failure logs");


Suggested change

LOGGER.error("Failed to write success/failure logs");

LOGGER.error("Failed to write success/failure logs", e);

I'm not sure what the expected behavior is, but doesn't this need to throw an exception to tell the user about the problem? Is just writing an error log enough?
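
To make the two options concrete, here is a sketch that both logs the exception with its cause and rethrows it. It uses java.util.logging so that it is self-contained (the PR's LOGGER is presumably a different logging API), and the class and method names are hypothetical. Whether rethrowing is the right behavior for the import is exactly the open question above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the task-completion callback.
class TaskResultLogging {
  private static final Logger LOGGER = Logger.getLogger(TaskResultLogging.class.getName());

  // Simulated log write that can be made to fail.
  static void writeDetail(boolean fail) throws IOException {
    if (fail) {
      throw new IOException("disk full");
    }
  }

  static void onTaskComplete(boolean fail) {
    try {
      writeDetail(fail);
    } catch (IOException e) {
      // Passing the exception keeps the stack trace and cause in the log output.
      LOGGER.log(Level.SEVERE, "Failed to write success/failure logs", e);
      // Rethrowing surfaces the failure to the caller instead of hiding it.
      throw new UncheckedIOException("Failed to write success/failure logs", e);
    }
  }
}
```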

komamitsu · 2025-04-17T02:57:21Z

...re/src/main/java/com/scalar/db/dataloader/core/dataimport/log/writer/LocalFileLogWriter.java

+   * @throws IOException if an I/O error occurs while writing the record
+   */
+  @Override
+  public void write(JsonNode sourceRecord) throws IOException {


Suggested change

public void write(JsonNode sourceRecord) throws IOException {

public void write(@Nullable JsonNode sourceRecord) throws IOException {

komamitsu · 2025-04-17T03:01:18Z

...re/src/main/java/com/scalar/db/dataloader/core/dataimport/log/writer/LocalFileLogWriter.java

+    if (sourceRecord == null) {
+      return;
+    }
+    synchronized (logWriter) {


ObjectMapper is fully thread-safe. So, I don't think this synchronization is needed.

Copilot

Pull Request Overview

This PR adds new logging classes and utilities for the import module that generate summary, success, and failure log files, supporting both single and split log file modes.

Introduces implementations for log writing (LocalFileLogWriter, DefaultLogWriterFactory) and corresponding logger classes (SingleFileImportLogger, SplitByDataChunkImportLogger).
Updates and adds comprehensive tests for the new logging functionalities.

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.

File	Description
data-loader/core/src/test/java/com/scalar/db/dataloader/core/dataimport/log/writer/DefaultLogWriterFactoryTest.java	Test for verifying creation of LocalFileLogWriter.
data-loader/core/src/test/java/com/scalar/db/dataloader/core/dataimport/log/SplitByDataChunkImportLoggerTest.java	Tests for split log mode behavior and file creation.
data-loader/core/src/test/java/com/scalar/db/dataloader/core/dataimport/log/SingleFileImportLoggerTest.java	Tests for single file log mode functionality.
data-loader/core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/writer/*.java	New log writer interfaces and implementations.
data-loader/core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/*.java	New logger implementations and configuration for import logging.
data-loader/core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/ImportLoggerConfig.java	Added immutable configuration for import logging.
data-loader/core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/AbstractImportLogger.java	Base class for import loggers with common logging functionality.

Copilot · 2025-04-22T01:50:51Z

...er/core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/AbstractImportLogger.java

+              .dataChunkId(taskResult.getDataChunkId())
+              .rowNumber(taskResult.getRowNumber());


The ImportTaskResult builder is calling rowNumber(taskResult.getRowNumber()) twice; remove the duplicate call to avoid potential confusion and ensure clean builder initialization.

Suggested change

.dataChunkId(taskResult.getDataChunkId())

.rowNumber(taskResult.getRowNumber());

.dataChunkId(taskResult.getDataChunkId());

brfrn169

Left a few questions. PTAL!

brfrn169 · 2025-04-30T13:47:10Z

...er/core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/AbstractImportLogger.java

+              .rowNumber(taskResult.getRowNumber());
+
+      // Only add the raw record if the configuration is set to log raw source data
+      if (config.isLogRawSourceRecords()) {


@inv-jishnu How about this?

brfrn169 · 2025-04-30T13:53:15Z

.../core/src/main/java/com/scalar/db/dataloader/core/dataimport/log/SingleFileImportLogger.java

+      closeLogWriter(summaryLogWriter);
+      closeLogWriter(successLogWriter);
+      closeLogWriter(failureLogWriter);
+      summaryLogWriter = null;


Are successLogWriter and failureLogWriter also reinitialized after being closed?
It looks like they are only initialized once in the constructor, and I don’t see this class being reused after calling closeAllLogWriters().

inv-jishnu · 2025-05-02T12:31:04Z

@brfrn169 san,
I have made a few changes based on feedback and replied to your questions.
PTAL again when you get a chance.
Thank you.

brfrn169

LGTM! Thank you!

ypeckstadt

LGTM. Thank you.

feeblefakie

LGTM! Thank you!

Initial commit

be9ee23

inv-jishnu self-assigned this Apr 11, 2025

inv-jishnu added the enhancement New feature or request label Apr 11, 2025

inv-jishnu marked this pull request as draft April 11, 2025 03:48

inv-jishnu mentioned this pull request Apr 11, 2025

Add import log classes and utils #2463

Closed

6 tasks

inv-jishnu changed the title ~~Add import log module~~ Add import log classes and utils Apr 11, 2025

inv-jishnu and others added 3 commits April 11, 2025 09:27

Spotless applied again

89b9f05

Removed unused code

49c83b6

Merge branch 'master' into feat/data-loader/import-log-2

b2871fb

ypeckstadt marked this pull request as ready for review April 15, 2025 05:32

inv-jishnu and others added 4 commits April 15, 2025 11:45

Removed unused classes and references

c5c9c0a

Merge branch 'master' into feat/data-loader/import-log-2

4964e8d

Merge branch 'master' into feat/data-loader/import-log-2

ff81f5f

Improve Javadocs

3934c2a

ypeckstadt requested review from komamitsu, brfrn169, feeblefakie, Torch3333 and ypeckstadt April 16, 2025 08:00

komamitsu reviewed Apr 17, 2025

inv-jishnu added 3 commits April 21, 2025 17:19

Changes

9958f95

Renamed parameters

1afbc21

logging changes

8c5114d

feeblefakie requested a review from Copilot April 22, 2025 01:50

Copilot AI reviewed Apr 22, 2025

inv-jishnu and others added 4 commits April 22, 2025 08:51

removed repeated code

ffab395

Merge branch 'master' into feat/data-loader/import-log-2

79df1ed

Merge branch 'master' into feat/data-loader/import-log-2

cf31672

Added excetpion throw

6dd213e

inv-jishnu requested a review from komamitsu April 23, 2025 09:46

Merge branch 'master' into feat/data-loader/import-log-2

9a4bfa8

inv-jishnu mentioned this pull request Apr 30, 2025

Add util classes for data loader CLI #2616

Merged

6 tasks

brfrn169 reviewed Apr 30, 2025

inv-jishnu and others added 4 commits May 2, 2025 10:04

Removed null assignment

415805b

Merge branch 'master' into feat/data-loader/import-log-2

0584f5e

comment change

7c72397

renamed params to make it more clear

0f34395

inv-jishnu requested a review from brfrn169 May 2, 2025 12:30

brfrn169 approved these changes May 2, 2025

ypeckstadt approved these changes May 7, 2025

Merge branch 'master' into feat/data-loader/import-log-2

a59f317

feeblefakie approved these changes May 8, 2025

Merge branch 'master' into feat/data-loader/import-log-2

d807387

feeblefakie merged commit 7450430 into master May 8, 2025
51 checks passed

feeblefakie deleted the feat/data-loader/import-log-2 branch May 8, 2025 04:09

feeblefakie pushed a commit that referenced this pull request May 8, 2025

Add import log classes and utils (#2591)

039f285

feeblefakie mentioned this pull request May 8, 2025

Backport to branch(3.15) : Add import log classes and utils #2629

Merged

feeblefakie pushed a commit that referenced this pull request May 8, 2025

Add import log classes and utils (#2591)

b8d46c7

feeblefakie mentioned this pull request May 8, 2025

Backport to branch(3.14) : Add import log classes and utils #2630

Merged

feeblefakie pushed a commit that referenced this pull request May 8, 2025

Add import log classes and utils (#2591)

c3f9978

feeblefakie mentioned this pull request May 8, 2025

Backport to branch(3.13) : Add import log classes and utils #2631

Merged

feeblefakie pushed a commit that referenced this pull request May 8, 2025

Add import log classes and utils (#2591)

4a7fc5a

feeblefakie mentioned this pull request May 8, 2025

Backport to branch(3.12) : Add import log classes and utils #2632

Merged

feeblefakie pushed a commit that referenced this pull request May 8, 2025

Add import log classes and utils (#2591)

262654c

feeblefakie mentioned this pull request May 8, 2025

Backport to branch(3) : Add import log classes and utils #2633

Merged

feeblefakie pushed a commit that referenced this pull request May 8, 2025

Add import log classes and utils (#2591)

40403e0

feeblefakie mentioned this pull request May 8, 2025

Backport to branch(3.11) : Add import log classes and utils #2634

Closed

feeblefakie pushed a commit that referenced this pull request May 8, 2025

Add import log classes and utils (#2591)

6334ace

feeblefakie mentioned this pull request May 8, 2025

Backport to branch(3.10) : Add import log classes and utils #2635

Closed
