[Feature] support binlog<row> read and write (row type) (2/3) #63110

Open
Userwhite wants to merge 5 commits into apache:master from Userwhite:feature/binlog-write-port
@Userwhite
Contributor

What problem does this PR solve?

Issue Number: close #61956

This PR supports the write and read paths of the row binlog.

Write Path

  1. TabletsChannel splits the request based on OlapTableSchemaParam and concurrently submits Data and Binlog tasks to the executor via GroupFlushContext.
  2. At the storage layer, data is flushed by SegmentWriter and RowBinlogSegmentWriter respectively. RowBinlogSegmentWriter fills in partial column updates and "Before" data via Historical RetrieveContext.
  3. During the Commit/Publish stages, the delete_bitmap is calculated synchronously.
  4. In the Publish stage, the delete_bitmap inside the current Rowset is copied to binlog_delvec.
  5. Finally, TxnManager uniformly persists the Rowset Meta.
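
The concurrent submission in step 1 can be pictured with a minimal Python sketch. It is not the PR's C++ code: `flush_data`, `flush_binlog`, and `group_flush` are hypothetical stand-ins for SegmentWriter, RowBinlogSegmentWriter, and GroupFlushContext, and the "history" lookup only illustrates how a "Before" value might be attached.

```python
from concurrent.futures import ThreadPoolExecutor


def flush_data(rows):
    """Hypothetical data flush: returns copies of the rows it 'persisted'."""
    return [dict(r) for r in rows]


def flush_binlog(rows, history):
    """Hypothetical binlog flush: pairs each row with its previous value.

    `history` maps key k1 -> previously stored row; a missing key means
    the row is new, so its 'before' image is None.
    """
    return [{"after": dict(r), "before": history.get(r["k1"])} for r in rows]


def group_flush(rows, history):
    """Submit the data task and the binlog task together and wait for both."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        data_future = pool.submit(flush_data, rows)
        binlog_future = pool.submit(flush_binlog, rows, history)
        return data_future.result(), binlog_future.result()


history = {1: {"k1": 1, "v1": 10}}
data, binlog = group_flush([{"k1": 1, "v1": 11}, {"k1": 2, "v1": 20}], history)
```

Both tasks read the same in-memory rows, which is the property the shared memtable provides in the description above.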

Write Implementation Details

  • Concurrent Flush Model:
    Allows GroupFlushContext to submit both DATA_IN_GROUP and BINLOG_IN_GROUP tasks concurrently for the same SharedMemtable.
  • Reuse delete bitmap:
    Segment ID allocations for both writers are strictly synchronized during the flush phase,
    so binlog_delvec can reuse the tablet delete_bitmap (sequence update, partial update conflict).
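
To illustrate why aligned segment IDs make the bitmap reusable, here is a small Python sketch under stated assumptions: the delete bitmap is modeled as a dict keyed by `(rowset_id, segment_id)`, and `copy_rowset_delete_bitmap` is a hypothetical helper, not a function from the PR. Because segment N of the data rowset and segment N of the binlog are assumed to hold the same rows in the same order, a deleted row ID applies to both.

```python
def copy_rowset_delete_bitmap(delete_bitmap, rowset_id):
    """Copy the delete-bitmap entries of one rowset into a new binlog_delvec.

    delete_bitmap: dict mapping (rowset_id, segment_id) -> set of deleted row ids.
    Returns a dict mapping segment_id -> set of deleted row ids (independent copies).
    """
    return {
        segment_id: set(row_ids)
        for (rs_id, segment_id), row_ids in delete_bitmap.items()
        if rs_id == rowset_id
    }


tablet_delete_bitmap = {
    ("rs_1", 0): {3},
    ("rs_2", 0): {0, 5},
    ("rs_2", 1): {2},
}
binlog_delvec = copy_rowset_delete_bitmap(tablet_delete_bitmap, "rs_2")
```

Only the entries of the current rowset are copied, matching step 4 of the write path; entries of other rowsets stay in the tablet bitmap.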

Read Path

  1. A query is initiated via TableBinlogFunction, and the READER_BINLOG flag is pushed down along the OlapScanNode to the underlying SegmentIterator.
  2. After iterating through the physical data, the 64-bit Version/LSN and Commit TSO are injected into the result set in real time via _update_lsn_col_if_needed and _update_tso_col_if_needed.
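
The injection step can be sketched in Python as follows. The layout used here (version in the high bits, a 64-bit row ID in the low bits) is an assumption inferred from the test query in this description, and `make_lsn` and `inject_binlog_columns` are hypothetical names, not the PR's functions.

```python
LSN_ROW_BITS = 64  # assumed: low 64 bits hold the row id, higher bits the version


def make_lsn(version, row_id):
    """Pack a version and a row id into a single integer LSN."""
    if not 0 <= row_id < (1 << LSN_ROW_BITS):
        raise ValueError("row_id must fit in 64 bits")
    return (version << LSN_ROW_BITS) | row_id


def inject_binlog_columns(rows, version, commit_tso, first_row_id=0):
    """Return new rows with 'lsn' and 'tso' fields appended to each input row."""
    return [
        {**row, "lsn": make_lsn(version, first_row_id + i), "tso": commit_tso}
        for i, row in enumerate(rows)
    ]


batch = inject_binlog_columns([{"k1": 1}, {"k1": 2}], version=3, commit_tso=1700000000)
```

Every row of one rowset shares the same version and commit TSO, so only the row ID part of the LSN varies within a batch.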

How to run a simple test

    CREATE TABLE test_row_binlog_simple (
        k1 INT,
        v1 INT,
        v2 STRING
    )
    UNIQUE KEY(k1)
    DISTRIBUTED BY HASH(k1) BUCKETS 1
    PROPERTIES (
        "replication_num" = "1",
        "enable_unique_key_merge_on_write" = "true",
        "binlog.enable" = "true",
        "binlog.format" = "ROW",
        "binlog.need_historical_value" = "true"
    );

    INSERT INTO test_row_binlog_simple VALUES
        (1, 10, '10'),
        (2, 20, '20');

    UPDATE test_row_binlog_simple
    SET v1 = 11, v2 = '11'
    WHERE k1 = 1;

    SELECT
        __DORIS_BINLOG_LSN__ DIV 18446744073709551616 AS version,
        __DORIS_BINLOG_LSN__ % 18446744073709551616 AS row_id,
        __DORIS_BINLOG_OP__ AS op,
        k1,
        v1,
        v2,
        __BEFORE__v1__,
        __BEFORE__v2__
    FROM binlog("table" = "test_row_binlog_simple")
    ORDER BY __DORIS_BINLOG_LSN__;
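
The constant 18446744073709551616 in the query is 2^64, so the DIV and % expressions split the LSN into a high part (aliased as version) and a low part (aliased as row_id). A short Python check of that arithmetic, with `split_lsn` as a hypothetical helper name and a made-up sample value:

```python
TWO_POW_64 = 18446744073709551616  # same constant as in the SQL query


def split_lsn(lsn):
    """Split an LSN into (version, row_id), mirroring DIV and % in the query."""
    return divmod(lsn, TWO_POW_64)


# Made-up sample: version 2, row id 7.
sample_lsn = 2 * TWO_POW_64 + 7
version, row_id = split_lsn(sample_lsn)
```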

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Userwhite
Contributor Author

/review

@Userwhite
Contributor Author

run buildall

@Userwhite
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.00% (1844/2364)
Line Coverage 64.67% (33010/51047)
Region Coverage 65.25% (16383/25109)
Branch Coverage 55.76% (8737/15668)

@hello-stephen
Contributor

TPC-DS: Total hot run time: 66541 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 511688743c45e5a1a8a3a78aefe4a0b7d9abba90, data reload: false

query5	
query6	424	376	373	373
query7	4438	
query8	
query9	
query10	
query11	
query12	348	278	270	270
query13	
query14	
query14_1	
query15	703	680	691	680
query16	1162	821	801	801
query17	
query18	2865	1370	1361	1361
query19	
query20	295	286	296	286
query21	1547	1148	1149	1148
query22	23203	25933	25093	25093
query23	
query23_1	
query24	
query24_1	
query25	
query26	
query27	
query28	
query29	
query30	766	737	722	722
query31	
query32	
query33	
query34	
query35	
query36	
query37	
query38	
query39	1878	1589	1604	1589
query39_1	1553	1473	1564	1473
query40	
query41	150	95	143	95
query42	
query43	
query44	
query45	
query46	
query47	
query48	
query49	
query50	
query51	
query52	
query53	
query54	
query55	
query56	
query57	
query58	
query59	
query60	
query61	
query62	
query63	
query64	
query65	
query66	
query67	
query68	
query69	
query70	
query71	
query72	
query73	
query74	
query75	
query76	
query77	
query78	
query79	
query80	
query81	
query82	
query83	
query84	
query85	
query86	
query87	
query88	
query89	
query90	
query91	
query92	
query93	
query94	
query95	
query96	
query97	
query98	
query99	
Total cold run time: 78263 ms
Total hot run time: 66541 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 71162 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e3a62978f65d893cf500232e94729678c69eaf38, data reload: false

query5	4768	2742	2705	2705
query6	420	365	361	361
query7	4337	1307	1302	1302
query8	663	628	622	622
query9	25754	
query10	
query11	
query12	399	359	342	342
query13	
query14	
query14_1	
query15	
query16	
query17	
query18	
query19	
query20	
query21	1850	1407	1320	1320
query22	26205	24338	22465	22465
query23	
query23_1	
query24	
query24_1	
query25	
query26	
query27	
query28	
query29	
query30	750	744	750	744
query31	
query32	
query33	
query34	
query35	
query36	
query37	
query38	
query39	1954	1671	1648	1648
query39_1	1597	1567	1539	1539
query40	523	448	454	448
query41	147	99	99	99
query42	
query43	
query44	
query45	
query46	
query47	
query48	
query49	
query50	
query51	
query52	
query53	
query54	
query55	
query56	
query57	3362	3287	3283	3283
query58	
query59	
query60	
query61	
query62	
query63	
query64	
query65	
query66	
query67	
query68	
query69	
query70	
query71	
query72	
query73	
query74	
query75	
query76	
query77	
query78	
query79	
query80	
query81	1134	1104	1102	1102
query82	
query83	
query84	333	195	196	195
query85	
query86	
query87	
query88	
query89	
query90	
query91	285	257	252	252
query92	
query93	
query94	
query95	
query96	
query97	
query98	
query99	
Total cold run time: 92121 ms
Total hot run time: 71162 ms

@Userwhite
Contributor Author

run buildall

@Userwhite
Contributor Author

run buildall

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.00% (1844/2364)
Line Coverage 64.61% (32982/51047)
Region Coverage 65.20% (16370/25109)
Branch Coverage 55.74% (8734/15668)

@hello-stephen
Contributor

Cloud UT Coverage Report

Increment line coverage 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 78.00% (1844/2364)
Line Coverage 64.69% (33020/51047)
Region Coverage 65.27% (16388/25109)
Branch Coverage 55.80% (8743/15668)

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 12.08% (18/149) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

TPC-DS: Total hot run time: 70442 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e3f6fdce796a83104fde12695a02840f2187f1b0, data reload: false

query5	4451	2694	2674	2674
query6	424	385	374	374
query7	4354	1264	1262	1262
query8	672	638	622	622
query9	25362	
query10	
query11	
query12	362	294	302	294
query13	
query14	
query14_1	
query15	
query16	
query17	
query18	
query19	
query20	
query21	1868	1386	1435	1386
query22	26666	24554	22518	22518
query23	
query23_1	
query24	
query24_1	
query25	
query26	
query27	
query28	
query29	
query30	741	725	721	721
query31	
query32	
query33	
query34	
query35	
query36	
query37	
query38	
query39	1946	1612	1587	1587
query39_1	1593	1578	1575	1575
query40	496	430	430	430
query41	143	97	98	97
query42	
query43	
query44	
query45	
query46	
query47	
query48	
query49	
query50	
query51	
query52	
query53	
query54	
query55	
query56	
query57	3306	3227	3254	3227
query58	
query59	
query60	
query61	
query62	
query63	
query64	
query65	
query66	
query67	
query68	
query69	
query70	
query71	
query72	
query73	
query74	
query75	
query76	
query77	
query78	
query79	
query80	
query81	1113	1089	1094	1089
query82	
query83	
query84	323	196	198	196
query85	
query86	
query87	
query88	
query89	
query90	
query91	280	259	251	251
query92	
query93	
query94	
query95	
query96	
query97	
query98	
query99	
Total cold run time: 91374 ms
Total hot run time: 70442 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 57471 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5f5a2d0b46006d08bae44ce6dceab1c868a80032, data reload: false

query5	5801	1530	1387	1387
query6	371	263	251	251
query7	4285	662	506	506
query8	605	557	564	557
query9	13888	13199	
query10	
query11	
query12	329	238	247	238
query13	
query14	
query14_1	
query15	
query16	
query17	
query18	
query19	
query20	
query21	940	694	639	639
query22	24279	25696	23050	23050
query23	
query23_1	
query24	
query24_1	
query25	
query26	
query27	
query28	
query29	
query30	676	668	637	637
query31	
query32	
query33	
query34	
query35	
query36	
query37	
query38	
query39	1433	1377	1371	1371
query39_1	1329	1320	1308	1308
query40	
query41	151	98	91	91
query42	
query43	
query44	
query45	
query46	
query47	
query48	
query49	
query50	
query51	
query52	
query53	
query54	
query55	
query56	
query57	
query58	
query59	
query60	
query61	
query62	
query63	
query64	
query65	
query66	
query67	
query68	
query69	
query70	
query71	
query72	
query73	
query74	
query75	
query76	
query77	
query78	
query79	
query80	
query81	
query82	
query83	
query84	
query85	
query86	
query87	
query88	
query89	
query90	
query91	
query92	
query93	
query94	
query95	
query96	
query97	
query98	
query99	
Total cold run time: 80618 ms
Total hot run time: 57471 ms

Development

Successfully merging this pull request may close these issues.

[Feature] add row type for doris binlog

2 participants