Skip to content

Tidy up Bulk Copy unmatched column name work #3205

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 25, 2025

Conversation

paulmedynski
Copy link
Contributor

@paulmedynski paulmedynski commented Mar 6, 2025

  • Simplified the tracking of unmatched columns.
  • Cleaned up some fragile flags and loops.

The diffs look worse than they are. Disabling whitespace helps.

@paulmedynski paulmedynski requested review from a team and Copilot March 6, 2025 19:05
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Overview

This PR tidies up the unmatched column name handling in Bulk Copy by simplifying the tracking of unmatched columns and cleaning up fragile flags and loops. Key changes include renaming and refactoring the transaction check variable, replacing a dictionary with a HashSet to track unmatched columns, and updating error method names for consistency.

Reviewed Changes

File Description
src/Microsoft.Data.SqlClient/netfx/src/Microsoft/Data/SqlClient/SqlBulkCopy.cs Refactored transaction flag naming and column matching logic with a new HashSet-based approach
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlBulkCopy.cs Similar refactoring of transaction and column matching logic for .NET Core
src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlUtil.cs Updated method overload names for non‐matching column errors for consistency

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (2)

src/Microsoft.Data.SqlClient/netfx/src/Microsoft/Data/SqlClient/SqlBulkCopy.cs:586

  • [nitpick] Consider initializing 'unmatchedColumns' with an initial capacity (e.g. _localColumnMappings.Count) for consistency with the .NET Core version and minor performance improvement.
HashSet<string> unmatchedColumns = new();

src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlUtil.cs:1260

  • [nitpick] Ensure that the rename from 'BulkLoadNonMatchingColumnName' to 'BulkLoadNonMatchingColumnNames' aligns with the rest of the codebase to maintain consistency.
internal static Exception BulkLoadNonMatchingColumnNames(IEnumerable<string> columnNames)

Copy link

codecov bot commented Mar 6, 2025

Codecov Report

Attention: Patch coverage is 89.28571% with 15 lines in your changes missing coverage. Please review.

Project coverage is 72.60%. Comparing base (4e04631) to head (d545c24).
Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
...etcore/src/Microsoft/Data/SqlClient/SqlBulkCopy.cs 88.57% 8 Missing ⚠️
.../netfx/src/Microsoft/Data/SqlClient/SqlBulkCopy.cs 89.85% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3205      +/-   ##
==========================================
- Coverage   72.73%   72.60%   -0.14%     
==========================================
  Files         288      282       -6     
  Lines       59527    59212     -315     
==========================================
- Hits        43299    42988     -311     
+ Misses      16228    16224       -4     
Flag Coverage Δ
addons ?
netcore 75.15% <88.73%> (+0.02%) ⬆️
netfx 71.35% <90.00%> (-0.07%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Contributor

@benrr101 benrr101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate adding the comments to explain what's happening 👍 Overall looks good, couple stylistic changes, but nothing huge.

Copy link
Contributor Author

@paulmedynski paulmedynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One bulk copy test was failing on .NET Framework, which resulted in some unrelated changes to test helpers playing fast and loose with reflection.

@paulmedynski paulmedynski added this to the 7.0-preview1 milestone Mar 7, 2025
@paulmedynski paulmedynski requested a review from a team March 14, 2025 14:34
- Simplified the tracking of unmatched columns.
- Cleaned up some fragile flags and loops.
- Fixed SNI reflection failures when running in NetFX.
@paulmedynski paulmedynski force-pushed the dev/paul/bulk-copy-tidy branch from 253cbb2 to f735aee Compare March 18, 2025 13:34
- Renamed flag in NetCore file to match NetFX implementation.
@cheenamalhotra
Copy link
Member

Is there any performance difference with the changes? We need to ensure we don't regress anything if not improving.. can you run some benchmarks and share the results?

@paulmedynski
Copy link
Contributor Author

Benchmark results comparing main to these changes, running on my dev laptop, don't appear to indicate any performance changes.

Main

// * Summary *

BenchmarkDotNet v0.14.0, Debian GNU/Linux 12 (bookworm)
AMD Ryzen 7 PRO 8840HS w/ Radeon 780M Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 9.0.201
  [Host] : .NET 9.0.3 (9.0.325.11113), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

Job=MediumRun  Toolchain=InProcessEmitToolchain  InvocationCount=1
IterationCount=10  LaunchCount=1  RunStrategy=Throughput
UnrollFactor=1  WarmupCount=1

| Method                      | Columns | Mean     | Error    | StdDev   | Gen0       | Completed Work Items | Lock Contentions | Gen1      | Allocated |
|---------------------------- |-------- |---------:|---------:|---------:|-----------:|---------------------:|-----------------:|----------:|----------:|
| BulkCopy_IDataReader        | 7       | 183.4 ms | 22.06 ms | 14.59 ms |          - |             105.0000 |                - |         - |   1.56 MB |
| BulkCopyAsync_IDataReader   | 7       | 206.2 ms | 16.96 ms | 10.09 ms |          - |           20094.0000 |                - |         - |   6.99 MB |
| BulkCopy_SqlDataReader      | 7       | 209.9 ms | 19.17 ms | 11.41 ms | 23000.0000 |             192.0000 |                - | 1000.0000 | 184.65 MB |
| BulkCopyAsync_SqlDataReader | 7       | 231.5 ms | 19.76 ms | 13.07 ms | 24000.0000 |           20887.0000 |                - | 1000.0000 | 194.08 MB |
| BulkCopy_IDataReader        | 25      | 286.9 ms | 16.75 ms |  9.97 ms |          - |             269.0000 |                - |         - |   6.46 MB |
| BulkCopyAsync_IDataReader   | 25      | 321.3 ms | 21.40 ms | 14.15 ms |  1000.0000 |           20201.0000 |                - | 1000.0000 |   11.9 MB |
| BulkCopy_SqlDataReader      | 25      | 302.1 ms | 21.52 ms | 12.81 ms | 24000.0000 |             312.0000 |                - | 1000.0000 | 194.47 MB |
| BulkCopyAsync_SqlDataReader | 25      | 351.0 ms | 32.73 ms | 19.47 ms | 26000.0000 |           21644.0000 |                - | 1000.0000 | 205.65 MB |
| BulkCopy_IDataReader        | 50      | 686.1 ms | 88.47 ms | 58.52 ms |  1000.0000 |             606.0000 |                - | 1000.0000 |  12.74 MB |
| BulkCopyAsync_IDataReader   | 50      | 737.6 ms | 88.63 ms | 52.74 ms |  2000.0000 |           20521.0000 |                - | 1000.0000 |   18.2 MB |
| BulkCopy_SqlDataReader      | 50      | 661.6 ms | 64.07 ms | 42.38 ms | 48000.0000 |             491.0000 |                - | 1000.0000 | 388.44 MB |
| BulkCopyAsync_SqlDataReader | 50      | 735.8 ms | 61.35 ms | 36.51 ms | 51000.0000 |           23283.0000 |                - | 1000.0000 | 404.14 MB |

// * Hints *
Outliers
  SqlBulkCopyRunner.BulkCopyAsync_IDataReader: MediumRun   -> 1 outlier  was  removed (276.25 ms)
  SqlBulkCopyRunner.BulkCopy_SqlDataReader: MediumRun      -> 1 outlier  was  removed (264.93 ms)
  SqlBulkCopyRunner.BulkCopy_IDataReader: MediumRun        -> 1 outlier  was  removed (352.42 ms)
  SqlBulkCopyRunner.BulkCopy_SqlDataReader: MediumRun      -> 1 outlier  was  removed (390.07 ms)
  SqlBulkCopyRunner.BulkCopyAsync_SqlDataReader: MediumRun -> 1 outlier  was  removed (434.54 ms)
  SqlBulkCopyRunner.BulkCopyAsync_IDataReader: MediumRun   -> 1 outlier  was  removed, 3 outliers were detected (646.87 ms, 656.18 ms, 962.87 ms)
  SqlBulkCopyRunner.BulkCopyAsync_SqlDataReader: MediumRun -> 1 outlier  was  removed (1.05 s)
// * Config Issues *

// * Warnings *
Configuration
  Summary -> The exporter MarkdownExporter-github is already present in configuration. There may be unexpected results.

// * Legends *
  Columns              : Value of the 'Columns' parameter
  Mean                 : Arithmetic mean of all measurements
  Error                : Half of 99.9% confidence interval
  StdDev               : Standard deviation of all measurements
  Gen0                 : GC Generation 0 collects per 1000 operations
  Completed Work Items : The number of work items that have been processed in ThreadPool (per single operation)
  Lock Contentions     : The number of times there was contention upon trying to take a Monitor's lock (per single operation)
  Gen1                 : GC Generation 1 collects per 1000 operations
  Allocated            : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  1 ms                 : 1 Millisecond (0.001 sec)

// * Diagnostic Output - MemoryDiagnoser *

// * Diagnostic Output - ThreadingDiagnoser *


// ***** BenchmarkRunner: End *****
Run time: 00:01:17 (77.76 sec), executed benchmarks: 12

Global total time: 00:01:18 (78.04 sec), executed benchmarks: 12

These Changes

// * Summary *

BenchmarkDotNet v0.14.0, Debian GNU/Linux 12 (bookworm)
AMD Ryzen 7 PRO 8840HS w/ Radeon 780M Graphics, 1 CPU, 16 logical and 8 physical cores
.NET SDK 9.0.201
  [Host] : .NET 9.0.3 (9.0.325.11113), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI

Job=MediumRun  Toolchain=InProcessEmitToolchain  InvocationCount=1
IterationCount=10  LaunchCount=1  RunStrategy=Throughput
UnrollFactor=1  WarmupCount=1

| Method                      | Columns | Mean     | Error    | StdDev   | Gen0       | Completed Work Items | Lock Contentions | Gen1      | Allocated |
|---------------------------- |-------- |---------:|---------:|---------:|-----------:|---------------------:|-----------------:|----------:|----------:|
| BulkCopy_IDataReader        | 7       | 177.3 ms | 10.19 ms |  5.33 ms |          - |             201.0000 |                - |         - |   1.56 MB |
| BulkCopyAsync_IDataReader   | 7       | 216.2 ms | 24.60 ms | 16.27 ms |          - |           20242.0000 |                - |         - |   6.99 MB |
| BulkCopy_SqlDataReader      | 7       | 225.7 ms | 33.26 ms | 22.00 ms | 23000.0000 |             186.0000 |                - | 1000.0000 | 184.65 MB |
| BulkCopyAsync_SqlDataReader | 7       | 233.9 ms | 16.16 ms | 10.69 ms | 24000.0000 |           20886.0000 |                - | 1000.0000 | 194.06 MB |
| BulkCopy_IDataReader        | 25      | 292.7 ms | 19.26 ms | 12.74 ms |          - |             425.0000 |                - |         - |   6.46 MB |
| BulkCopyAsync_IDataReader   | 25      | 303.5 ms | 14.01 ms |  7.33 ms |  1000.0000 |           20441.0000 |                - | 1000.0000 |   11.9 MB |
| BulkCopy_SqlDataReader      | 25      | 312.0 ms | 32.09 ms | 19.10 ms | 24000.0000 |             326.0000 |                - | 1000.0000 | 194.47 MB |
| BulkCopyAsync_SqlDataReader | 25      | 356.9 ms | 29.07 ms | 19.23 ms | 26000.0000 |           21642.0000 |                - | 1000.0000 | 205.65 MB |
| BulkCopy_IDataReader        | 50      | 637.7 ms | 57.49 ms | 38.03 ms |  1000.0000 |             628.0000 |                - | 1000.0000 |  12.74 MB |
| BulkCopyAsync_IDataReader   | 50      | 684.0 ms | 51.83 ms | 34.28 ms |  2000.0000 |           20601.0000 |                - | 1000.0000 |   18.2 MB |
| BulkCopy_SqlDataReader      | 50      | 666.7 ms | 86.51 ms | 51.48 ms | 48000.0000 |             622.0000 |                - | 1000.0000 | 388.44 MB |
| BulkCopyAsync_SqlDataReader | 50      | 792.1 ms | 71.73 ms | 47.45 ms | 50000.0000 |           23163.0000 |                - | 1000.0000 | 404.12 MB |

// * Hints *
Outliers
  SqlBulkCopyRunner.BulkCopy_IDataReader: MediumRun      -> 2 outliers were removed (206.65 ms, 212.79 ms)
  SqlBulkCopyRunner.BulkCopyAsync_IDataReader: MediumRun -> 2 outliers were removed (350.23 ms, 350.38 ms)
  SqlBulkCopyRunner.BulkCopy_SqlDataReader: MediumRun    -> 1 outlier  was  removed (382.03 ms)
  SqlBulkCopyRunner.BulkCopy_SqlDataReader: MediumRun    -> 1 outlier  was  removed (853.82 ms)
// * Config Issues *

// * Warnings *
Configuration
  Summary -> The exporter MarkdownExporter-github is already present in configuration. There may be unexpected results.

// * Legends *
  Columns              : Value of the 'Columns' parameter
  Mean                 : Arithmetic mean of all measurements
  Error                : Half of 99.9% confidence interval
  StdDev               : Standard deviation of all measurements
  Gen0                 : GC Generation 0 collects per 1000 operations
  Completed Work Items : The number of work items that have been processed in ThreadPool (per single operation)
  Lock Contentions     : The number of times there was contention upon trying to take a Monitor's lock (per single operation)
  Gen1                 : GC Generation 1 collects per 1000 operations
  Allocated            : Allocated memory per single operation (managed only, inclusive, 1KB = 1024B)
  1 ms                 : 1 Millisecond (0.001 sec)

// * Diagnostic Output - MemoryDiagnoser *

// * Diagnostic Output - ThreadingDiagnoser *


// ***** BenchmarkRunner: End *****
Run time: 00:01:17 (77.86 sec), executed benchmarks: 12

Global total time: 00:01:18 (78.19 sec), executed benchmarks: 12

@paulmedynski paulmedynski merged commit ebcc63a into main Mar 25, 2025
252 checks passed
@paulmedynski paulmedynski deleted the dev/paul/bulk-copy-tidy branch March 25, 2025 19:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants