Contributor

Copilot AI commented Dec 26, 2025

main PR

Description

Adds TwoStringSearchValuesPackedThreeChars<TCaseSensitivity> - a packed vectorized specialization for SearchValues<string> when exactly two strings are provided.

Approach: Uses packed byte comparisons similar to SingleStringSearchValuesPackedThreeChars to process twice as many characters per iteration:

  • Picks 2 anchor characters per string: first character + a shared second character at an offset chosen to minimize combined frequency across both values
  • Uses a single shared second character offset for both values, reducing the inner loop from 3 loads to just 2 loads per iteration
  • Uses LoadPacked128/256/512 to pack two Vector<ushort> inputs into one Vector<byte>
  • Processes 16/32/64 character positions per iteration (2x the non-packed approach)
  • ORs results to find positions where either string might match
  • Verifies with StartsWith for both values
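The steps above can be modeled with a scalar loop. This is only a minimal sketch of the filter-then-verify idea, not the PR's vectorized code: the class name `TwoStringScalarSketch` and its `IndexOfAny` signature are hypothetical, and it checks one position per iteration rather than 16/32/64.

```csharp
using System;

public static class TwoStringScalarSketch
{
    // Scalar model of the candidate filter: a position is a candidate when the first
    // character and the character at the shared offset both match for either value.
    public static int IndexOfAny(ReadOnlySpan<char> text, string v0, string v1, int sharedOffset)
    {
        int minLength = Math.Min(v0.Length, v1.Length);
        if (sharedOffset <= 0 || sharedOffset >= minLength)
            throw new ArgumentOutOfRangeException(nameof(sharedOffset));

        for (int i = 0; i + minLength <= text.Length; i++)
        {
            char first = text[i];
            char second = text[i + sharedOffset];

            bool candidate0 = first == v0[0] && second == v0[sharedOffset];
            bool candidate1 = first == v1[0] && second == v1[sharedOffset];
            if (!(candidate0 | candidate1))
                continue; // The vectorized loop rejects a whole batch of positions here.

            // Anchors matched, so confirm the full value (StartsWith is false if too short).
            ReadOnlySpan<char> rest = text.Slice(i);
            if ((candidate0 && rest.StartsWith(v0, StringComparison.Ordinal)) ||
                (candidate1 && rest.StartsWith(v1, StringComparison.Ordinal)))
            {
                return i;
            }
        }

        return -1;
    }
}
```

The anchor check is cheap and rejects most positions; the more expensive StartsWith call only runs where both anchors of at least one value line up.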

Changes:

  • TwoStringSearchValuesPackedThreeChars.cs - New packed vectorized implementation (Vector128/256/512)
  • CharacterFrequencyHelper.cs - Added GetSharedSecondCharacterOffset helper that finds the offset with lowest combined ASCII character frequency across both values
  • StringSearchValues.cs - Added TryCreateForTwoValues factory method
  • RegexFindOptimizations.cs - Enabled case-sensitive prefix matching for two-string alternations to use the new specialization
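As an illustration of what a shared-offset helper might do, here is a minimal sketch. The name `PickSharedOffset` and the frequency numbers are made up for the example; the real helper consults the ASCII frequency table in CharacterFrequencyHelper.cs, which is not reproduced here.

```csharp
using System;

public static class SharedOffsetSketch
{
    // Hypothetical frequency estimates, invented for this example only.
    private static float Frequency(char c) => c switch
    {
        ' ' => 18f,
        'e' => 10f,
        't' => 7.5f,
        'a' => 6.5f,
        'o' => 6f,
        'r' => 6f,
        'l' => 4f,
        'd' => 4f,
        _ => 1f,
    };

    // Picks the offset (>= 1, valid in both values) whose characters have the
    // lowest combined frequency, so the second anchor rejects as many positions as possible.
    public static int PickSharedOffset(string v0, string v1)
    {
        int minLength = Math.Min(v0.Length, v1.Length);
        if (minLength < 2)
            throw new ArgumentException("Both values need at least two characters.");

        int bestOffset = 1;
        float bestScore = float.MaxValue;

        for (int offset = 1; offset < minLength; offset++)
        {
            float score = Frequency(v0[offset]) + Frequency(v1[offset]);
            if (score < bestScore)
            {
                bestScore = score;
                bestOffset = offset;
            }
        }

        return bestOffset;
    }
}
```

With these invented numbers, "hello" and "world" score 16, 10, 8 and 10 at offsets 1 through 4, so offset 3 would be chosen.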

Constraints:

  • Requires SSE2 or ARM64 SIMD support
  • All anchor characters must be packable (fit in byte with platform-specific constraints)
  • Falls back to Teddy/Aho-Corasick for unsupported cases
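A rough sketch of how such an eligibility check could look is below. The class and method names are hypothetical, and the packable range of 1 to 254 is an assumption based on x86 saturating packs (where 0 and values of 255 or more cannot be told apart after narrowing); the actual platform-specific rules in the PR may differ.

```csharp
using System;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

public static class PackedEligibilitySketch
{
    // Mirrors the stated constraint: SSE2 or ARM64 SIMD must be available.
    public static bool IsSimdSupported => Sse2.IsSupported || AdvSimd.Arm64.IsSupported;

    // Assumed rule: a char is packable when it is in the range [1, 254].
    public static bool IsPackable(char c) => (uint)(c - 1) < 254u;

    // All four anchors (first char and shared-offset char of each value) must be packable.
    public static bool AnchorsArePackable(string v0, string v1, int sharedOffset) =>
        IsPackable(v0[0]) && IsPackable(v0[sharedOffset]) &&
        IsPackable(v1[0]) && IsPackable(v1[sharedOffset]);

    // When this returns false, the caller would fall back to Teddy/Aho-Corasick.
    public static bool CanUsePackedPath(string v0, string v1, int sharedOffset) =>
        IsSimdSupported && AnchorsArePackable(v0, v1, sharedOffset);
}
```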

Customer Impact

N/A - This is a performance optimization, not a servicing fix.

Regression

No - new feature.

Testing

All 52,148 existing StringSearchValues tests pass. All 31,470 regex tests pass (30,469 functional + 1,001 unit). Added comprehensive test cases for two-string search patterns including:

  • Short strings (2 chars), medium strings, and longer strings (15+ chars for vector testing)
  • Same first character vs different first characters
  • Various length differences (short/long combinations)
  • Special characters and numbers
  • Mixed case strings (tested both case-sensitive and case-insensitive)
  • Updated Teddy test cases to use 3+ strings to ensure Teddy implementation is still tested

Risk

Low - falls back to existing Teddy/Aho-Corasick for unsupported cases. Implementation follows established patterns from SingleStringSearchValuesPackedThreeChars.

Package authoring no longer needed in .NET 9


IMPORTANT: Starting with .NET 9, you no longer need to edit a NuGet package's csproj to enable building and bump the version.
Keep in mind that we still need package authoring in .NET 8 and older versions.

Original prompt

This section details the original issue you should resolve

<issue_title>Consider adding a SearchValues<string> specialization for two or three input strings</issue_title>
<issue_description>Rather than doing the full-blown teddy implementation, it could potentially have a significantly tighter inner loop by doing what we do today for IndexOf("...", OrdinalIgnoreCase), e.g. creating a vector for two characters in each string and comparing those against the input at appropriate offsets. This will require prototyping and measurements to see if it's actually a win; it's an idea @MihaZupan and I discussed as a way to possibly mitigate overheads when searching for strings that have starting characters that don't match very frequently, in which case IndexOfAny(startingCharacters) does fairly well.</issue_description>

<agent_instructions>Add a SearchValues specialization when there's only two unique strings supplied, following the same approach as is used for IndexOf with SearchValues when searching a string, where it picks two characters from a string creating a vector for each and the inner loop compares both vectors, one appropriately offset. In this case, it'll have two vectors for each of the two strings. Once you have the valid implementation, ensure appropriate tests are in place to validate it, augmenting existing theories with any additional test cases necessary to fully validate the implementation. Also write and run but don't commit a one-off perf test to validate that this is actually an improvement. </agent_instructions>

Comments on the Issue (you are @copilot in this section)

@ Tagging subscribers to this area: @dotnet/area-system-runtime See info in [area-owners.md](https://github.com/dotnet/runtime/blob/main/docs/area-owners.md) if you want to be subscribed.
Issue Details

Rather than doing the full-blown teddy implementation, it could potentially have a significantly tighter inner loop by doing what we do today for IndexOf("...", OrdinalIgnoreCase), e.g. creating a vector for two characters in each string and comparing those against the input at appropriate offsets. This will require prototyping and measurements to see if it's actually a win; it's an idea @MihaZupan and I discussed as a way to possibly mitigate overheads when searching for strings that have starting characters that don't match very frequently, in which case IndexOfAny(startingCharacters) does fairly well.

Author: stephentoub
Assignees: -
Labels:

area-System.Runtime, tenet-performance

Milestone: 9.0.0


Copilot AI changed the title from "[WIP] Add SearchValues specialization for two input strings" to "Add SearchValues<string> specialization for two equal-length strings" Dec 26, 2025
Copilot AI requested a review from stephentoub December 26, 2025 22:39
@stephentoub stephentoub marked this pull request as ready for review December 27, 2025 02:07
Copilot AI review requested due to automatic review settings December 27, 2025 02:07
Contributor

Copilot AI left a comment


Pull request overview

This PR adds a specialized SearchValues<string> implementation for exactly two strings, following the same vectorized approach used for single-string searches. The implementation picks 2 anchor characters per string and compares them simultaneously in the inner loop.

Key changes:

  • New TwoStringSearchValuesThreeChars<TCaseSensitivity> class that vectorizes search for two strings
  • Updated StringSearchValues.cs to use the specialization when exactly 2 values are provided
  • Added GetSecondCharacterOffset helper methods to CharacterFrequencyHelper.cs with constrained offset support for different-length strings

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| TwoStringSearchValuesThreeChars.cs | New specialized implementation using vectorized two-anchor-per-string comparison with Vector128/256/512 support |
| StringSearchValues.cs | Added TryCreateForTwoValues method with logic to select the specialization based on string properties and case sensitivity |
| CharacterFrequencyHelper.cs | Added GetSecondCharacterOffset overloads to select optimal anchor character positions, with support for constraining offsets to shorter string length |
| System.Private.CoreLib.Shared.projitems | Registered the new TwoStringSearchValuesThreeChars.cs file in the build |
| StringSearchValues.cs (tests) | Added test cases covering two-string scenarios with same/different first characters |

@stephentoub
Member

@copilot, System.Memory.Tests are failing

Contributor Author

Copilot AI commented Dec 28, 2025

@copilot, System.Memory.Tests are failing

I've run the System.Memory tests locally and they all pass (52,123 passed, 1 skipped). Could you share which specific test is failing in CI? This would help me identify if it's a platform-specific issue (ARM64, WASM, etc.) or something else.


Contributor Author

Copilot AI commented Dec 29, 2025

@copilot, let's try (1) please

Implemented in bc75124. The inner loop now:

  1. Loads input at position 0 and compares first characters of both strings
  2. Only if either first character matches, loads inputs at second character offsets and completes the comparison

This avoids 2 additional LoadPacked calls per iteration when neither first character is present in the batch (the common case when searching for infrequent patterns).
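To make the early-exit shape concrete, here is a minimal, non-packed sketch over one block of 8 characters using the cross-platform Vector128 API. The class and method names are hypothetical, and the commit itself uses LoadPacked with wider batches; this only shows where the branch sits relative to the second-anchor loads.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

public static class EarlyExitSketch
{
    // Returns the first candidate index within the 8-char block starting at 'start', or -1.
    // The caller must ensure start + max(offset0, offset1) + 8 <= text.Length.
    public static int FirstCandidateInBlock(
        ReadOnlySpan<char> text, int start,
        char first0, char second0, int offset0,
        char first1, char second1, int offset1)
    {
        ReadOnlySpan<ushort> data = MemoryMarshal.Cast<char, ushort>(text);

        // Step 1: one load at position 0, compared against both first characters.
        Vector128<ushort> input0 = Vector128.Create(data.Slice(start, 8));
        Vector128<ushort> m0 = Vector128.Equals(input0, Vector128.Create((ushort)first0));
        Vector128<ushort> m1 = Vector128.Equals(input0, Vector128.Create((ushort)first1));

        // Early exit: neither first character occurs in this block, so skip the other loads.
        if ((m0 | m1) == Vector128<ushort>.Zero)
            return -1;

        // Step 2: only now load the inputs at each value's second-anchor offset.
        Vector128<ushort> in0 = Vector128.Create(data.Slice(start + offset0, 8));
        Vector128<ushort> in1 = Vector128.Create(data.Slice(start + offset1, 8));

        Vector128<ushort> result =
            (m0 & Vector128.Equals(in0, Vector128.Create((ushort)second0))) |
            (m1 & Vector128.Equals(in1, Vector128.Create((ushort)second1)));

        uint mask = result.ExtractMostSignificantBits();
        return mask == 0 ? -1 : start + BitOperations.TrailingZeroCount(mask);
    }
}
```

A returned index is only a candidate; the caller would still verify it with StartsWith. The trade-off is an extra, data-dependent branch in the hot loop, which is why haystacks with frequent first characters are the unfavorable case.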

@stephentoub
Member

@EgorBot -amd -intel -arm

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Buffers;

BenchmarkSwitcher.FromAssembly(typeof(TwoStringSearchValuesBenchmark).Assembly).Run(args);

[MemoryDiagnoser]
public class TwoStringSearchValuesBenchmark
{
    private string _haystack = default!;
    private SearchValues<string> _twoStringsOrdinal = default!;
    private SearchValues<string> _twoStringsIgnoreCase = default!;

    [Params(100, 1000, 10000)]
    public int HaystackLength { get; set; }

    [Params("Early", "Middle", "Late", "None")]
    public string MatchPosition { get; set; } = default!;

    [GlobalSetup]
    public void Setup()
    {
        // Create haystack with the match at various positions
        char[] chars = new char[HaystackLength];
        Random rand = new(42);
        for (int i = 0; i < chars.Length; i++)
        {
            // Use characters that won't match our search strings
            chars[i] = (char)('m' + rand.Next(10)); // m-v range
        }

        string needle1 = "hello";
        string needle2 = "world";
        int insertPos = MatchPosition switch
        {
            "Early" => 10,
            "Middle" => HaystackLength / 2,
            "Late" => HaystackLength - 10,
            _ => -1
        };

        if (insertPos >= 0 && insertPos + needle1.Length <= HaystackLength)
        {
            needle1.AsSpan().CopyTo(chars.AsSpan(insertPos));
        }

        _haystack = new string(chars);

        _twoStringsOrdinal = SearchValues.Create([needle1, needle2], StringComparison.Ordinal);
        _twoStringsIgnoreCase = SearchValues.Create([needle1, needle2], StringComparison.OrdinalIgnoreCase);
    }

    [Benchmark]
    public int TwoStrings_Ordinal() => _haystack.AsSpan().IndexOfAny(_twoStringsOrdinal);

    [Benchmark]
    public int TwoStrings_IgnoreCase() => _haystack.AsSpan().IndexOfAny(_twoStringsIgnoreCase);
}

@stephentoub
Member

stephentoub commented Dec 30, 2025

@copilot, the benchmark in the comments is effectively testing best case for this optimization. Can you tweak it to also test worst case?

Contributor Author

Copilot AI commented Dec 30, 2025

@copilot, the benchmark in the comments is effectively testing best case for this optimization. Can you tweak it to also test worst case?

Here's an updated benchmark that tests both best case (infrequent first chars) and worst case (frequent first chars) for the early exit optimization:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Buffers;

BenchmarkSwitcher.FromAssembly(typeof(TwoStringSearchValuesBenchmark).Assembly).Run(args);

[MemoryDiagnoser]
public class TwoStringSearchValuesBenchmark
{
    private string _haystackBestCase = default!;
    private string _haystackWorstCase = default!;
    private SearchValues<string> _twoStringsOrdinal = default!;
    private SearchValues<string> _twoStringsIgnoreCase = default!;

    [Params(100, 1000, 10000)]
    public int HaystackLength { get; set; }

    [Params("Early", "Middle", "Late", "None")]
    public string MatchPosition { get; set; } = default!;

    [GlobalSetup]
    public void Setup()
    {
        string needle1 = "hello";
        string needle2 = "world";
        int insertPos = MatchPosition switch
        {
            "Early" => 10,
            "Middle" => HaystackLength / 2,
            "Late" => HaystackLength - 10,
            _ => -1
        };

        // Best case: haystack uses chars that don't match first chars of needles (h, w)
        char[] charsBest = new char[HaystackLength];
        Random rand = new(42);
        for (int i = 0; i < charsBest.Length; i++)
        {
            charsBest[i] = (char)('m' + rand.Next(10)); // m-v range, avoids 'h' and 'w'
        }
        if (insertPos >= 0 && insertPos + needle1.Length <= HaystackLength)
        {
            needle1.AsSpan().CopyTo(charsBest.AsSpan(insertPos));
        }
        _haystackBestCase = new string(charsBest);

        // Worst case: haystack is full of 'h' and 'w' (first chars of needles)
        // This forces the early exit branch to rarely trigger
        char[] charsWorst = new char[HaystackLength];
        for (int i = 0; i < charsWorst.Length; i++)
        {
            charsWorst[i] = (i % 2 == 0) ? 'h' : 'w';
        }
        if (insertPos >= 0 && insertPos + needle1.Length <= HaystackLength)
        {
            needle1.AsSpan().CopyTo(charsWorst.AsSpan(insertPos));
        }
        _haystackWorstCase = new string(charsWorst);

        _twoStringsOrdinal = SearchValues.Create([needle1, needle2], StringComparison.Ordinal);
        _twoStringsIgnoreCase = SearchValues.Create([needle1, needle2], StringComparison.OrdinalIgnoreCase);
    }

    [Benchmark]
    public int BestCase_Ordinal() => _haystackBestCase.AsSpan().IndexOfAny(_twoStringsOrdinal);

    [Benchmark]
    public int BestCase_IgnoreCase() => _haystackBestCase.AsSpan().IndexOfAny(_twoStringsIgnoreCase);

    [Benchmark]
    public int WorstCase_Ordinal() => _haystackWorstCase.AsSpan().IndexOfAny(_twoStringsOrdinal);

    [Benchmark]
    public int WorstCase_IgnoreCase() => _haystackWorstCase.AsSpan().IndexOfAny(_twoStringsIgnoreCase);
}

The worst case fills the haystack with alternating 'h' and 'w' characters (the first characters of "hello" and "world"), which forces the early exit optimization to fail on every iteration and perform the full comparison including the second anchor character loads.

@copilot, the benchmark in the comments is effectively testing best case for this optimization. Can you tweak it to also test worst case?

Here's an updated benchmark that tests both best case (no first-char matches: 'h' and 'w' never appear in the random filler) and worst case (frequent first-char matches: 'a' appears often):

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Buffers;

BenchmarkSwitcher.FromAssembly(typeof(TwoStringSearchValuesBenchmark).Assembly).Run(args);

[MemoryDiagnoser]
public class TwoStringSearchValuesBenchmark
{
    private string _haystackBestCase = default!;
    private string _haystackWorstCase = default!;
    private SearchValues<string> _twoStringsOrdinal = default!;
    private SearchValues<string> _twoStringsIgnoreCase = default!;
    private SearchValues<string> _worstCaseOrdinal = default!;

    [Params(100, 1000, 10000)]
    public int HaystackLength { get; set; }

    [Params("Early", "Middle", "Late", "None")]
    public string MatchPosition { get; set; } = default!;

    [GlobalSetup]
    public void Setup()
    {
        Random rand = new(42);
        
        // Best case: haystack has no 'h' or 'w' (first chars of "hello"/"world")
        char[] charsBest = new char[HaystackLength];
        for (int i = 0; i < charsBest.Length; i++)
            charsBest[i] = (char)('m' + rand.Next(10)); // m-v range, no h or w
        
        // Worst case: haystack has many 'a's (first char of both needles)
        char[] charsWorst = new char[HaystackLength];
        for (int i = 0; i < charsWorst.Length; i++)
            charsWorst[i] = rand.Next(4) == 0 ? 'a' : (char)('b' + rand.Next(10)); // 25% 'a's

        string needle1Best = "hello";
        string needle2Best = "world";
        string needle1Worst = "abc";
        string needle2Worst = "axy";

        int insertPos = MatchPosition switch
        {
            "Early" => 10,
            "Middle" => HaystackLength / 2,
            "Late" => HaystackLength - 10,
            _ => -1
        };

        if (insertPos >= 0 && insertPos + needle1Best.Length <= HaystackLength)
        {
            needle1Best.AsSpan().CopyTo(charsBest.AsSpan(insertPos));
            needle1Worst.AsSpan().CopyTo(charsWorst.AsSpan(insertPos));
        }

        _haystackBestCase = new string(charsBest);
        _haystackWorstCase = new string(charsWorst);

        _twoStringsOrdinal = SearchValues.Create([needle1Best, needle2Best], StringComparison.Ordinal);
        _twoStringsIgnoreCase = SearchValues.Create([needle1Best, needle2Best], StringComparison.OrdinalIgnoreCase);
        _worstCaseOrdinal = SearchValues.Create([needle1Worst, needle2Worst], StringComparison.Ordinal);
    }

    [Benchmark]
    public int BestCase_Ordinal() => _haystackBestCase.AsSpan().IndexOfAny(_twoStringsOrdinal);

    [Benchmark]
    public int BestCase_IgnoreCase() => _haystackBestCase.AsSpan().IndexOfAny(_twoStringsIgnoreCase);

    [Benchmark]
    public int WorstCase_Ordinal() => _haystackWorstCase.AsSpan().IndexOfAny(_worstCaseOrdinal);
}

Best case: Haystack uses 'm'-'v' range, so 'h' and 'w' (first chars of "hello"/"world") never appear. Early exit optimization skips second char loads on every iteration.

Worst case: Haystack has ~25% 'a' characters, and both needles start with 'a' ("abc", "axy"). Early exit optimization rarely helps since the branch is taken frequently.

@stephentoub
Member

@EgorBot -amd -intel -arm

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Buffers;

BenchmarkSwitcher.FromAssembly(typeof(TwoStringSearchValuesBenchmark).Assembly).Run(args);

[MemoryDiagnoser]
public class TwoStringSearchValuesBenchmark
{
    private string _haystackBestCase = default!;
    private string _haystackWorstCase = default!;
    private SearchValues<string> _twoStringsOrdinal = default!;
    private SearchValues<string> _twoStringsIgnoreCase = default!;
    private SearchValues<string> _worstCaseOrdinal = default!;

    [Params(100, 1000, 10000)]
    public int HaystackLength { get; set; }

    [Params("Early", "Middle", "Late", "None")]
    public string MatchPosition { get; set; } = default!;

    [GlobalSetup]
    public void Setup()
    {
        Random rand = new(42);
        
        // Best case: haystack has no 'h' or 'w' (first chars of "hello"/"world")
        char[] charsBest = new char[HaystackLength];
        for (int i = 0; i < charsBest.Length; i++)
            charsBest[i] = (char)('m' + rand.Next(10)); // m-v range, no h or w
        
        // Worst case: haystack has many 'a's (first char of both needles)
        char[] charsWorst = new char[HaystackLength];
        for (int i = 0; i < charsWorst.Length; i++)
            charsWorst[i] = rand.Next(4) == 0 ? 'a' : (char)('b' + rand.Next(10)); // 25% 'a's

        string needle1Best = "hello";
        string needle2Best = "world";
        string needle1Worst = "abc";
        string needle2Worst = "axy";

        int insertPos = MatchPosition switch
        {
            "Early" => 10,
            "Middle" => HaystackLength / 2,
            "Late" => HaystackLength - 10,
            _ => -1
        };

        if (insertPos >= 0 && insertPos + needle1Best.Length <= HaystackLength)
        {
            needle1Best.AsSpan().CopyTo(charsBest.AsSpan(insertPos));
            needle1Worst.AsSpan().CopyTo(charsWorst.AsSpan(insertPos));
        }

        _haystackBestCase = new string(charsBest);
        _haystackWorstCase = new string(charsWorst);

        _twoStringsOrdinal = SearchValues.Create([needle1Best, needle2Best], StringComparison.Ordinal);
        _twoStringsIgnoreCase = SearchValues.Create([needle1Best, needle2Best], StringComparison.OrdinalIgnoreCase);
        _worstCaseOrdinal = SearchValues.Create([needle1Worst, needle2Worst], StringComparison.Ordinal);
    }

    [Benchmark]
    public int BestCase_Ordinal() => _haystackBestCase.AsSpan().IndexOfAny(_twoStringsOrdinal);

    [Benchmark]
    public int BestCase_IgnoreCase() => _haystackBestCase.AsSpan().IndexOfAny(_twoStringsIgnoreCase);

    [Benchmark]
    public int WorstCase_Ordinal() => _haystackWorstCase.AsSpan().IndexOfAny(_worstCaseOrdinal);
}

@stephentoub
Member

@copilot, revert the last commit implementing (1) and instead implement (4), choosing the second offset to have the lowest combined frequency across both needles

Contributor Author

Copilot AI commented Dec 30, 2025

@copilot, revert the last commit implementing (1) and instead implement (4), choosing the second offset to have the lowest combined frequency across both needles

Implemented in fcbb1be. Reverted the early exit optimization and instead implemented shared offset optimization (4):

  • Added CharacterFrequencyHelper.GetSharedSecondCharacterOffset() which picks the offset with lowest combined frequency across both values
  • Changed from separate offsets per value (3 loads: input0, inputV0Ch2, inputV1Ch2) to a single shared offset (2 loads: input0, inputCh2Shared)
  • All 52,148 tests pass
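The two-load shape can be sketched as follows for a single Vector128 batch of 16 positions. Everything here is illustrative: `CandidateMask16` and this `LoadPacked` are hypothetical stand-ins, and the narrowing uses Vector128.Narrow (truncating) rather than whatever the runtime's LoadPacked128 does internally, so it assumes all anchors are ASCII and relies on later StartsWith verification to discard false positives.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

public static class SharedOffsetPackedSketch
{
    // Packs 16 consecutive chars into 16 bytes (truncating narrow; illustrative only).
    private static Vector128<byte> LoadPacked(ReadOnlySpan<ushort> data, int index) =>
        Vector128.Narrow(
            Vector128.Create(data.Slice(index, 8)),
            Vector128.Create(data.Slice(index + 8, 8)));

    // Returns a 16-bit mask with bit i set when position start + i is a candidate for either value.
    // The caller must ensure start + sharedOffset + 16 <= text.Length.
    public static uint CandidateMask16(
        ReadOnlySpan<char> text, int start, int sharedOffset,
        char v0First, char v0Second, char v1First, char v1Second)
    {
        ReadOnlySpan<ushort> data = MemoryMarshal.Cast<char, ushort>(text);

        // Two loads in total: position 0 and the single shared second-character offset.
        Vector128<byte> input0 = LoadPacked(data, start);
        Vector128<byte> inputCh2Shared = LoadPacked(data, start + sharedOffset);

        Vector128<byte> r0 =
            Vector128.Equals(input0, Vector128.Create((byte)v0First)) &
            Vector128.Equals(inputCh2Shared, Vector128.Create((byte)v0Second));
        Vector128<byte> r1 =
            Vector128.Equals(input0, Vector128.Create((byte)v1First)) &
            Vector128.Equals(inputCh2Shared, Vector128.Create((byte)v1Second));

        // OR the per-value results: a set bit means either value might start there.
        return (r0 | r1).ExtractMostSignificantBits();
    }
}
```

For "hello" and "world" with a shared offset of 3, both second anchors are 'l', and a haystack with "hello" at index 5 sets only bit 5 of the mask. Compared with per-value offsets, the shared offset trades a possibly less selective second anchor for one fewer load per iteration and no extra branch.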

@stephentoub
Member

@EgorBot -amd -intel -arm

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Buffers;

BenchmarkSwitcher.FromAssembly(typeof(TwoStringSearchValuesBenchmark).Assembly).Run(args);

[MemoryDiagnoser]
public class TwoStringSearchValuesBenchmark
{
    private string _haystackBestCase = default!;
    private string _haystackWorstCase = default!;
    private SearchValues<string> _twoStringsOrdinal = default!;
    private SearchValues<string> _twoStringsIgnoreCase = default!;
    private SearchValues<string> _worstCaseOrdinal = default!;

    [Params(100, 1000, 10000)]
    public int HaystackLength { get; set; }

    [Params("Early", "Middle", "Late", "None")]
    public string MatchPosition { get; set; } = default!;

    [GlobalSetup]
    public void Setup()
    {
        Random rand = new(42);
        
        // Best case: haystack has no 'h' or 'w' (first chars of "hello"/"world")
        char[] charsBest = new char[HaystackLength];
        for (int i = 0; i < charsBest.Length; i++)
            charsBest[i] = (char)('m' + rand.Next(10)); // m-v range, no h or w
        
        // Worst case: haystack has many 'a's (first char of both needles)
        char[] charsWorst = new char[HaystackLength];
        for (int i = 0; i < charsWorst.Length; i++)
            charsWorst[i] = rand.Next(4) == 0 ? 'a' : (char)('b' + rand.Next(10)); // 25% 'a's

        string needle1Best = "hello";
        string needle2Best = "world";
        string needle1Worst = "abc";
        string needle2Worst = "axy";

        int insertPos = MatchPosition switch
        {
            "Early" => 10,
            "Middle" => HaystackLength / 2,
            "Late" => HaystackLength - 10,
            _ => -1
        };

        if (insertPos >= 0 && insertPos + needle1Best.Length <= HaystackLength)
        {
            needle1Best.AsSpan().CopyTo(charsBest.AsSpan(insertPos));
            needle1Worst.AsSpan().CopyTo(charsWorst.AsSpan(insertPos));
        }

        _haystackBestCase = new string(charsBest);
        _haystackWorstCase = new string(charsWorst);

        _twoStringsOrdinal = SearchValues.Create([needle1Best, needle2Best], StringComparison.Ordinal);
        _twoStringsIgnoreCase = SearchValues.Create([needle1Best, needle2Best], StringComparison.OrdinalIgnoreCase);
        _worstCaseOrdinal = SearchValues.Create([needle1Worst, needle2Worst], StringComparison.Ordinal);
    }

    [Benchmark]
    public int BestCase_Ordinal() => _haystackBestCase.AsSpan().IndexOfAny(_twoStringsOrdinal);

    [Benchmark]
    public int BestCase_IgnoreCase() => _haystackBestCase.AsSpan().IndexOfAny(_twoStringsIgnoreCase);

    [Benchmark]
    public int WorstCase_Ordinal() => _haystackWorstCase.AsSpan().IndexOfAny(_worstCaseOrdinal);
}

@stephentoub
Member

@copilot, please try uncommenting (and fixing if necessary) the regex code at

// TODO: While some benchmarks benefit from this significantly, others regressed a bit (in particular those with few
, and ensuring that all regex tests still pass.

Contributor Author

Copilot AI commented Dec 31, 2025

@copilot, please try uncommenting (and fixing if necessary) the regex code at https://github.com/dotnet/runtime/blob/ec724071a0808797d40c7a83e5067922fde0d089/src/libraries/System.Text.RegularExpressio...

Enabled in 41b248e. Uncommented the case-sensitive prefix matching code block. All 31,470 regex tests pass (30,469 functional + 1,001 unit).

@stephentoub
Member

@MihuBot benchmark Regex_Industry https://github.com/MihaZupan/performance/tree/compiled-regex-only -medium
