AccessViolationException when calling Process.HandleCount #107503

Open
benaadams opened this issue Sep 7, 2024 · 10 comments
Labels
area-System.IO bug needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration tenet-reliability Reliability/stability related issue (stress, load problems, etc.)
Milestone

Comments

@benaadams
Member

Description

Calling Process.HandleCount on a cached Process instance can cause an AccessViolationException on Linux (WSL).

Reproduction Steps

https://github.com/prometheus-net/prometheus-net/blob/60e9106a83ff1274fec0022c37366f04822b1d1b/Prometheus/DotNetStats.cs#L74-L97

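using System;
using System.Diagnostics;
using Prometheus;

public sealed class DotNetStats
{
    // Process instance cached for the lifetime of the collector.
    private readonly Process _process;
    private Gauge _openHandles;

    private DotNetStats(IMetricFactory metricFactory)
    {
        _process = Process.GetCurrentProcess();
        _openHandles = metricFactory.CreateGauge("process_open_handles", "Number of open handles");
    }

    private readonly object _updateLock = new object();
    public void UpdateMetrics()
    {
        try
        {
            lock (_updateLock)
            {
                // Discard cached process info so HandleCount is read again.
                _process.Refresh();

                _openHandles.Set(_process.HandleCount);
            }
        }
        catch (Exception)
        {
            // Exceptions are swallowed here; this does not stop the AccessViolationException.
        }
    }
}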

Expected behavior

Calling UpdateMetrics sets the number of process handles

Actual behavior

Infrequently (after about 6 hours of being called every 5 seconds), the process crashes with:

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
   at System.IO.Enumeration.FileSystemEntry.Initialize(System.IO.Enumeration.FileSystemEntry ByRef, DirectoryEntry, System.ReadOnlySpan`1<Char>, System.ReadOnlySpan`1<Char>, System.ReadOnlySpan`1<Char>, System.Span`1<Char>)
   at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Collections.Generic.List`1[[System.__Canon, System.Private.CoreLib, Version=8.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]..ctor(System.Collections.Generic.IEnumerable`1<System.__Canon>)
   at System.Diagnostics.Process.EnsureHandleCountPopulated()
   at Prometheus.DotNetStats.UpdateMetrics()
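
For anyone trying to reproduce this outside prometheus-net, a standalone loop along these lines approximates the call pattern above. This is only a sketch: `HandleCountLoop`, `Run`, and its parameters are hypothetical names, and given how rarely the crash occurs, there is no guarantee that it reproduces anything.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class HandleCountLoop
{
    // Repeatedly refreshes a cached Process and reads HandleCount,
    // mirroring the Refresh() + HandleCount sequence in UpdateMetrics.
    // Returns the last handle count that was read.
    public static int Run(int iterations, TimeSpan delay)
    {
        using Process process = Process.GetCurrentProcess();
        int last = 0;
        for (int i = 0; i < iterations; i++)
        {
            process.Refresh();           // drop cached process info
            last = process.HandleCount;  // on Linux this enumerates /proc/{pid}/fd
            Thread.Sleep(delay);
        }
        return last;
    }
}
```

Calling `HandleCountLoop.Run(int.MaxValue, TimeSpan.FromSeconds(5))` would match the reported 5-second interval; a shorter delay may or may not make the failure more likely.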

Regression?

Not sure, as this is the first time I have run it on WSL; I haven't had any issues on Windows or seen any complaints from Linux or macOS users.

Known Workarounds

No workaround, as AccessViolationException always bypasses catch blocks and crashes the process.

Configuration

WSL on Windows 11 23H2 (OS Build 22631.4112)

.NET SDK:
 Version:           8.0.108
 Commit:            665a05cea7
 Workload version:  8.0.100-manifests.109ff937

Runtime Environment:
 OS Name:     ubuntu
 OS Version:  22.04
 OS Platform: Linux
 RID:         ubuntu.22.04-x64
 Base Path:   /usr/lib/dotnet/sdk/8.0.108/

.NET workloads installed:
 Workload version: 8.0.100-manifests.109ff937
There are no installed workloads to display.

Host:
  Version:      8.0.8
  Architecture: x64
  Commit:       08338fcaa5

.NET SDKs installed:
  8.0.108 [/usr/lib/dotnet/sdk]

.NET runtimes installed:
  Microsoft.AspNetCore.App 8.0.8 [/usr/lib/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 8.0.8 [/usr/lib/dotnet/shared/Microsoft.NETCore.App]

Other information

Aside: creating strings for all the filenames in /proc/{pid}/fd just to count the files doesn't seem very efficient:

_processInfo.HandleCount = Directory.GetFiles(path, "*", SearchOption.TopDirectoryOnly).Length;
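
To illustrate the efficiency point, here is one possible way to count entries without materializing a string per file name, using `System.IO.Enumeration.FileSystemEnumerable<TResult>`. This is a sketch and not what the runtime does; `EntryCounter` and `CountFiles` are hypothetical names. It still goes through `FileSystemEnumerator`, so it should not be expected to avoid the crash.

```csharp
using System;
using System.IO;
using System.IO.Enumeration;

static class EntryCounter
{
    // Counts non-directory entries in a directory. The transform returns a
    // constant instead of building a path string for every entry.
    public static int CountFiles(string directory)
    {
        var entries = new FileSystemEnumerable<bool>(
            directory,
            (ref FileSystemEntry entry) => true,
            new EnumerationOptions { AttributesToSkip = 0 })
        {
            // Skip directories, roughly matching what Directory.GetFiles returns.
            ShouldIncludePredicate = (ref FileSystemEntry entry) => !entry.IsDirectory
        };

        int count = 0;
        foreach (bool _ in entries)
        {
            count++;
        }
        return count;
    }
}
```

For example, `EntryCounter.CountFiles("/proc/self/fd")` would give the count for the current process on Linux.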

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Sep 7, 2024
Contributor

Tagging subscribers to this area: @dotnet/area-system-diagnostics-process
See info in area-owners.md if you want to be subscribed.

@benaadams
Member Author

Though I suppose it's more an issue with Directory.GetFiles on Linux (WSL) than with Process itself, as Process only determines which directory is queried.

@jeffhandley jeffhandley added this to the Future milestone Sep 16, 2024
@jeffhandley jeffhandley added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner labels Sep 16, 2024
@Martin-Molinero

Martin-Molinero commented Feb 4, 2025

Getting similar errors on Linux with .NET 9; some stack traces are below. File enumeration seems unstable:

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
   at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Linq.Enumerable+IEnumerableWhereSelectIterator`2[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.Collections.Generic.KeyValuePair`2[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ToArray()

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
   at Interop+Sys+DirectoryEntry.GetName(System.Span`1<Char>)
   at System.IO.Enumeration.FileSystemEnumerableFactory+<>c__DisplayClass2_0.<UserFiles>b__1(System.IO.Enumeration.FileSystemEntry ByRef)
   at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
   at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Collections.Generic.List`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]..ctor(System.Collections.Generic.IEnumerable`1<System.__Canon>)
   at System.IO.Directory.GetFiles(System.String, System.String, System.IO.EnumerationOptions)

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
   at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Linq.Enumerable+IEnumerableWhereIterator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ToArray()

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
   at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Linq.Enumerable+IEnumerableWhereIterator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ToArray()

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
   at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Linq.Enumerable+IEnumerableWhereIterator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].ToArray()

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
   at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
   at System.Collections.Generic.List`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]..ctor(System.Collections.Generic.IEnumerable`1<System.__Canon>)

@jkotas
Member

jkotas commented Feb 5, 2025

> Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
>    at System.SpanHelpers.NonPackedIndexOfValueType

This looks like some subtle bug in the unsafe code in NonPackedIndexOfValueType. cc @MihaZupan @EgorBo

@Martin-Molinero Any chance you can share a crash dump for this failure? The best way to share a crash dump with us is to open an issue on https://developercommunity.visualstudio.com/ and attach the crash dump to it as a private attachment. Unlike GitHub issues, https://developercommunity.visualstudio.com/ does not have problematic limits on the size of attachments, and the attachments are not publicly visible.

@jkotas jkotas added the tenet-reliability Reliability/stability related issue (stress, load problems, etc.) label Feb 5, 2025
@MihaZupan
Member

MihaZupan commented Feb 5, 2025

Given how commonly used the helper is (NonPackedIndexOfValueType is the implementation for IndexOf(byte)), I find it suspicious that it's only showing up from System.IO.Enumeration.

The only use of IndexOf(byte) I see around those stacks is in the Unix Interop DirectoryEntry implementation here:

internal byte* Name;
internal int NameLength;
internal NodeType InodeType;
internal const int NameBufferSize = 256; // sizeof(dirent->d_name) == NAME_MAX + 1

ReadOnlySpan<byte> nameBytes = (NameLength == -1)
    // In this case the struct was allocated via struct dirent *readdir(DIR *dirp);
    ? new ReadOnlySpan<byte>(Name, new ReadOnlySpan<byte>(Name, NameBufferSize).IndexOf<byte>(0))
    : new ReadOnlySpan<byte>(Name, NameLength);

I bet this should be using MemoryMarshal.CreateReadOnlySpanFromNullTerminated instead, and we're AVing because the Name isn't a full 256 byte buffer.

Looks like we were using Marshal.PtrToStringAnsi here before dotnet/coreclr#21622

// We use Marshal.PtrToStringAnsi on the managed side, which takes a pointer to

#if HAVE_DIRENT_NAME_LEN
    outputEntry->NameLength = entry->d_namlen;
#else
    outputEntry->NameLength = -1; // sentinel value to mean we have to walk to find the first \0
#endif
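
To make the suggestion above concrete, the two ways of reading the name can be sketched side by side. This is an illustration, not the runtime's code: `DirentNameSketch` and its members are hypothetical, the buffer here really is 256 bytes (so both paths are safe), and compiling it requires `AllowUnsafeBlocks`.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

static unsafe class DirentNameSketch
{
    private const int NameBufferSize = 256;

    // Pattern quoted above: assume 256 readable bytes, then search for '\0'.
    // Only safe if the memory behind 'name' really spans NameBufferSize bytes.
    public static ReadOnlySpan<byte> ViaIndexOf(byte* name) =>
        new ReadOnlySpan<byte>(name, new ReadOnlySpan<byte>(name, NameBufferSize).IndexOf((byte)0));

    // Suggested alternative: no assumed length, scan up to the terminator.
    public static ReadOnlySpan<byte> ViaNullTerminated(byte* name) =>
        MemoryMarshal.CreateReadOnlySpanFromNullTerminated(name);

    // Reads the same zero-filled 256-byte native buffer through both paths.
    public static (string ViaIndexOf, string ViaNullTerminated) Demo()
    {
        byte* buffer = (byte*)NativeMemory.AllocZeroed((nuint)NameBufferSize);
        try
        {
            Encoding.UTF8.GetBytes("fd-entry").AsSpan().CopyTo(new Span<byte>(buffer, NameBufferSize));
            return (Encoding.UTF8.GetString(ViaIndexOf(buffer)),
                    Encoding.UTF8.GetString(ViaNullTerminated(buffer)));
        }
        finally
        {
            NativeMemory.Free(buffer);
        }
    }
}
```

With a full-size buffer both methods return the same bytes; the difference only matters when fewer than `NameBufferSize` bytes are actually readable behind the pointer.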

Contributor

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

@Martin-Molinero

Martin-Molinero commented Feb 5, 2025

Hey! Sorry, I tried quite a few times today to reproduce it with dump generation enabled and, guess what, I can't reproduce it. If I see it again I'll share a dump ASAP 💯

Some extra details: I have seen it happen when enumerating on the order of 100k files, on a big host (RAM/cores) with a lot of activity, so it sounds like an extreme race condition.

@jkotas
Member

jkotas commented Feb 5, 2025

> we're AVing because the Name isn't a full 256 byte buffer.

How would that happen? Linux has HAVE_READDIR_R defined and we should be allocating the buffer of sufficient size ourselves for it here:

int size = Interop.Sys.GetReadDirRBufferSize();
_entryBuffer = size > 0 ? ArrayPool<byte>.Shared.Rent(size) : null;
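
As a small illustration of why the rented buffer should be large enough: `ArrayPool<byte>.Shared.Rent` returns an array of at least the requested length (often larger). The sketch below just demonstrates that property; `EntryBufferSketch` and `RentedLength` are hypothetical names, and the requested size is an arbitrary stand-in for whatever `GetReadDirRBufferSize` returns.

```csharp
using System;
using System.Buffers;

static class EntryBufferSketch
{
    // Rents a buffer of at least 'requestedSize' bytes, reports its actual
    // length, and returns it to the shared pool.
    public static int RentedLength(int requestedSize)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(requestedSize);
        try
        {
            return buffer.Length;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}
```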

@benaadams
Member Author

benaadams commented Feb 5, 2025

Could it be an unaligned vector read crossing a page boundary?

I had this issue previously when using IndexOf to search for the null terminators of strings.

(The terminator is within the page, but the vector read spills over.)

@MihaZupan
Member

MihaZupan commented Feb 5, 2025

Yep, that was my guess. It's resolvable by using CreateReadOnlySpanFromNullTerminated instead, which ensures that vector reads don't cross page boundaries (or 16B).
But the current logic that uses IndexOf would only be problematic if the span length is incorrect (we'll never read past the length).
To Jan's question, I'm not sure how that could happen yet, but it should presumably be visible in a dump.
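
The page-boundary concern discussed above comes down to simple address arithmetic. The following sketch assumes a 4096-byte page (an assumption, not something stated in the thread) and uses hypothetical names; it only shows when a fixed-width read starting at a given address would extend into the next page.

```csharp
using System;

static class PageBoundarySketch
{
    // Returns true if reading 'readSize' bytes starting at 'address' extends
    // past the end of the page containing 'address'.
    // 'pageSize' must be a power of two; 4096 is an assumed default.
    public static bool ReadCrossesPage(ulong address, int readSize, int pageSize = 4096)
    {
        ulong offsetInPage = address & (ulong)(pageSize - 1);
        return offsetInPage + (ulong)readSize > (ulong)pageSize;
    }
}
```

For instance, a 32-byte vector read starting 16 bytes before the end of a page (`ReadCrossesPage(0x1FF0, 32)`) crosses into the next page, while one starting 32 bytes before the end (`ReadCrossesPage(0x1FE0, 32)`) does not. A crossing read is only a problem if the next page is not mapped, which is why the span length being correct matters.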
