Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System.Drawing.Common-8.0.8 upgrade to 9.0.0 crashes my application #111965

Open
boennhoff opened this issue Dec 4, 2024 · 18 comments
Open

System.Drawing.Common-8.0.8 upgrade to 9.0.0 crashes my application #111965

boennhoff opened this issue Dec 4, 2024 · 18 comments
Assignees
Labels
area-System.Reflection needs-author-action An issue or pull request that requires more info or actions from the author. needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration no-recent-activity
Milestone

Comments

@boennhoff
Copy link

.NET version

target: net8.0-windows
installed dotnet sdk: 9.0.100, 8.0.404 & 8.0.206 (for whatever reason I have multiple)

Did it work in .NET Framework?

Yes

Did it work in any of the earlier releases of .NET Core or .NET 5+?

not tested

Issue description

Recently I had warnings about vulnerable package references, so I upgraded all possible Microsoft packages and tested my application thoroughly, when I stumbled over a very strange bug:

Running in a debug session, my application opens one of our forms and as soon as I load data the complete Debug session kills itself without any notice. No break on any exception, no log output, the process is just gone, and VS sits there as if no debug session was started.

Steps to reproduce

After a git bisect I determined the root cause: when changing a csproj file from

<ItemGroup>
  [..]
  <PackageReference Include="System.Drawing.Common" Version="8.0.8" />
  [..]
</ItemGroup>

to

<ItemGroup>
  [..]
  <PackageReference Include="System.Drawing.Common" Version="9.0.0" />
  [..]
</ItemGroup>

the bug appears. In the same commit I also upgraded many other packages from 8.0.* to 9.0.0, like System.Configuration.ConfigurationManager and System.Management, but only with System.Drawing.Common-9.0.0 this bug happens.

We are using System.Drawing.Common mainly for Image manipulation (Resize, etc.) and Bitmap creation, the UI itself is 3rd-party (managed through COM interops) not Windows.Forms.

I am a bit stuck in finding more information on this, or better describing this issue, I am posting here because the NuGet-package links here. Does anyone have an idea on how to find out more?

My workaround is of course, to not upgrade this package, but this can't be the solution in the long-run...

@boennhoff boennhoff added the untriaged New issue has not been triaged by the area owner label Dec 4, 2024
@lonitra
Copy link
Member

lonitra commented Dec 4, 2024

@boennhoff are you able to provide a call stack for your crash or repro steps? Having these is necessary for our investigation.

@boennhoff
Copy link
Author

@lonitra I will try: can you tell me how to get a Stacktrace and a dump, when the .NET runtime crashes? Can this be done within VS? Am I missing a setting, maybe disable "MyCode only"?

About repro: Since our application is an extension to another application that needs to run, I first need to tackle down the problem to the exact call that leads to the crash for an idea on how to reproduce the bug without the external application.

@Zheng-Li01
Copy link
Member

@boennhoff
Copy link
Author

boennhoff commented Dec 9, 2024

@Zheng-Li01 thanks, but I did not found an obvious way to create a dump on crash during Debug in VS, so I ran it without debugging and attached ProcDump (-ma -t) to create a dump before termination.

I tried to extract some information from it, but to no avail. Since it's a full dump I can't disclose it publicly; how can I share it with Microsoft (~83MiB zipped)?

What I forgot to mention yet is the exit code/termination cause in the Debug output: 0x0..05 Access Violation, it's a simple one-liner

@boennhoff
Copy link
Author

boennhoff commented Dec 9, 2024

I found the culprit, it was a stack overflow in a recursive method in our code. It was triggered by a change in class System.Drawing.Graphics obviously done between 8.0.8 and 9.0.0.

It looks like new/changed internal fields have a type we did not check for yet, passing an object of type Windows.Win32.Graphics.GdiPlus.GpGraphics* was one of the baddies...

The fix was applied to this redacted piece of code:

private void RecursiveMethod(object obj) {
  // collect object info or continue below...
  ...

  // dive into object fields
  var fields = obj.GetType().GetFields(BindingFlags.NonPublic | BindingFlags.Instance);
  foreach (var field in fields) {
    var fieldObject = field.GetValue(obj);
    if (fieldObject != null
        && !field.IsStatic
        && field.FieldType != typeof(IntPtr)
        && field.FieldType != typeof(Pointer)
        && field.FieldType != typeof(void*)  // <-- this line fixed it!
        && field.FieldType != typeof(void)) {
      RecursiveMethod(fieldObject);
    }
  }
}

I also implemented another check for a maximum depth to circumvent such a problem in the future. The method itself feels hacky, and I actually dislike it, but we found no other way around yet.

Either way, shouldn't the reaction of VS be different with problems like this? An Access Violation should not the be the outcome, or should it? Shall I leave this issue open for further analysis (*.dmp upload?) or close it?

@merriemcgaw
Copy link
Member

It's worth having the *.dmp upload just in case we do need to investigate further. You're right - a dev time Access Violation is not something we really want hanging around! Thank you so much!

@JeremyKuhne
Copy link
Member

@boennhoff ultimately this isn't something we can really fix, the AV is because you were reflecting into and operating on internals. If there is some scenario where you could benefit from additional public API we can discuss it.

@JeremyKuhne JeremyKuhne removed the untriaged New issue has not been triaged by the area owner label Jan 8, 2025
@boennhoff
Copy link
Author

boennhoff commented Jan 28, 2025

@merriemcgaw I would upload the *.dmp.zip, but this would need to be done securely not publicly. Does your employer provide any means here to do that? Upload portal anyone?

@JeremyKuhne Of course the problem was our fault, but isn't there a better way to handle such a case? Shouldn't VS at least break in the recursive method instead of killing the debug session? Pointing to the root cause of the AV?

To the question of a possible public API, I would really like to discuss a cleaner solution:
We are generating a non-runtime based object hash (its content), meaning the hash should stay stable over several runs of the application. Our approach is to recursively collect hashes of primitives and strings with GetHashCode() while avoiding runtime based fields.
It shall be fast and predict if we could reuse a formerly generated output from the hashed object. So there is no need in cryptography safety or high collision protection, worst case is extra processing time & additional cache storage. Currently the resulting hash has 64 bits.
This solution works for all kind of different implementations, and besides being hacky we had no real problems with it up to now.
I hope my explanation can be understood. So, is there a public API we could use for that, or can you think of one?

@JeremyKuhne
Copy link
Member

@boennhoff

Generating a hash via recursive reflection and opt-outs (rather than opt-in) is somewhat risky. That said, it seems to me like there might be an underlying issue in reflection- it shouldn't AV. I would presume you can write a standalone repro for this by reflecting on a simple type with a pointer field in it. If you can do that it would be really helpful. I'll find and tag the right person for this on the runtime.

@boennhoff
Copy link
Author

boennhoff commented Jan 28, 2025

@JeremyKuhne Well, that sounds doable: I'll take the challenge! But give me a few days, won't make it this week for sure...

We opted-in in the first place but missed a few unknown types and had good experience and stable hashes with the opt-out way. It outperformed other techniques we tried like different serialization variants in speed and hash stability, namely because of the internal fields.

I hope the risky part is prevented now by the recursion depth check I added.

@steveharter
Copy link
Member

steveharter commented Jan 29, 2025

@JeremyKuhne can you transfer this to runtime\reflection and assign it me. Thans. It appears GetValue() is AVing on a native pointer.

Either way, shouldn't the reaction of VS be different with problems like this? An Access Violation should not the be the outcome, or should it? Shall I leave this issue open for further analysis (*.dmp upload?) or close it?

Generating a hash via recursive reflection and opt-outs (rather than opt-in) is somewhat risky. That said, it seems to me like there might be an underlying issue in reflection- it shouldn't AV.

@JeremyKuhne JeremyKuhne transferred this issue from dotnet/winforms Jan 29, 2025
@dotnet-policy-service dotnet-policy-service bot added untriaged New issue has not been triaged by the area owner and removed no-recent-activity labels Jan 29, 2025
@JeremyKuhne JeremyKuhne added area-System.Reflection and removed area-System.Drawing untriaged New issue has not been triaged by the area owner labels Jan 29, 2025
@steveharter
Copy link
Member

I would presume you can write a standalone repro for this by reflecting on a simple type with a pointer field in it.

Well, that sounds doable: I'll take the challenge! But give me a few days, won't make it this week for sure...

I'll keep this issue open to investigate that once it is ready.

It looks like new/changed internal fields have a type we did not check for yet, passing an object of type Windows.Win32.Graphics.GdiPlus.GpGraphics* was one of the baddies...

That was done here: https://github.com/dotnet/winforms/pull/10628/files#diff-90928a5d955b1fe1be1036c7d4eed3056b0981abc13c512e48f0bf88065249e0R292

-    internal IntPtr NativeGraphics { get; private set; }
+    internal GpGraphics* NativeGraphics { get; private set; }

which means the FieldType is now typeof(GpGraphics*) instead of typeof(IntPtr)

Also in the original description here, I'm not sure why the added LOC

        && field.FieldType != typeof(void*)  // <-- this line fixed it!

would have prevented the AV unless the code not shown attempted to access the native memory and the pointer is not valid or the reading continued too far for example.

@steveharter steveharter added the needs-author-action An issue or pull request that requires more info or actions from the author. label Feb 5, 2025
@steveharter
Copy link
Member

ping @boennhoff on repro.

@boennhoff
Copy link
Author

boennhoff commented Feb 20, 2025

Hi again @steveharter @JeremyKuhne,
sorry for the delay, work didn't left time for this follow-up.

I created a stripped down version of the problem, but I am not able to reproduce the actual VS crash issue with it anymore. VS halts correctly and I get a proper System.StackOverflowException now. I might have stripped down too much, it's just the bitmap drawing and hashing now...

I wonder if it might be related to our interaction with a COM-Interop: The debug session crashes (still, with latest System.Drawing.Common-9.0.2 and latest VS) when the hashing code executes in the thread of an event fired by a COM-Interop... Maybe the involved RCW changes VS behavior here. But I can't share the COM-Interop and the proprietary application behind it, and I have no idea yet how to mock it.

The PoC-repository actually just proofs that we had a bug triggered by a change in your libraries. I am a bit stuck now.

But I am still able to share the crash memory dump (*.dmp.zip), just not publicly. But if you could provide me an upload-link to a private space?

@dotnet-policy-service dotnet-policy-service bot added the needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration label Feb 20, 2025
@dotnet-policy-service dotnet-policy-service bot removed needs-author-action An issue or pull request that requires more info or actions from the author. no-recent-activity labels Feb 20, 2025
@carlossanlop
Copy link
Member

But I am still able to share the crash memory dump (*.dmp.zip), just not publicly. But if you could provide me an upload-link to a private space?

@boennhoff please upload the zip file to your OneDrive, then share the link to that zip file with @JeremyKuhne via email or any other private means of communication.

@JeremyKuhne
Copy link
Member

This should go to @steveharter. One other way to do this securely is to open a VS Feedback ticket, where you can safely share things like this and provide the link to your ticket. https://developercommunity.visualstudio.com/

@ericstj ericstj added this to the 10.0.0 milestone Mar 7, 2025
@ericstj ericstj added the needs-author-action An issue or pull request that requires more info or actions from the author. label Mar 7, 2025
Copy link
Contributor

This issue has been marked needs-author-action and may be missing some important information.

Copy link
Contributor

This issue has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.Reflection needs-author-action An issue or pull request that requires more info or actions from the author. needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration no-recent-activity
Projects
None yet
Development

No branches or pull requests

8 participants