[Bug]: Another OCIO related crash #663

Status: Open · Label: bug (Something isn't working)

kenmcgaugh (Contributor) opened this issue on Jan 21, 2025 · 1 comment
What happened?

I'm now experiencing another OCIO-related crash in both the commercial RV-2024.2.0 and OpenRV (latest) under macOS. It always crashes in an IPGraph Eval thread, and it looks like a FrameBuffer member is being accessed after the FrameBuffer instance has been deleted. I can prevent the crash by disabling caching with the "-nc" command-line option. Attached is a repro with instructions on how to reproduce.

rvocio_crash.zip

List all the operating system versions where this is happening

macOS 15.2

On what computer hardware is this happening?

MacBook Pro (Apple M1 Pro)

Relevant console log output

Here is the most common stack trace:

* thread #46, name = 'IPGraph Eval #2', stop reason = EXC_BAD_ACCESS (code=1, address=0x37e8e87d8b480f04)
  * frame #0: 0x0000000112f3aa9a libTwkFB.dylib`TwkFB::FrameBuffer::identifier() const + 26
    frame #1: 0x0000000112f3ab79 libTwkFB.dylib`TwkFB::FrameBuffer::identifier() const + 249
    frame #2: 0x0000000112f3ab79 libTwkFB.dylib`TwkFB::FrameBuffer::identifier() const + 249
    frame #3: 0x0000000112f3ab79 libTwkFB.dylib`TwkFB::FrameBuffer::identifier() const + 249
    frame #4: 0x0000000112f3ab79 libTwkFB.dylib`TwkFB::FrameBuffer::identifier() const + 249
    frame #5: 0x0000000112f3ab79 libTwkFB.dylib`TwkFB::FrameBuffer::identifier() const + 249
    frame #6: 0x0000000100156718 RV`IPCore::Shader::operator<<(std::__1::basic_ostream<char, std::__1::char_traits<char>>&, IPCore::Shader::ImageOrFB const&) + 152
    frame #7: 0x00000001001649f0 RV`IPCore::Shader::Expression::outputHash(std::__1::basic_ostream<char, std::__1::char_traits<char>>&) const + 256
    frame #8: 0x00000001001b63e3 RV`IPCore::IPImage::computeRenderIDs() const + 691
    frame #9: 0x00000001001b7938 RV`IPCore::IPImage::computeRenderIDRecursive() + 40
    frame #10: 0x00000001001b7938 RV`IPCore::IPImage::computeRenderIDRecursive() + 40
    frame #11: 0x00000001001b7938 RV`IPCore::IPImage::computeRenderIDRecursive() + 40
    frame #12: 0x00000001001b7938 RV`IPCore::IPImage::computeRenderIDRecursive() + 40
    frame #13: 0x00000001001b7938 RV`IPCore::IPImage::computeRenderIDRecursive() + 40
    frame #14: 0x00000001001cfae0 RV`IPCore::IPGraph::evaluate(int, IPCore::IPNode::ThreadType, unsigned long) + 256
    frame #15: 0x00000001001d057a RV`IPCore::IPGraph::evalThreadMain(IPCore::IPGraph::EvalThreadData*) + 1130
    frame #16: 0x00000001001c9320 RV`IPCore::evalThreadTrampoline(IPCore::IPGraph::EvalThreadData*) + 272
    frame #17: 0x000000010b3709a7 libstl_ext.dylib`stl_ext::thread_group::worker_jump() + 791
    frame #18: 0x000000010b37040d libstl_ext.dylib`stl_ext::thread_group::thread_main(void*) + 173
    frame #19: 0x00007ff80935c253 libsystem_pthread.dylib`_pthread_start + 99
    frame #20: 0x00007ff809357bef libsystem_pthread.dylib`thread_start + 15

Environment variables

No response

Extra information

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
kenmcgaugh added the bug (Something isn't working) label on Jan 21, 2025
kenmcgaugh (Contributor, Author) commented:

To add a bit more context: the code that triggers the crash is in test_ocio_setup.py, around line 125, within the "create_display_all_action" method. I have found that disabling caching before the OCIO nodes are modified and then restoring the caching state afterwards prevents the crash, as sketched below. However, it does make changing the display/view less interactive.
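For reference, this is roughly the shape of that workaround. It is a minimal sketch, assuming RV's Python commands API (rv.commands.cacheMode / setCacheMode and the CacheOff constant); update_ocio_nodes is a hypothetical placeholder for the OCIO node changes made inside create_display_all_action, not part of the attached repro:

    # Minimal workaround sketch: disable caching while the OCIO nodes are
    # modified, then restore whatever cache mode was active beforehand.
    # update_ocio_nodes is a placeholder for the actual node edits.
    from rv import commands

    def change_display_view_safely(update_ocio_nodes):
        previous_mode = commands.cacheMode()       # remember the current caching state
        commands.setCacheMode(commands.CacheOff)   # keep eval threads from caching mid-edit
        try:
            update_ocio_nodes()                    # modify the OCIO nodes here
        finally:
            commands.setCacheMode(previous_mode)   # restore the original cache mode

As noted above, the trade-off is that the caches have to be rebuilt afterwards, so switching the display/view becomes less interactive.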

I spent quite a bit of time trying to track down the source of this, but every attempt either didn't work or caused a deadlock. So I'm really uncertain whether this is directly caused by the OCIOIPNode, or whether that node simply produces shaders complex enough for a pre-existing race condition to surface.
