ilevkivskyi
Member

@ilevkivskyi ilevkivskyi commented Oct 12, 2025

Ref #933

Instead of processing SCCs layer by layer, we now process each SCC as soon as it is ready. This logic is easier to adapt to parallel processing and should give us more benefit from parallelization (since more SCCs can be processed in parallel). I tried to keep the order with a single worker stable and very similar (or maybe even identical) to the current order.

Note that I have already added some methods to the build manager to emulate parallel processing, but they are not parallel yet.
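
To illustrate the idea (a simplified sketch, not the code in this PR): the difference from layer-by-layer processing is that an SCC is queued the moment its last dependency finishes, rather than waiting for its whole layer. The function name, the `deps` mapping, and the string SCC ids below are assumptions made for the example; mypy's real data structures differ.

```python
from collections import deque


def process_when_ready(deps: dict[str, set[str]]) -> list[str]:
    """Return a single-worker processing order for a DAG of SCCs.

    `deps` maps each (hypothetical) SCC id to the ids it depends on.
    """
    # Number of unfinished dependencies per SCC.
    remaining = {scc: len(d) for scc, d in deps.items()}
    # Reverse edges: which SCCs are waiting on a given SCC.
    dependents: dict[str, list[str]] = {scc: [] for scc in deps}
    for scc, d in deps.items():
        for dep in d:
            dependents[dep].append(scc)

    # Sorting keeps the order stable between runs.
    ready = deque(sorted(scc for scc, n in remaining.items() if n == 0))
    order: list[str] = []
    while ready:
        scc = ready.popleft()
        order.append(scc)  # "process" the SCC
        for child in sorted(dependents[scc]):
            remaining[child] -= 1
            if remaining[child] == 0:
                # All dependencies are done, so the SCC is ready right away.
                ready.append(child)

    if len(order) != len(deps):
        raise ValueError("dependency cycle between SCCs")
    return order


deps = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"b", "c"}, "e": set()}
print(process_when_ready(deps))
```

With several workers, everything sitting in `ready` at a given moment could be handed out at once, which is where the extra parallelism would come from.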

@ilevkivskyi
Member Author

Hm, the mypy_primer output looks unexpected. The previous errors look like obvious false positives, and I am not sure what caused them. I will try to dig a bit deeper tomorrow, but if I don't find anything, I think it is fine to ignore this (since, as I said, the errors were false positives).

Collaborator

@JukkaL JukkaL left a comment

It's exciting to see the foundation for parallel checking moving forward!

Not a full review -- I did a quick pass only. I wonder if the missing errors might indicate that some modules aren't processed -- it seems worth double-checking that no modules are silently ignored, just in case.

mypy/build.py Outdated
    if fresh:
        done = fresh
    else:
        done, processing = manager.get_done(graph)
Collaborator

Add comment explaining that this processes some SCCs / waits for some SCCs to be done? When reading this for the first time, I thought this would just get whatever is already done right now, but it actually does some work. Or maybe the method name could be more descriptive.

Maybe rename processing to has_more or similar -- at first I was confused by it.

If we were doing parallel processing, would this perform a busy loop while waiting for things to happen? Or would get_done wait until something has finished processing?
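
To make the naming suggestion concrete, here is a rough sketch of how the calling loop might read. Everything in it is hypothetical: `EmulatedManager`, `wait_for_done`, and `has_more` are illustrative names, not the API in this PR.

```python
class EmulatedManager:
    """Hypothetical stand-in for the build manager.

    It processes one queued SCC per call, emulating a single worker.
    """

    def __init__(self) -> None:
        self.queue: list[str] = []

    def submit(self, sccs: list[str]) -> None:
        self.queue.extend(sccs)

    def wait_for_done(self) -> tuple[list[str], bool]:
        """Process the next queued SCC and report it as finished.

        This does real work rather than just reading state; a parallel
        version would block here until some worker finishes.
        Returns (finished SCCs, whether more SCCs are still pending).
        """
        if not self.queue:
            return [], False
        scc = self.queue.pop(0)
        return [scc], bool(self.queue)


def drain(manager: EmulatedManager) -> list[str]:
    finished: list[str] = []
    has_more = True
    while has_more:
        done, has_more = manager.wait_for_done()
        finished.extend(done)
    return finished


manager = EmulatedManager()
manager.submit(["scc-1", "scc-2", "scc-3"])
print(drain(manager))
```

A name like `wait_for_done` plus the docstring signals to the reader that the call may process or block, which is the part that was surprising on first read.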

Member Author

If we were doing parallel processing, would this perform a busy loop while waiting for things to happen? Or would get_done wait until something has finished processing?

I was thinking about something like select() waiting on worker sockets. So this will block until some stale SCC(s) are finished (since we know there is nothing else we can make progress on). IIUC select() should not busy loop (and I guess there is also a Windows equivalent for this if we use named pipes).

Btw I think my initial implementation will be Linux/Mac only. I was reading more about it, and the situation on Windows is a bit of a mess. There are already a bunch of other things to worry about, so I guess it is OK to add Windows support later.
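
Roughly what I have in mind, as a toy sketch only: threads and `socket.socketpair()` stand in for real worker processes, and the helper names and SCC ids are made up for the example. It uses the standard `selectors` module, which wraps select() and its platform-specific equivalents.

```python
import selectors
import socket
import threading
import time


def wait_for_finished(sel: selectors.BaseSelector) -> list[str]:
    """Block until at least one worker reports, then return what it sent."""
    results = []
    # select() sleeps in the kernel until a socket is readable: no busy loop.
    for key, _events in sel.select():
        sock = key.fileobj
        results.append(sock.recv(1024).decode())
        sel.unregister(sock)
        sock.close()
    return results


def fake_worker(sock: socket.socket, scc_id: str, delay: float) -> None:
    time.sleep(delay)  # pretend to type-check the SCC
    sock.sendall(scc_id.encode())
    sock.close()


def run_demo() -> list[str]:
    sel = selectors.DefaultSelector()
    threads = []
    for scc_id, delay in [("scc-slow", 0.2), ("scc-fast", 0.05)]:
        parent, child = socket.socketpair()
        sel.register(parent, selectors.EVENT_READ)
        t = threading.Thread(target=fake_worker, args=(child, scc_id, delay))
        t.start()
        threads.append(t)

    finished: list[str] = []
    while sel.get_map():  # some workers are still registered
        finished.extend(wait_for_finished(sel))
    for t in threads:
        t.join()
    sel.close()
    return finished


print(run_demo())
```

The coordinator wakes up only when a worker has written its result, so newly ready SCCs could be scheduled at that point.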

@ilevkivskyi
Member Author

I wonder if the missing errors might indicate that some modules aren't processed -- it seems worth double-checking that no modules are silently ignored, just in case.

Yeah, I was going to clone that repo and do this.

@ilevkivskyi
Member Author

ilevkivskyi commented Oct 13, 2025

@JukkaL

I wonder if the missing errors might indicate that some modules aren't processed -- it seems worth double-checking that no modules are silently ignored, just in case.

Yeah, I was going to clone that repo and do this.

OK, so I checked and I think this is a bug in mypy_primer.

When I check:

  • I see a bunch of obvious false positives (of the kind module has no attribute) on both 1.18.2 and on this PR.
  • I see that on my PR that file is indeed processed (by introducing a type error there).

IIUC the problem is that we are supposed to install scipy-stubs itself before we can run assert_type()-style tests on it. Otherwise we will pick up the scipy package (which it depends on). That said:

  • I am not sure why this PR makes mypy "prefer" some of the stubs without installing them. Maybe this is some subtlety caused by module processing order?
  • I noticed that after I installed scipy-stubs some caches were not invalidated, so I see "Library stubs not installed" after I literally just installed them. I guess it is confusing for mypy to have all three: scipy installed, scipy-stubs installed, and scipy-stubs passed as sources.

If you look at the scipy-stubs CI, they do everything right: they have two separate jobs:

  • One does mypy scipy-stubs only (i.e. type-checks the stubs)
  • Another does pip install . && mypy tests (i.e. tests the installed stubs)

So mypy_primer should do something similar instead of running mypy . (and maybe also for other xxx-stubs repos). cc @hauntsaninja

Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

scipy-stubs (https://github.com/scipy/scipy-stubs)
- tests/datasets/test_utils.pyi:6: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:6: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:9: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:9: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:9: error: Module has no attribute "ascent"  [attr-defined]
- tests/datasets/test_utils.pyi:10: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:10: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:10: error: Module has no attribute "ascent"  [attr-defined]
- tests/datasets/test_utils.pyi:11: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:11: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:11: error: Module has no attribute "ascent"  [attr-defined]
- tests/datasets/test_utils.pyi:14: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:14: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:14: error: Module has no attribute "electrocardiogram"  [attr-defined]
- tests/datasets/test_utils.pyi:15: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:15: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:15: error: Module has no attribute "electrocardiogram"  [attr-defined]
- tests/datasets/test_utils.pyi:16: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:16: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:16: error: Module has no attribute "electrocardiogram"  [attr-defined]
- tests/datasets/test_utils.pyi:19: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:19: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:19: error: Module has no attribute "face"  [attr-defined]
- tests/datasets/test_utils.pyi:20: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:20: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:20: error: Module has no attribute "face"  [attr-defined]
- tests/datasets/test_utils.pyi:21: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:21: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:21: error: Module has no attribute "face"  [attr-defined]
- tests/datasets/test_utils.pyi:24: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:24: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:24: error: Module has no attribute "ascent"  [attr-defined]
- tests/datasets/test_utils.pyi:24: error: Module has no attribute "electrocardiogram"  [attr-defined]
- tests/datasets/test_utils.pyi:24: error: Module has no attribute "face"  [attr-defined]
- tests/datasets/test_utils.pyi:26: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_utils.pyi:26: error: Module has no attribute "clear_cache"  [attr-defined]
- tests/datasets/test_utils.pyi:26: error: Module has no attribute "ascent"  [attr-defined]
- tests/datasets/test_utils.pyi:26: error: Module has no attribute "electrocardiogram"  [attr-defined]
- tests/datasets/test_utils.pyi:26: error: Module has no attribute "face"  [attr-defined]
- tests/datasets/test_fetchers.pyi:7: error: Expression is of type "Any", not "ndarray[tuple[int, int], dtype[unsignedinteger[_8Bit]]]"  [assert-type]
- tests/datasets/test_fetchers.pyi:7: error: Module has no attribute "ascent"  [attr-defined]
- tests/datasets/test_fetchers.pyi:9: error: Expression is of type "Any", not "ndarray[tuple[int], dtype[float64]]"  [assert-type]
- tests/datasets/test_fetchers.pyi:9: error: Module has no attribute "electrocardiogram"  [attr-defined]
- tests/datasets/test_fetchers.pyi:11: error: Expression is of type "Any", not "ndarray[tuple[int, int, int], dtype[unsignedinteger[_8Bit]]]"  [assert-type]
- tests/datasets/test_fetchers.pyi:11: error: Module has no attribute "face"  [attr-defined]
- tests/datasets/test_fetchers.pyi:12: error: Expression is of type "Any", not "ndarray[tuple[int, int, int], dtype[unsignedinteger[_8Bit]]]"  [assert-type]
- tests/datasets/test_fetchers.pyi:12: error: Module has no attribute "face"  [attr-defined]
- tests/datasets/test_fetchers.pyi:13: error: Expression is of type "Any", not "ndarray[tuple[int, int], dtype[unsignedinteger[_8Bit]]]"  [assert-type]
- tests/datasets/test_fetchers.pyi:13: error: Module has no attribute "face"  [attr-defined]
- tests/datasets/test_download_all.pyi:6: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_download_all.pyi:6: error: Module has no attribute "download_all"  [attr-defined]
- tests/datasets/test_download_all.pyi:7: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_download_all.pyi:7: error: Module has no attribute "download_all"  [attr-defined]
- tests/datasets/test_download_all.pyi:8: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_download_all.pyi:8: error: Module has no attribute "download_all"  [attr-defined]
- tests/datasets/test_download_all.pyi:9: error: Expression is of type "Any", not "None"  [assert-type]
- tests/datasets/test_download_all.pyi:9: error: Module has no attribute "download_all"  [attr-defined]
