Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tracking module name overriding #16758

Closed
tungol opened this issue Jan 7, 2024 · 0 comments
Closed

tracking module name overriding #16758

tungol opened this issue Jan 7, 2024 · 0 comments
Labels

Comments

@tungol
Copy link
Contributor

tungol commented Jan 7, 2024

Feature

I'm interested in adding functionality so that mypy will track modules that rename themselves with an explicit __name__, and classes which set an explicit __module__ value. This would then show up in user-facing messages and also affect tooling like stubgen / stubtest. The general motivation is aligning more closely with runtime this way.

Pitch

Some discussion of this has already occurred in this typeshed issue: python/typeshed#11141

And I made a quick proof-of-concept implementation of class-based __module__ tracking. The implementation is not very polished, and not necessarily what I'd like to pursue, but it does work: #16698

Implementing module-level __name__ tracking would require a bigger change, I think, mostly focused on the fullname attribute of classes in mypy/nodes.py. I'm new to the mypy codebase, but after spending some time poking around my general idea right now is something like this:

Right now, a bunch of the node classes keep a _fullname attribute, accessible via the fullname property. I think I can split that into three values: qualname, module_name, and module_id, where module_id is what mypy currently uses, and module_name is usually the same as module_id but can be modified by __name__ and __module__ values. Then the fullname property can be generated on demand, and the static module_id is still available for lookup purposes when it's needed.

I saw that the switch from .fullname() methods to the .fullname property was a concern for API stability, so I'm trying to be mindful of this. I think it can be done in a way that mostly affects places that currently set ._fullname values during semantic analysis, without affecting read-only uses of .fullname. SemanticAnalyzer.qualified_name() would probably be affected too, but I don't fully understand all the quirks of that, and I'm pretty sure there's specific situations it's carefully designed for that I don't know to think about yet.

The far end of the effort scale, this could include avoiding reliance on module_id more broadly. Currently it seems key to local_definitions(), where the alternative would probably be explicit tracking of what was added to a scope by an import statement. It also seems important to incremental features and lookup related functionality, neither of which I have a great grasp on at the moment. I'm not sure what those could look like without relying on module_id to match the file name and not change. It's probably more disruptive than it's worth. Changing fullname to use module_name instead of module_id is appealing to me, but probably also disruptive for not much actual gain.

At the low end of the effort scale would be not implementing module-level __name__ tracking at all, and sticking with just class-level __module__ tracking. That's relatively simple (see the MR I linked earlier), and gets most of the actual functionality that I'm looking for, at the cost of additional duplication in typeshed stubs and lack of support for adjusting the fullname of anything other than classes.

There's probably also a possible implementation that does cover module-level __name__ tracking, but avoids touching anything about the current ._fullname or .fullname attributes. That looks something like that draft MR I linked above, but expanded to cover MypyFile and other SymbolNode subclasses, with appropriate adjustments to SemanticAnalyzer to propagate the module_override value as needed.

Overall the benefits are... let's say modest. Some user-facing messaging could improve it's fidelity with runtime naming, once typeshed was updated to include these definitions, and some parts of stubtest could get a little better. Typeshed could more closely mirror runtime in certain respects.

I started thinking about this after working on some cleanups in typeshed. I got even more interested in this work after I started looking at the work of splitting collections.abc classes from their aliases in typing. Runtime keeps these implementations in _collections_abc.py for performance reasons, with a module-level __name__ = "collections.abc" set. It'd be nice if typeshed could do the same; I think it's better for resolving the cycle that includes typing and builtins than having those in collections/abc.pyi would be? But I'm not sure I'm reasoning about that correctly.

All that said, what do mypy maintainers think?

a) Is this a reasonable idea to pursue?

b) is the idea of splitting up the internal ._fullname a reasonable way to do it or am I missing something that would make that impractical or too disruptive?

@tungol tungol added the feature label Jan 7, 2024
@tungol tungol closed this as completed Nov 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant