tracking module name overriding

**Feature**

I'm interested in adding functionality so that mypy will track modules that rename themselves with an explicit `__name__`, and classes which set an explicit `__module__` value. This would then show up in user-facing messages and also affect tooling like stubgen / stubtest. The general motivation is aligning more closely with runtime this way.

**Pitch**

Some discussion of this has already occurred in this typeshed issue: https://github.com/python/typeshed/issues/11141

And I made a quick proof-of-concept implementation of class-based `__module__` tracking. The implementation is not very polished, and not necessarily what I'd like to pursue, but it *does* work: https://github.com/python/mypy/pull/16698

Implementing module-level `__name__` tracking would require a bigger change, I think, mostly focused on the `fullname` attribute of classes in `mypy/nodes.py`. I'm new to the mypy codebase, but after spending some time poking around my general idea right now is something like this:

Right now, a bunch of the node classes keep a `_fullname` attribute, accessible via the `fullname` property. I think I can split that into three values: qualname, module_name, and module_id, where module_id is what mypy currently uses, and module_name is usually the same as module_id but can be modified by `__name__` and `__module__` values. Then the `fullname` property can be generated on demand, and the static module_id is still available for lookup purposes when it's needed.

I saw that the switch from `.fullname()` methods to the `.fullname` property was a concern for API stability, so I'm trying to be mindful of this. I think it can be done in a way that mostly affects places that currently set `._fullname` values during semantic analysis, without affecting read-only uses of `.fullname`. `SemanticAnalyzer.qualified_name()` would probably be affected too, but I don't fully understand all the quirks of that, and I'm pretty sure there's specific situations it's carefully designed for that I don't know to think about yet.

The far end of the effort scale, this could include avoiding reliance on module_id more broadly. Currently it seems key to `local_definitions()`, where the alternative would probably be explicit tracking of what was added to a scope by an import statement. It also seems important to incremental features and lookup related functionality, neither of which I have a great grasp on at the moment. I'm not sure what those could look like without relying on module_id to match the file name and not change. It's probably more disruptive than it's worth. Changing `fullname` to use `module_name` instead of `module_id` is appealing to me, but probably also disruptive for not much actual gain.

At the low end of the effort scale would be not implementing module-level `__name__` tracking at all, and sticking with just class-level `__module__` tracking. That's relatively simple (see the MR I linked earlier), and gets most of the actual functionality that I'm looking for, at the cost of additional duplication in typeshed stubs and lack of support for adjusting the fullname of anything other than classes.

There's probably also a possible implementation that does cover module-level `__name__` tracking, but avoids touching anything about the current `._fullname` or `.fullname` attributes. That looks something like that draft MR I linked above, but expanded to cover `MypyFile` and other `SymbolNode` subclasses, with appropriate adjustments to `SemanticAnalyzer` to propagate the `module_override` value as needed.

Overall the benefits are... let's say modest. Some user-facing messaging could improve it's fidelity with runtime naming, once typeshed was updated to include these definitions, and some parts of stubtest could get a little better. Typeshed could more closely mirror runtime in certain respects.

I started thinking about this after working on some cleanups in typeshed. I got even more interested in this work after I started looking at the work of splitting `collections.abc` classes from their aliases in `typing`. Runtime keeps these implementations in `_collections_abc.py` for performance reasons, with a module-level `__name__ = "collections.abc"` set. It'd be nice if typeshed could do the same; I think it's better for resolving the cycle that includes typing and builtins than having those in `collections/abc.pyi` would be? But I'm not sure I'm reasoning about that correctly.

All that said, what do mypy maintainers think?

 a) Is this a reasonable idea to pursue?

 b) is the idea of splitting up the internal `._fullname` a reasonable way to do it or am I missing something that would make that impractical or too disruptive?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

tracking module name overriding #16758

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

tracking module name overriding #16758

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions