Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-129622: Clarify the set.is{sub,super}set docstrings #129637

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

AA-Turner
Copy link
Member

@AA-Turner AA-Turner commented Feb 4, 2025

@@ -2066,12 +2066,12 @@ set.issuperset
other: object
/

Report whether this set contains another set.
Return True if the set is a superset of other.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, the OP issue is about return value. I think we can replace "report" with "Return True if", but why change the rest? Superset/subset - terminology, that might be unfamiliar to readers.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My specific suggestion was "Return True if self is a subset of other, else False.", with changing 'Report' with 'Return' the prime focus. I do not know if 'if else' is needed. Without looking around, I suspect both implicit and explicit alternatives are used and would be satified either way.

I used the parameter names (marked as italicized) as that is a general policy. For frozenset, the parameter names are used in all the dunder docstrings (but without *s) but never with the non-dunder functions. The look sloppy to me. The non-specific 'another set' seems weird to me compared to the specific 'other'. As for the superset relationship , the rewording of the repetitious 'is superset' as 'contains' (and similarly for 'is subset') is fine with me and even a good idea. My revised suggestion would then be "Return True if self contains other." and "Return True if other contains self." (both without or with ", else False")

Copy link
Member Author

@AA-Turner AA-Turner Feb 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, none of the docstrings in setobject.c use 'self'. Other docstrings in the module reference self as "a" / "this" / "the" set. Perhaps?:

Return True if this set contains other. and Return True if this set is contained by other.

My slight preference for {sub/super}set is as "contains" is ambiguous for the case of set equality, where both methods return True.

@terryjreedy
Copy link
Member

terryjreedy commented Feb 4, 2025

Test failure (Check if generated files are up to date) is build failure of python3.13 ./Tools/build/generate_sbom.py with message RemoteDisconnected("Remote end closed connection without response"). Seems like random failure; rerunning the one test. EDIT: rerun passed.

@rhettinger rhettinger self-assigned this Feb 4, 2025
@rhettinger
Copy link
Contributor

Thoughts:

  • There isn't a real issue here. Python has a bunch of is... predicates and people seem to get that they all return booleans. AFAICT no actual user has ever had a problem with knowing that any these methods return a boolean.

  • Some of the predicates like str.isupper are worded in the form "Return True if ..., False otherwise". Others like isinstance are worded in the form "Return whether ...". Either form seems to suffice.

  • It would be step backwards to replace "another set contains this set" with "if the set is a subset of other." The latter is a circular definition and add no value. Consider possible docstrings for is_coprime(p, q). One says "returns True if p is coprime to q". Another says "Returns True is p has no factors in common with q". One just repeats the functions name and the other teaches you something about what it does.

  • If some change has to be made, then let's address an actual user problem. I've seen confusion about whether p.issubset(q) means that p is a subset of q or whether it means that q is a subset of p. That said, I think the confusion is intrinsic to the word order of the call itself. So there isn't much than can be done. Thankfully, people mostly just use the s < t notation which is unambigous.

@rhettinger rhettinger added the pending The issue will be closed if no feedback is provided label Feb 5, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting core review pending The issue will be closed if no feedback is provided skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants