Commit 00b6687

Add DEP 1 proposal for more accurate usage detection
This usage detection strategy is under active implementation and should go live in version 3 of Deadcode.
1 parent 6564b2d commit 00b6687

1 file changed

+165
-0
lines changed

enhancement_proposals/dep_1.md

Lines changed: 165 additions & 0 deletions
@@ -0,0 +1,165 @@
# DEP 1 - More accurate usage detection
**Status**: Proposed
**Type**: Feature
**Created**: 2024-01-14

## Motivation
Usage detection can be more accurate.
Other unused code detection tools commonly just construct two sets of names:
the names introduced by definitions and the names that appear in usage expressions.
They then report the names that are in the defined set but not in the usage set.

In larger projects, overlapping names are quite common. In this example:
```
class Foo:
    def validate(self):
        pass

class Bar:
    def validate(self):
        pass

Foo().validate()
```
With this simple strategy, the `validate` name would not be reported, even though `Bar.validate` is never called.
However, it is possible to track the types of definitions and usages and
identify the usages more accurately. This proposal specifies a strategy
for tracking the types of variables more accurately.
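
The simple set-based strategy described above can be sketched in a few lines. This is only an illustration of the general idea built on the standard `ast` module; the function name `naive_unused_names` is hypothetical and is not how Deadcode or any other tool is actually implemented.

```python
import ast

SOURCE = """
class Foo:
    def validate(self):
        pass

class Bar:
    def validate(self):
        pass

Foo().validate()
"""


def naive_unused_names(source: str) -> set[str]:
    """Report names that are defined but never appear in a usage expression."""
    defined: set[str] = set()
    used: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            used.add(node.id)
        elif isinstance(node, ast.Attribute):
            # Attribute access is recorded by name only; the owner type is ignored.
            used.add(node.attr)
    return defined - used


unused = naive_unused_names(SOURCE)
print(unused)
```

Because both methods share the name `validate`, the single call `Foo().validate()` hides `Bar.validate` from the report.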

## Scope construction for definitions
A dot-separated scope name identifies the scope of each line.

```
# foo.py            # foo
class Bar:          # foo.Bar
    def spam(self): # foo.Bar.spam
        pass        # foo.Bar.spam
```

If the `foo.py` file is not on the working path, its scope is prefixed with dot-separated package names.
For example, if `foo.py` is in the `ham.eggs` package, the scope of the `spam` method will be
`ham.eggs.foo.Bar.spam`.
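
A minimal sketch of how such scope names might be derived from a module's syntax tree is shown below. The helper `definition_scopes` and its `module`/`package` parameters are assumptions made for illustration, not part of Deadcode's API.

```python
import ast


def definition_scopes(source: str, module: str, package: str = "") -> list[str]:
    """Return the dot-separated scope of every class/function definition."""
    root = f"{package}.{module}" if package else module
    scopes: list[str] = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                scope = f"{prefix}.{child.name}"
                scopes.append(scope)
                visit(child, scope)  # nested definitions extend the scope
            else:
                visit(child, prefix)

    visit(ast.parse(source), root)
    return scopes


FOO_PY = """
class Bar:
    def spam(self):
        pass
"""

print(definition_scopes(FOO_PY, "foo"))
print(definition_scopes(FOO_PY, "foo", package="ham.eggs"))
```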

## Matching defined type with a type in an expression

Let's say we have a `Bar` type definition and an expression which uses it:

```
# foo.py            # SCOPE
class Bar:          # foo.Bar
    def spam(self): # foo.Bar.spam
        pass        # foo.Bar.spam
Bar().spam()        # foo, TYPE: Bar
```

On the expression line `Bar().spam()` the scope is `foo` and the identified type name
is `Bar`. The type name `Bar` is used to search for a type definition (a CodeItem instance)
in the `foo` namespace. Usages are then marked on that definition, and usages of related
attributes/methods are associated with that type.
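
The lookup described above could look roughly like the sketch below. The `CodeItem` dataclass here is a hypothetical stand-in with assumed fields (`scope`, `usages`), and only expressions of the form `TypeName().attribute` are handled, always resolved against the module namespace.

```python
import ast
from dataclasses import dataclass


@dataclass
class CodeItem:
    """Hypothetical stand-in for a definition record; the fields are assumptions."""
    scope: str
    usages: int = 0


def mark_usages(source: str, module: str) -> dict[str, CodeItem]:
    tree = ast.parse(source)
    items: dict[str, CodeItem] = {}
    # Register classes and their methods under dot-separated scopes.
    for node in tree.body:
        if isinstance(node, ast.ClassDef):
            class_scope = f"{module}.{node.name}"
            items[class_scope] = CodeItem(class_scope)
            for member in node.body:
                if isinstance(member, ast.FunctionDef):
                    scope = f"{class_scope}.{member.name}"
                    items[scope] = CodeItem(scope)
    # Find expressions such as `Bar().spam` and resolve `Bar` in the module namespace.
    for node in ast.walk(tree):
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)):
            type_scope = f"{module}.{node.value.func.id}"
            if type_scope in items:
                items[type_scope].usages += 1
                attr_scope = f"{type_scope}.{node.attr}"
                if attr_scope in items:
                    items[attr_scope].usages += 1
    return items


EXAMPLE = """
class Bar:
    def spam(self):
        pass

class Baz:
    def spam(self):
        pass

Bar().spam()
"""

items = mark_usages(EXAMPLE, "foo")
for item in items.values():
    print(item.scope, item.usages)
```

In this sketch only `foo.Bar` and `foo.Bar.spam` receive a usage; `foo.Baz.spam` stays at zero even though it shares the method name.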

## Providing only a part of scope via options is fine
All of these scopes will match the `spam` method:
- `ham.eggs.foo.Bar.spam`
- `foo.Bar.spam`
- `Bar.spam`
- `spam`

The less specific the scope is, the more cases it will match,
e.g. the `spam` scope would also match a variable named `spam` in any scope.
In some sense, scopes like `spam` carry an implicit wildcard, `*.spam`, which matches any scope prefix.
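
One way to read this rule is as suffix matching on the dot-separated parts. The sketch below assumes that interpretation; `scope_matches` is a hypothetical helper name.

```python
def scope_matches(pattern: str, scope: str) -> bool:
    """Return True if the parts of `pattern` equal the trailing parts of `scope`."""
    pattern_parts = pattern.split(".")
    scope_parts = scope.split(".")
    return scope_parts[-len(pattern_parts):] == pattern_parts


FULL_SCOPE = "ham.eggs.foo.Bar.spam"
for pattern in ("ham.eggs.foo.Bar.spam", "foo.Bar.spam", "Bar.spam", "spam", "Baz.spam"):
    print(pattern, scope_matches(pattern, FULL_SCOPE))
```

Comparing whole parts rather than raw string suffixes avoids accidental matches such as `pam` against `spam`.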

## Identifying types of scope parts
When creating the scope parts, each part could also have a type
(e.g. package, module, class, method, variable) associated with it.
When a usage expression is detected, its type could be searched for by
using the types of the scope parts instead of simply comparing scope strings.

For example, this code snippet contains two different objects, which could both be
matched using the generic `foo.bar` scope:

```
def foo():
    bar = 1

class foo:
    def bar(self):
        pass

foo().bar = 1
```

Deadcode could internally track the type of each scope part, so that when an expression
is detected, the defined type is searched for by the types of the scope parts, not only by
the scope string. For example, a special notation like `>foo%bar` and `#foo>bar`
could be used for the above example to accurately identify the definitions.

Users could also provide precise types of scope parts by using a different separator instead of `.`.
These separators could be used for scope part separation:
- `.` - means any type of scope
- `/` - package or module scope
- `#` - class scope
- `>` - function or method scope
- `%` - variable or variable attribute

For example, a user could provide the path `ham.eggs.foo.Bar.spam` or a more specific one,
`ham/eggs/foo#Bar>spam`, to exactly match the types of the scope parts.
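
The separator notation could be parsed as sketched below. This assumes that each separator describes the part that follows it, as the `>foo%bar` example suggests, and that a leading part without a separator has type "any". The names `parse_scope`, `typed_match`, and the kind labels are hypothetical.

```python
import re

# Kind labels are illustrative; only the separator characters come from the proposal.
SEPARATOR_KINDS = {
    ".": "any",
    "/": "package_or_module",
    "#": "class",
    ">": "function_or_method",
    "%": "variable_or_attribute",
}

PART = re.compile(r"([./#>%]?)([^./#>%]+)")


def parse_scope(scope: str) -> list[tuple[str, str]]:
    """Split a scope into (kind, name) pairs; a missing separator means "any"."""
    return [(SEPARATOR_KINDS.get(sep, "any"), name) for sep, name in PART.findall(scope)]


def typed_match(pattern: str, definition: str) -> bool:
    """Suffix-match `pattern` against `definition`, respecting part kinds."""
    wanted, actual = parse_scope(pattern), parse_scope(definition)
    if len(wanted) > len(actual):
        return False
    tail = actual[-len(wanted):]
    return all(
        w_name == a_name and w_kind in ("any", a_kind)
        for (w_kind, w_name), (a_kind, a_name) in zip(wanted, tail)
    )


print(parse_scope("ham/eggs/foo#Bar>spam"))
print(typed_match("foo.bar", ">foo%bar"), typed_match("#foo>bar", ">foo%bar"))
```

With this reading, the generic `foo.bar` still matches both definitions from the example, while `#foo>bar` matches only the method of the class.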

## Type tracking for method arguments and returned values

### Tracking type of arguments
When an argument is passed into a function/method, its type remains the same, but the
variable name might change, or the value might be put into a container like a tuple or dictionary.
Deadcode will attempt to track the types of function/method parameters; however, in some cases
the type will be lost and Deadcode will fall back to a generic name matching strategy.

In this example:

```
class Foo:
    def bar(self):
        pass

def eggs(ham):
    ham.bar()

spam = Foo()
eggs(spam)
```

Deadcode will be able to accurately detect that the type of `ham` is `Foo`.
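
A simplified sketch of this kind of propagation follows. It only handles module-level assignments of the form `name = ClassName()` and positional arguments in module-level calls; `infer_parameter_types` is a hypothetical name, and a real implementation would need to cover many more cases.

```python
import ast


def infer_parameter_types(source: str) -> dict[str, dict[str, str]]:
    """Map function name -> parameter name -> inferred class name."""
    tree = ast.parse(source)
    classes = {n.name for n in tree.body if isinstance(n, ast.ClassDef)}
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    variable_types: dict[str, str] = {}
    parameter_types: dict[str, dict[str, str]] = {}

    for stmt in tree.body:
        # `spam = Foo()` records that `spam` holds a `Foo` instance.
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)
                and isinstance(stmt.value, ast.Call)
                and isinstance(stmt.value.func, ast.Name)
                and stmt.value.func.id in classes):
            variable_types[stmt.targets[0].id] = stmt.value.func.id
        # `eggs(spam)` propagates the type of `spam` to the matching parameter.
        elif (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Call)
                and isinstance(stmt.value.func, ast.Name)
                and stmt.value.func.id in functions):
            func = functions[stmt.value.func.id]
            for param, arg in zip(func.args.args, stmt.value.args):
                if isinstance(arg, ast.Name) and arg.id in variable_types:
                    parameter_types.setdefault(func.name, {})[param.arg] = variable_types[arg.id]
    return parameter_types


EXAMPLE = """
class Foo:
    def bar(self):
        pass

def eggs(ham):
    ham.bar()

spam = Foo()
eggs(spam)
"""

print(infer_parameter_types(EXAMPLE))
```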

### Tracking types of returned values
It might be hard to track the exact types of variables, for example:

```
class Eggs:
    pass

class Bar:
    def spam(self):
        return Eggs()

variable = Bar().spam()
print(variable)
```

Parsing the returned type of `Bar.spam` is complicated.
In some cases, the returned type can only be determined dynamically at runtime,
and it might depend on the method's implementation details.
Hence, in some cases the types won't be identified,
because the runtime is not available during static code analysis.

The Deadcode policy for cases where a type is lost because it cannot be
accurately identified is as follows.

In such cases Deadcode loses the ability to accurately identify the type of variables/attributes.
Hence generic name matching will be used instead, just like vulture does.
If more than one definition with the same name is detected, a warning should be issued
(if enough verbosity is enabled).
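
The fallback could be sketched as below, assuming definitions are stored as dot-separated scopes. The function `match_by_name` and the logger name are hypothetical, and the standard `logging` level stands in for the verbosity setting.

```python
import logging
from collections import defaultdict

logger = logging.getLogger("deadcode_sketch")  # hypothetical logger name


def match_by_name(name: str, definitions: list[str]) -> list[str]:
    """Fallback: match every definition whose last scope part equals `name`."""
    by_name: dict[str, list[str]] = defaultdict(list)
    for scope in definitions:
        by_name[scope.rsplit(".", 1)[-1]].append(scope)
    matches = by_name.get(name, [])
    if len(matches) > 1:
        # Emitted only when the configured log level (verbosity) allows warnings.
        logger.warning(
            "Name %r matches %d definitions: %s", name, len(matches), ", ".join(matches)
        )
    return matches


DEFINITIONS = ["foo.Foo.validate", "foo.Bar.validate", "foo.Foo.other"]
print(match_by_name("validate", DEFINITIONS))
```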
In addition, type hints could be used to detect the type more easily in such cases.
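
As an illustration of the type hint idea, return annotations can be read directly from the syntax tree without running the code. The sketch below assumes the method carries a `-> Eggs` annotation, which the original example does not have; `annotated_return_types` is a hypothetical helper.

```python
import ast


def annotated_return_types(source: str) -> dict[str, str]:
    """Map `Class.method` to the text of its return annotation, if present."""
    result: dict[str, str] = {}
    for cls in ast.parse(source).body:
        if isinstance(cls, ast.ClassDef):
            for member in cls.body:
                if isinstance(member, ast.FunctionDef) and member.returns is not None:
                    result[f"{cls.name}.{member.name}"] = ast.unparse(member.returns)
    return result


ANNOTATED = """
class Eggs:
    pass

class Bar:
    def spam(self) -> Eggs:
        return Eggs()

variable = Bar().spam()
"""

print(annotated_return_types(ANNOTATED))
```

With the annotation available, the type of `variable` could be resolved to `Eggs` instead of falling back to name matching.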
