Add DEP 1 proposal for more accurate usage detection
This usage detection strategy is under active implementation and should go live in version 3 of Deadcode.
Showing 1 changed file with 165 additions and 0 deletions.
# DEP 1 - More accurate usage detection

**Status**: Proposed
**Type**: Feature
**Created**: 2024-01-14

## Motivation
Usage detection can be more accurate.
It is common in other unused code detection tools to simply construct
a set of names found in definitions and a set of names found in usage expressions,
and then report the names that are in the defined set but not in the usage set.

In larger projects, overlapping names are quite common. In this example:
```
class Foo:
    def validate(self):
        pass

class Bar:
    def validate(self):
        pass

Foo().validate()
```
`Bar.validate` is never called, yet the `validate` name would not be reported if this simple strategy is used,
because `Foo().validate()` puts `validate` into the set of used names.
However, it is possible to track the types of definitions and usages and
identify the usages more accurately. This DEP 1 proposal specifies a strategy
that could be used to track the types of variables more accurately.

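The simple name-set strategy described above can be sketched with the standard `ast` module. This is only an illustration of the problem, not code from any existing tool; the `naive_unused_names` helper and the `SOURCE` constant are hypothetical names introduced here.

```python
import ast

SOURCE = """
class Foo:
    def validate(self):
        pass

class Bar:
    def validate(self):
        pass

Foo().validate()
"""


def naive_unused_names(source: str) -> set[str]:
    """Return defined names that never appear in any usage expression."""
    defined: set[str] = set()
    used: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            used.add(node.id)
        elif isinstance(node, ast.Attribute):
            used.add(node.attr)
    return defined - used


# Only the class name is reported; the unused `Bar.validate` method is hidden
# because `Foo().validate()` already put `validate` into the used set.
print(naive_unused_names(SOURCE))  # → {'Bar'}
```

Since both methods share one name, a set of plain names cannot tell them apart, which is the gap the rest of this proposal tries to close.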
## Scope construction for definitions
A dot-separated scope name is used to identify the scope of each line.

```
# foo.py                  # foo
class Bar:                # foo.Bar
    def spam(self):       # foo.Bar.spam
        pass              # foo.Bar.spam
```

If the `foo.py` file is not on the working path, then its scope is prefixed with dot-separated package names.
For example, if `foo.py` is in the `ham.eggs` package, then the scope of the `spam` method will be
`ham.eggs.foo.Bar.spam`.

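One possible way to build such scope names is to walk the syntax tree and carry the current prefix along. The sketch below assumes the module scope (`foo` or `ham.eggs.foo`) has already been derived from the file path; `definition_scopes` is a hypothetical helper, not part of Deadcode.

```python
import ast


def definition_scopes(source: str, module_scope: str) -> list[str]:
    """Collect dot-separated scopes of classes and functions in a module."""
    scopes: list[str] = []

    def visit(node: ast.AST, prefix: str) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                scope = f"{prefix}.{child.name}"
                scopes.append(scope)
                visit(child, scope)  # Nested definitions extend the scope.
            else:
                visit(child, prefix)  # Other nodes keep the current scope.

    visit(ast.parse(source), module_scope)
    return scopes


FOO_PY = """
class Bar:
    def spam(self):
        pass
"""

print(definition_scopes(FOO_PY, "foo"))           # → ['foo.Bar', 'foo.Bar.spam']
print(definition_scopes(FOO_PY, "ham.eggs.foo"))  # → ['ham.eggs.foo.Bar', 'ham.eggs.foo.Bar.spam']
```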
## Matching defined type with a type in an expression

Let's say we have a `Bar` type definition and an expression which uses it:

```
# foo.py                  # SCOPE
class Bar:                # foo.Bar
    def spam(self):       # foo.Bar.spam
        pass              # foo.Bar.spam
Bar().spam()              # foo, TYPE: Bar
```

On the expression line `Bar().spam()` the scope is `foo` and the identified type name
is `Bar`. This type name `Bar` will be used to search for a type definition (a `CodeItem` instance)
in the `foo` namespace. The usage will be marked on that definition, and usages of related attributes/methods
will be associated with that type as well.

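A minimal sketch of this lookup, under the assumption that definitions are kept in a registry keyed by scope. The plain dictionary below stands in for `CodeItem` instances, whose real interface is not described here; `definitions`, `mark_usages`, and the extra `foo.Baz.spam` entry are hypothetical.

```python
import ast

# Hypothetical registry: scope -> number of detected usages.
definitions = {"foo.Bar": 0, "foo.Bar.spam": 0, "foo.Baz.spam": 0}


def mark_usages(expression: str, scope: str) -> None:
    """Mark usages for calls shaped like `Type().method()` found in `scope`."""
    for node in ast.walk(ast.parse(expression)):
        if not (isinstance(node, ast.Attribute) and isinstance(node.value, ast.Call)):
            continue
        constructor = node.value.func
        if not isinstance(constructor, ast.Name):
            continue
        type_scope = f"{scope}.{constructor.id}"  # e.g. foo.Bar
        if type_scope not in definitions:
            continue
        definitions[type_scope] += 1
        member_scope = f"{type_scope}.{node.attr}"  # e.g. foo.Bar.spam
        if member_scope in definitions:
            definitions[member_scope] += 1


mark_usages("Bar().spam()", scope="foo")
print(definitions)  # → {'foo.Bar': 1, 'foo.Bar.spam': 1, 'foo.Baz.spam': 0}
```

The same-named method `foo.Baz.spam` stays unused because the usage is attributed to the `Bar` type rather than to the bare name `spam`.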
## Providing only a part of scope via options is fine
All of these scopes will match the `spam` method:
- `ham.eggs.foo.Bar.spam`
- `foo.Bar.spam`
- `Bar.spam`
- `spam`

The less specific the scope is, the more cases it will match,
i.e. the `spam` scope would also match a variable named `spam` in any scope.
In some sense, scopes like `spam` have an implicit wildcard prefix (`*.spam`) that matches any scope prefix.

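Partial scope matching of this kind amounts to comparing the trailing parts of the full scope. A minimal sketch, with `scope_matches` as a hypothetical helper name:

```python
def scope_matches(option: str, full_scope: str) -> bool:
    """Return True if `option` equals the trailing parts of `full_scope`."""
    option_parts = option.split(".")
    scope_parts = full_scope.split(".")
    return scope_parts[-len(option_parts):] == option_parts


for option in ("ham.eggs.foo.Bar.spam", "foo.Bar.spam", "Bar.spam", "spam"):
    print(option, scope_matches(option, "ham.eggs.foo.Bar.spam"))  # True for each option
```

Comparing whole parts rather than raw string suffixes keeps `am` from matching `spam`.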
## Identifying types of scope parts
When creating the scope parts, each part could also have a type
(e.g. package, module, class, method, variable) associated with it.
When a usage expression is detected, its type could be looked up by
using the types of the scope parts, instead of simply comparing scope strings.

For example, this code snippet contains two different objects, which could both be
matched using a generic `foo.bar` scope:

```
def foo():
    bar = 1

class foo:
    def bar(self):
        pass

foo().bar = 1
```

Deadcode could internally track the type of each scope part, and when an expression
is detected, the defined type could be searched for by taking the types of the scope parts into account, not only
the scope string. For example, a special notation like `>foo%bar` and `#foo>bar`
could be used for the scopes in the above example to accurately identify the definitions.

The user could also provide precise types of scope parts by using a different separator instead of `.`.
These separators could be used to separate scope parts:
- `.` - any type of scope
- `/` - package or module scope
- `#` - class scope
- `>` - function or method scope
- `%` - variable or variable attribute

For example, the user could provide the generic `ham.eggs.foo.Bar.spam` path as well as a more specific one,
`ham/eggs/foo#Bar>spam`, to exactly match the types of scope parts.

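The typed separators could be handled roughly as follows. This sketch assumes that a separator describes the part that follows it (as in `>foo%bar`) and that a part with no leading separator matches any type; `KINDS`, `parse_scope`, and `typed_scope_matches` are hypothetical names.

```python
import re

# Separator -> kind of the scope part that follows it.
KINDS = {
    ".": "any",
    "/": "package/module",
    "#": "class",
    ">": "function/method",
    "%": "variable/attribute",
}


def parse_scope(scope: str) -> list[tuple[str, str]]:
    """Split a scope into (kind, name) pairs; a part with no separator is 'any'."""
    pairs = re.findall(r"([./#>%]?)([^./#>%]+)", scope)
    return [(KINDS.get(separator, "any"), name) for separator, name in pairs]


def typed_scope_matches(option: str, definition: str) -> bool:
    """Match `option` against the trailing parts of `definition`, honouring kinds."""
    wanted, actual = parse_scope(option), parse_scope(definition)
    if len(wanted) > len(actual):
        return False
    tail = actual[len(actual) - len(wanted):]
    return all(
        name == actual_name and kind in ("any", actual_kind)
        for (kind, name), (actual_kind, actual_name) in zip(wanted, tail)
    )


print(typed_scope_matches("foo.bar", ">foo%bar"))   # → True
print(typed_scope_matches("foo.bar", "#foo>bar"))   # → True
print(typed_scope_matches("#foo>bar", ">foo%bar"))  # → False
```

The generic `foo.bar` option still matches both definitions, while the typed `#foo>bar` option selects only the method of the class.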
## Type tracking for method arguments and returned values

### Tracking type of arguments
When an argument is passed into a function/method, its type remains the same, but the
variable name might change, or the value might be put into a container like a tuple or a dictionary.
Deadcode will attempt to track the types of function/method parameters; however, in some cases
the type will be lost and Deadcode will fall back to a generic name matching strategy.

In this example:

```
class Foo:
    def bar(self):
        pass

def eggs(ham):
    ham.bar()

spam = Foo()
eggs(spam)
```

Deadcode will be able to accurately detect that the type of `ham` is `Foo`.

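For the simple case above, the propagation from variable to parameter could look like the sketch below. It only handles module-level `name = Class()` assignments followed by positional calls, which is an assumption made to keep the example short; `infer_parameter_types` is a hypothetical helper and says nothing about how Deadcode implements this.

```python
import ast

SOURCE = """
class Foo:
    def bar(self):
        pass

def eggs(ham):
    ham.bar()

spam = Foo()
eggs(spam)
"""


def infer_parameter_types(source: str) -> dict[str, str]:
    """Map 'function.parameter' to a class name for simple positional calls."""
    tree = ast.parse(source)
    classes = {n.name for n in tree.body if isinstance(n, ast.ClassDef)}
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    variable_types: dict[str, str] = {}
    parameter_types: dict[str, str] = {}

    for node in tree.body:
        # `spam = Foo()` binds the variable `spam` to the type `Foo`.
        if (isinstance(node, ast.Assign) and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)
                and node.value.func.id in classes):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    variable_types[target.id] = node.value.func.id
        # `eggs(spam)` passes that type on to the parameter `ham`.
        elif (isinstance(node, ast.Expr) and isinstance(node.value, ast.Call)
                and isinstance(node.value.func, ast.Name)
                and node.value.func.id in functions):
            function = functions[node.value.func.id]
            for parameter, argument in zip(function.args.args, node.value.args):
                if isinstance(argument, ast.Name) and argument.id in variable_types:
                    key = f"{function.name}.{parameter.arg}"
                    parameter_types[key] = variable_types[argument.id]
    return parameter_types


print(infer_parameter_types(SOURCE))  # → {'eggs.ham': 'Foo'}
```

Anything outside this narrow pattern (containers, keyword arguments, reassignment) yields no entry, which corresponds to the fallback to generic name matching.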
### Tracking types of returned values
It might be hard to track the exact types of variables, for example:

```
class Eggs:
    pass

class Bar:
    def spam(self):
        return Eggs()

variable = Bar().spam()
print(variable)
```

Parsing the returned type of `Bar.spam` is complicated.
In some cases, the returned type can only be determined dynamically at runtime,
and it might depend on the method's implementation details.
Hence, in some cases the types won't be identified,
because the runtime is not available during static code analysis.

The Deadcode policy is that when a type is lost due to the inability
to accurately identify it, Deadcode loses the means to accurately identify the type of the affected variables/attributes.
Hence generic name matching will be used instead in these cases, just like vulture does.
If more than one definition with the same name is detected, a warning should be issued
(if enough verbosity is enabled).
In addition, type hints could be used to detect the type more easily in such cases.
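
A rough sketch of this policy, assuming a return type hint is preferred and a `return` statement is trusted only when it directly constructs a known class. `infer_return_type` and `first_function` are hypothetical helpers; returning `None` stands for "type lost, fall back to name matching".

```python
import ast


def infer_return_type(function: ast.FunctionDef, classes: set[str]):
    """Return a known class name for `function`, or None when the type is lost."""
    # A return annotation such as `-> Eggs` is the easiest source of information.
    if isinstance(function.returns, ast.Name) and function.returns.id in classes:
        return function.returns.id
    # Otherwise accept only the case where every `return` constructs one known class.
    names = set()
    for node in ast.walk(function):
        if isinstance(node, ast.Return):
            value = node.value
            if (isinstance(value, ast.Call) and isinstance(value.func, ast.Name)
                    and value.func.id in classes):
                names.add(value.func.id)
            else:
                return None  # Type is lost: fall back to generic name matching.
    return names.pop() if len(names) == 1 else None


def first_function(source: str) -> ast.FunctionDef:
    """Parse `source` and return the first function definition found in it."""
    return next(n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.FunctionDef))


spam = first_function("def spam(self):\n    return Eggs()\n")
print(infer_return_type(spam, {"Eggs"}))  # → Eggs

hinted = first_function("def spam(self) -> Eggs:\n    return make()\n")
print(infer_return_type(hinted, {"Eggs"}))  # → Eggs

dynamic = first_function("def spam(self):\n    return make()\n")
print(infer_return_type(dynamic, {"Eggs"}))  # → None
```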