Commit 94a0119

cEP 23: Separation of bears' metadata
Closes #138
1 parent c1a4b7e commit 94a0119

1 file changed

+373
-0
lines changed

cEP-0023.md

# Separation of bears' metadata

| Metadata |                                                   |
|----------|---------------------------------------------------|
| cEP      | 0023                                              |
| Version  | 1.0                                               |
| Title    | Separation of bears' metadata                     |
| Authors  | Muhammad Kaisar Arkhan <mailto:[email protected]> |
| Status   | Proposed                                          |
| Type     | Feature                                           |

## Abstract

This cEP proposes a method of separating bears' metadata from their code and
removing the need to use Python when writing bears.

## How bears are written currently

Most bears consist of Python boilerplate code containing the metadata coala
needs, some more metadata to identify what the bear is, and docstrings for the
bear description.

[GoVetBear][GoVetBear]

Of course, not all bears are just boilerplate code. Some require Python code to
help coala execute the linters, parse logs, make configuration files, etc.

[CoffeeLintBear][CoffeeLintBear]

Some bears are written in-house by the coala team.

[SpaceConsistencyBear][SpaceConsistencyBear]

## Problems with the current way of writing bears

1. Duplicate code all over the place

   When writing a bear, you have to copy the Python boilerplate and fill in
   the metadata.

   This is annoying when a new feature deprecates the old way of doing things,
   because we then have to change the code of almost every bear.

   [Example 1][Example 1]

2. Python is not needed

   Bears such as [GoVetBear][GoVetBear] don't need Python to declare metadata.

   The `@linter` decorator helps suppress a lot of boilerplate code, but it
   still has the issue of requiring Python just to declare metadata.

   Some projects and organizations may need to write their own bears so coala
   can use their exclusive tools (such as the commercial code safety checks
   that are commonly used by embedded software projects).

   Not every project or organization wants snippets of Python code in its
   repositories just to declare how to use a linter, and not everyone can
   write Python.

3. Development is slow

   This is specific to bears that are made in-house or require a lot of fancy
   code to run.

   When writing a bear, we have to test it.

   This requires setting up a coala development environment, making sure
   coala-bears isn't installed or declaring the bears directory (which may
   result in a conflict), and then running coala with a long list of arguments
   or writing a `.coafile`.

   Alternatively, you can do it the other way around: write the tests first
   and just run `py.test` to test your fresh new bear.

   Either way, both approaches add a lot of time just to test a bear during
   development. You shouldn't need to write a lot of unnecessary boilerplate
   code just to run the bear ad hoc. It should be as simple as running it in
   your shell.

4. Dual functionality of bears

   Are bears linters, or are they just metadata instructing coala how to run
   linters?

   Should bears just declare metadata, with the code that makes a tool usable
   from coala kept separate?

   This has been an issue for a while and it generates inconsistencies all
   over the place.

   Some bears carry needy code to generate configuration files, such as
   [CoffeeLintBear][CoffeeLintBear].

   Some bears contain the linter code themselves, such as
   [SpaceConsistencyBear][SpaceConsistencyBear].

   Some of the Python bears just call the functions of the underlying tool,
   such as [PEP8Bear][PEP8Bear].

   I believe bears should simply be metadata, while the actual linter tool
   should be separated from them.

   Needy code such as generating config files can easily be delegated to an
   external script.

5. Dependency hell

   Tracking coala and coala-bears has been a problem. coala and coala-bears
   must be released together, and releases are quite slow because coala needs
   a lot of changes, while bears should be able to be released quickly.

   This holds back a lot of new bears and bug fixes.

   coala-bears should have a steady and frequent release cycle so people can
   enjoy bug fixes and new bears without coala development holding them back.

   Sadly, this is hard to do because coala-bears is a bunch of Python code
   that calls things from coala that may or may not be there.

   This creates a dependency cycle between coala and coala-bears that should
   not be ignored.

6. Security

   When bear code runs inside the context of the coala process, it is
   possible to introduce bugs that have access to the coala process.

   This is bad since such bugs can leak information and possibly lead to code
   execution. In theory, this makes it possible for services such as
   continuous integration, or any specific deployment of coala, to be
   exploited into leaking information such as secret deployment keys (for the
   Play Store, for example).

   coala should simply run linters in a separated manner. It should not run
   them inside the same context.

   If we treat bears as just metadata, implementing good security practices
   such as privilege separation, operating system specific mitigations, and
   many more becomes possible and much easier.

## Objective

coala-bears can be simplified by an order of magnitude if it is treated as a
repository filled with metadata instructing coala on how to use linters.
coala-bears should operate independently of coala development, enabling a
faster release cycle and faster delivery of bug fixes and new bears.

## Structure of Bears

Collections of bears will be put inside the directories declared in
`$COALA_BEAR_PATH`, with defaults such as
`$HOME/.coala/bears:/usr/local/lib/coala/bears:/usr/lib/coala/bears`, in
addition to a possible local `.coala` directory inside the project, where bears
are located inside `.coala/bears`.

```
/usr/local/lib/coala/bears
|
|_ GoVetBear
|  |_ metadata.toml
|
|_ CoffeeLintBear
|  |_ metadata.toml
|  |_ bear.py
|
|_ SpaceConsistencyBear
|  |_ metadata.toml
|  |_ bear.py
|
|_ PEP8Bear
|  |_ metadata.toml
|  |_ bear.py
...

.coala/bears
|_ AeroplaneSafetyComplianceBear
|  |_ metadata.toml
|
|_ MemoryStructureFormatBear
   |_ metadata.toml
   |_ check_memory_structure.sh
```
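
The lookup described above could be sketched roughly as follows. This is only
an illustration: the function names are hypothetical, the cEP does not say
whether the project-local directory takes precedence over the global ones (the
sketch assumes it does), and splitting on `:` assumes a POSIX-style path list.

```python
import os
from pathlib import Path

# Default search path, quoted from the text above.
DEFAULT_BEAR_PATH = (
    "$HOME/.coala/bears:/usr/local/lib/coala/bears:/usr/lib/coala/bears"
)


def bear_search_dirs(project_dir, environ=None):
    """Return candidate bear directories, project-local directory first.

    Putting the local directory first is an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("COALA_BEAR_PATH", DEFAULT_BEAR_PATH)
    home = environ.get("HOME", "")
    dirs = [Path(project_dir) / ".coala" / "bears"]
    for entry in raw.split(":"):
        if entry:
            dirs.append(Path(entry.replace("$HOME", home)))
    return dirs


def find_bears(search_dirs):
    """Map bear name -> directory for every folder holding a metadata.toml."""
    found = {}
    for directory in search_dirs:
        if not directory.is_dir():
            continue
        for child in sorted(directory.iterdir()):
            if (child / "metadata.toml").is_file():
                # Earlier directories win when two bears share a name.
                found.setdefault(child.name, child)
    return found
```

Calling `find_bears(bear_search_dirs("."))` from a project root would then list
every bear visible to that project under these assumptions.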

The `metadata.toml` file will declare the metadata required to instruct coala on
how to use the tool, what arguments to give when executing it, what dependencies
are required, etc.

Inside the folder, a script or an executable can be added. It does not need
coala in order to execute, which removes the dependency cycle.

The script will be launched using the general fork+exec model to prevent it
from doing malicious things inside the context of coala.

This enables coala itself to add more safety features, such as operating system
specific mechanisms (FreeBSD Capsicum, OpenBSD pledge, Linux seccomp, etc.) and
more fine-grained privilege separation. However, those aren't part of this cEP
and will be covered another time.

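As a minimal sketch of what launching a bear as a separate process could look
like from the coala side, assuming Python's standard `subprocess` module is
used (the cEP does not prescribe an implementation, and `run_bear_process` is
a hypothetical name):

```python
import subprocess
import sys


def run_bear_process(argv, timeout=60):
    """Run a bear executable as a child process and capture its output.

    Passing a list and leaving the shell disabled means the arguments are
    handed to the child as-is, with no shell interpretation, and the bear
    never executes inside the coala process itself.
    """
    completed = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        timeout=timeout,
        check=False,  # a non-zero exit code is data, not an exception
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == "__main__":
    # Stand-in for a real bear: a tiny Python child process.
    print(run_bear_process([sys.executable, "-c", "print('hello')"]))
```

Operating system specific sandboxing would have to be layered on top of a call
like this and is outside the scope of the sketch.
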
## `metadata.toml`

`metadata.toml` is essentially a TOML file declaring the information coala
needs.

TOML is chosen since it has enough features to do what we want. We may need to
research whether ini files are good enough, since a parser for those is already
in Python's standard library.

Here are a couple of examples:

**GoVetBear/metadata.toml**
```toml
[identity]
name = "GoVetBear"
description = """\
Analyze Go code and raise suspicious constructs, such as printf calls \
whose arguments do not correctly match the format string, useless \
assignments, common mistakes about boolean operations, unreachable code, \
etc.\
"""
languages = ["Go"]
authors = ["The coala developers"]
authors_email = ["[email protected]"]
license = "AGPL-3.0"
can_detect = ["Unused code", "Smell", "Unreachable Code"]

[[requirements]]
type = "AnyOneOf"

[[requirements.child]]
type = "binary"
name = "go"

[[requirements.child]]
type = "apt"
name = "golang"

[[requirements]]
type = "GoRequirement"
package = "golang.org/cmd/vet"
flag = "-u"

[run]
executable = "go"
arguments = "vet"
use_stdout = false
use_stderr = true
output_format = "regex"
# Literal (single-quoted) string: "\d" is not a valid escape in a basic string.
output_regex = '.+:(?P<line>\d+): (?P<message>.*)'
```
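
To illustrate how `output_format = "regex"` might be interpreted, the sketch
below applies the `output_regex` from the example to the captured output of
the linter. The function name and the sample output line are made up for
illustration; the cEP does not define how coala turns matches into results.

```python
import re

# The pattern from the `[run]` section of the GoVetBear example.
OUTPUT_REGEX = re.compile(r".+:(?P<line>\d+): (?P<message>.*)")


def parse_output(text):
    """Turn each matching output line into a small result dictionary.

    Lines that do not match the pattern are silently skipped, which is an
    assumption of this sketch.
    """
    results = []
    for raw_line in text.splitlines():
        match = OUTPUT_REGEX.match(raw_line)
        if match:
            results.append({
                "line": int(match.group("line")),
                "message": match.group("message"),
            })
    return results


if __name__ == "__main__":
    # Hypothetical linter output, not taken from a real `go vet` run.
    print(parse_output("main.go:12: unreachable code\nexit status 1\n"))
```

Since the example sets `use_stderr = true`, the text passed to `parse_output`
would be the standard error stream of the process rather than standard output.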

**SpaceConsistencyBear/metadata.toml**
```toml
[identity]
name = "SpaceConsistencyBear"
description = """\
Check and correct spacing for all textual data. This includes usage of \
tabs vs. spaces, trailing whitespace and (missing) newlines before \
the end of the file.\
"""
languages = ["All"]
authors = ["The coala developers"]
authors_email = ["[email protected]"]
license = "AGPL-3.0"
can_detect = ["Formatting"]

[[params]]
name = "use_spaces"
description = "True if spaces are to be used instead of tabs."
type = "bool"

[[params]]
name = "allow_trailing_whitespace"
description = "Whether to allow trailing whitespace or not."
type = "bool"
default = false

[[params]]
name = "indent_size"
description = "Number of spaces per indentation level"
type = "int"
default = 8

[[params]]
name = "enforce_newline_at_EOF"
description = "Whether to enforce a newline at the end of file"
type = "bool"
default = true
format = "enforce-newline={}"

[run]
executable = "bear.py"
local = true
use_coala_logging_style = true
```

As you can see from the SpaceConsistencyBear example, it is treated not as
Python code running under coala but rather as if it were its own linter. The
`local` variable simply indicates that the executable is inside the bear's
directory and not in `$PATH`, and the `use_coala_logging_style` variable tells
coala that the executable is going to use the common log format.

Parameters will be given to the process via command-line arguments when
launching. With the defaults of the above example, this results in the
following command being executed:

```sh
/usr/local/lib/coala/bears/general/SpaceConsistencyBear/bear.py \
    --allow_trailing_whitespace=false \
    --indent_size=8 \
    enforce-newline=true
```

The above example is formatted for reading; the real command will be on one
line.

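The mapping from `[[params]]` entries to command-line arguments could be
sketched as below. The rules are inferred from the SpaceConsistencyBear
example and are assumptions, not part of the cEP: a parameter without a
`format` key is rendered as `--name=value`, booleans are written in lowercase,
and a parameter with neither a user setting nor a default is omitted.
`build_argv` and `render_value` are hypothetical names.

```python
# The params of the SpaceConsistencyBear example, as a TOML parser would
# return them (descriptions omitted for brevity).
SPACE_CONSISTENCY_PARAMS = [
    {"name": "use_spaces", "type": "bool"},
    {"name": "allow_trailing_whitespace", "type": "bool", "default": False},
    {"name": "indent_size", "type": "int", "default": 8},
    {"name": "enforce_newline_at_EOF", "type": "bool", "default": True,
     "format": "enforce-newline={}"},
]


def render_value(value):
    """Render a value for the command line; booleans become true/false."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def build_argv(executable, params, settings=None):
    """Build the argument list for a bear from its declared params."""
    settings = settings or {}
    argv = [executable]
    for param in params:
        name = param["name"]
        if name in settings:
            value = settings[name]
        elif "default" in param:
            value = param["default"]
        else:
            continue  # assumption: no value and no default means "omit"
        template = param.get("format", "--" + name + "={}")
        argv.append(template.format(render_value(value)))
    return argv


if __name__ == "__main__":
    print(" ".join(build_argv("bear.py", SPACE_CONSISTENCY_PARAMS)))
```

With no user settings, this yields the same three arguments as the command
shown earlier; how coala handles a required parameter that has no value is
left open by the cEP.
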
**CoffeeLintBear/metadata.toml**
```toml
[identity]
name = "CoffeeLintBear"
description = "Check CoffeeScript for a clean and consistent file"
url = "http://www.coffeelint.org"
languages = ["CoffeeScript"]
authors = ["The coala developers"]
authors_email = ["[email protected]"]
license = "AGPL-3.0"
can_detect = ["Syntax", "Formatting", "Smell", "Complexity", "Duplication"]

[severity_map]
normal = "warn"
major = "error"
info = "ignore"

[[requirements]]
type = "binary"
name = "coffeelint"

[[params]]
name = "max_line_length"
description = "Maximum number of characters per line."
type = "int"
default = 79

...

[run]
executable = "bear.py"
local = true
use_coala_logging_style = true
```
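
The `[[requirements]]` tables used in these examples could be checked before
a bear is run. The sketch below is an assumption about their semantics, which
the cEP does not spell out: a `binary` requirement is met when the name
resolves to an executable, `AnyOneOf` is met when any of its `child` entries
is met, and every other type is conservatively reported as unmet because
checking it would need a package manager query that is not sketched here.

```python
import shutil


def requirement_met(requirement):
    """Return True if a parsed `[[requirements]]` entry appears satisfied."""
    kind = requirement["type"]
    if kind == "binary":
        # Found if the name resolves to an executable (e.g. via $PATH).
        return shutil.which(requirement["name"]) is not None
    if kind == "AnyOneOf":
        return any(requirement_met(child)
                   for child in requirement.get("child", []))
    # Other types (apt, GoRequirement, ...) are not handled by this sketch.
    return False


if __name__ == "__main__":
    print(requirement_met({"type": "binary", "name": "coffeelint"}))
```
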

The CoffeeLintBear example above shows what the metadata will look like for a
bear that requires special treatment, such as generating configuration files
and translating the output of the linter.

If it just needs to generate a config file, a simple `prerun` section can be
added, to be executed before the linter itself.

If it requires some special treatment after the linter is executed, a `postrun`
section can be added as well.

The `prerun` and `postrun` sections will have the same format as the `run`
section.

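A rough sketch of how the three sections could be ordered and how the `local`
flag could be resolved is shown below. The section contents are passed in as
plain dictionaries, as a TOML parser would produce them; the function names
and the `make_config.py` file name are hypothetical.

```python
from pathlib import Path

# Order in which the optional sections would be executed.
STAGE_ORDER = ("prerun", "run", "postrun")


def resolve_executable(section, bear_dir):
    """Return the executable for a section, honouring the `local` flag.

    `local = true` means the file lives inside the bear's own directory;
    otherwise the name is left untouched so it can be looked up in $PATH.
    """
    executable = section["executable"]
    if section.get("local", False):
        return str(Path(bear_dir) / executable)
    return executable


def stage_commands(metadata, bear_dir):
    """List (stage, executable) pairs for the sections that are present."""
    return [
        (stage, resolve_executable(metadata[stage], bear_dir))
        for stage in STAGE_ORDER
        if stage in metadata
    ]


if __name__ == "__main__":
    metadata = {
        "prerun": {"executable": "make_config.py", "local": True},
        "run": {"executable": "coffeelint"},
    }
    print(stage_commands(metadata, "/usr/local/lib/coala/bears/CoffeeLintBear"))
```
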
## Process

TODO

[GoVetBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/go/GoVetBear.py
[CoffeeLintBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/coffee_script/CoffeeLintBear.py
[SpaceConsistencyBear]: https://github.com/coala/coala-bears/blob/3cb9b148adc0dda51ac890188b38fd968f6058fd/bears/general/SpaceConsistencyBear.py
[PEP8Bear]: https://github.com/coala/coala-bears/blob/c5a5e201a42c44c159b9c118b062417e4ae4b17f/bears/python/PEP8Bear.py
[Example 1]: https://github.com/coala/coala-bears/commit/3cb9b148adc0dda51ac890188b38fd968f6058fd
