Commit df30824

Simplify chain logic (#54)
* flatten begin blocks anywhere in the chain
* adjust readme
* remove redundant readme parts
* add test
* bump version
* add changelog entry
1 parent f9fae78 commit df30824

5 files changed: +259 -160 lines changed


Diff for: CHANGELOG.md

+47
@@ -1,3 +1,50 @@
1+
# v0.6
2+
3+
**Breaking**: The rules for transforming chains were simplified.
4+
Before, there was the two-arg block syntax (this was the only syntax originally):
5+
6+
```julia
7+
@chain x begin
8+
y
9+
z
10+
end
11+
```
12+
13+
the inline syntax:
14+
15+
```julia
16+
@chain x y z
17+
```
18+
19+
and the one-arg block syntax:
20+
21+
```julia
22+
@chain begin
23+
x
24+
y
25+
z
26+
end
27+
```
28+
All of these are now a single syntax, derived from the rule that any `begin ... end` block in the inline syntax is flattened into its lines.
29+
This means that you can also use multiple `begin ... end` blocks, and they can be in any position, which can be nice for interactive development of a chain in the REPL.
30+
31+
```julia
32+
@chain x y begin
33+
x
34+
y
35+
z
36+
end u v w begin
37+
g
38+
h
39+
i
40+
end
41+
```
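
The flattening rule can be illustrated with a minimal sketch in plain Julia. The helper `flatten_blocks` and the variable names are hypothetical and are not taken from Chain.jl's source. The sketch only assumes that a `begin ... end` argument reaches a macro as an `Expr` with head `:block`, with `LineNumberNode`s between its lines.

```julia
# Hypothetical helper, not Chain.jl's actual implementation:
# splice the lines of every `begin ... end` block into one flat list.
function flatten_blocks(exprs)
    flat = Any[]
    for ex in exprs
        if ex isa Expr && ex.head == :block
            # Skip the LineNumberNodes the parser adds between lines.
            append!(flat, filter(line -> !(line isa LineNumberNode), ex.args))
        else
            push!(flat, ex)
        end
    end
    return flat
end

# `quote ... end` yields the same kind of `:block` expression that a macro
# receives for a `begin ... end` argument.
block_yz = quote
    y
    z
end
block_xyz = quote
    x
    y
    z
end

inline  = flatten_blocks(Any[:x, :y, :z])     # like `@chain x y z`
two_arg = flatten_blocks(Any[:x, block_yz])   # like `@chain x begin ... end`
one_arg = flatten_blocks(Any[block_xyz])      # like `@chain begin ... end`

inline == two_arg == one_arg                  # true
```

Under this reading, all three older syntaxes reduce to the same flat list of expressions, which is why they can be treated as one syntax.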
42+
43+
This is only breaking if you used a `begin ... end` block in the inline syntax at argument 3 or higher and also used an underscore without chaining inside that block, which is deemed quite unlikely given the intended use of the package.
44+
All "normal" usage of the `@chain` macro should work as it did before.
45+
46+
As another consequence of the refactor, the single-argument form `@chain x` no longer errors but simply returns `x`.
47+
148
# v0.5
249

350
**Breaking**: The `@chain` macro now creates a `begin` block, not a `let` block.

Diff for: Project.toml

+1-1
@@ -1,7 +1,7 @@
11
name = "Chain"
22
uuid = "8be319e6-bccf-4806-a6f7-6fae938471bc"
33
authors = ["Julius Krumbiegel"]
4-
version = "0.5.0"
4+
version = "0.6.0"
55

66
[compat]
77
julia = "1"

Diff for: README.md

+87-92
@@ -67,131 +67,150 @@ end
6767

6868
## Summary
6969

70-
Chain.jl defines the `@chain` macro. It takes a start value and a `begin ... end` block of expressions.
70+
Chain.jl exports the `@chain` macro.
7171

72-
The result of each expression is fed into the next one using one of two rules:
72+
This macro rewrites a series of expressions into a chain, where the result of one expression
73+
is inserted into the next expression following certain rules.
7374

74-
1. **There is at least one underscore in the expression**
75-
- every `_` is replaced with the result of the previous expression
76-
2. **There is no underscore**
77-
- the result of the previous expression is used as the first argument in the current expression, as long as it is a function call, a macro call or a symbol representing a function.
75+
**Rule 1**
7876

79-
Lines that are prefaced with `@aside` are executed, but their result is not fed into the next pipeline step.
80-
This is very useful to inspect pipeline state during debugging, for example.
77+
Any `expr` that is a `begin ... end` block is flattened.
78+
For example, these two pseudocode snippets are equivalent:
8179

82-
## Motivation
80+
```julia
81+
@chain a b c d e f
8382

84-
- The implicit first argument insertion is useful for many data pipeline scenarios, like `groupby`, `transform` and `combine` in DataFrames.jl
85-
- The `_` syntax is there to either increase legibility or to use functions like `filter` or `map` which need the previous result as the second argument
86-
- There is no need to type `|>` over and over
87-
- Any line can be commented out or in without breaking syntax, there is no problem with dangling `|>` symbols
88-
- The state of the pipeline can easily be checked with the `@aside` macro
89-
- The `begin ... end` block marks very clearly where the macro is applied and works well with auto-indentation
90-
- Because everything is just lines with separate expressions and not one huge function call, IDEs can show exactly in which line errors happened
91-
- Pipe is a name defined by Base Julia which can lead to conflicts
83+
@chain a begin
84+
b
85+
c
86+
d
87+
end e f
88+
```
9289

93-
## Example
90+
**Rule 2**
9491

95-
An example with a DataFrame:
92+
Any expression but the first (in the flattened representation) will have the preceding result
93+
inserted as its first argument, unless at least one underscore `_` is present.
94+
In that case, all underscores will be replaced with the preceding result.
9695

97-
```julia
98-
using DataFrames, Chain
96+
If the expression is a symbol, the symbol is treated equivalently to a function call.
9997

100-
df = DataFrame(group = [1, 2, 1, 2, missing], weight = [1, 3, 5, 7, missing])
98+
For example, the following code block
10199

102-
result = @chain df begin
103-
dropmissing
104-
filter(r -> r.weight < 6, _)
105-
groupby(:group)
106-
combine(:weight => sum => :total_weight)
100+
```julia
101+
@chain begin
102+
x
103+
f()
104+
@g()
105+
h
106+
@i
107+
j(123, _)
108+
k(_, 123, _)
107109
end
108110
```
109111

110-
The chain block is equivalent to this:
112+
is equivalent to
111113

112114
```julia
113-
result = begin
114-
local var"##1" = dropmissing(df)
115-
local var"##2" = filter(r -> r.weight < 6, var"##1")
116-
local var"##3" = groupby(var"##2", :group)
117-
local var"##4" = combine(var"##3", :weight => sum => :total_weight)
115+
begin
116+
local temp1 = f(x)
117+
local temp2 = @g(temp1)
118+
local temp3 = h(temp2)
119+
local temp4 = @i(temp3)
120+
local temp5 = j(123, temp4)
121+
local temp6 = k(temp5, 123, temp5)
118122
end
119123
```
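
The insertion rule for function calls and symbols can be sketched on plain `Expr` objects. The functions `has_underscore`, `replace_underscores` and `insert_previous` below are hypothetical names chosen for illustration. The sketch deliberately leaves out macro calls, broadcasting and keyword arguments, so it should not be read as Chain.jl's implementation.

```julia
# Hypothetical sketch of Rule 2, limited to symbols and plain function calls.

# True if the expression contains an underscore placeholder anywhere.
has_underscore(ex) = ex === :_ || (ex isa Expr && any(has_underscore, ex.args))

# Replace every `_` in the expression with `prev`.
function replace_underscores(ex, prev)
    ex === :_ && return prev
    ex isa Expr || return ex
    return Expr(ex.head, [replace_underscores(arg, prev) for arg in ex.args]...)
end

function insert_previous(ex, prev)
    if has_underscore(ex)
        # At least one underscore: replace all of them, insert nothing else.
        return replace_underscores(ex, prev)
    elseif ex isa Symbol
        # A bare symbol is treated like a call with no arguments.
        return Expr(:call, ex, prev)
    elseif ex isa Expr && ex.head == :call
        # No underscore: the previous result becomes the first argument.
        return Expr(:call, ex.args[1], prev, ex.args[2:end]...)
    else
        error("this sketch cannot insert the previous result into: $ex")
    end
end

insert_previous(:h, :temp2)               # :(h(temp2))
insert_previous(:(f()), :x)               # :(f(x))
insert_previous(:(j(123, _)), :temp4)     # :(j(123, temp4))
insert_previous(:(k(_, 123, _)), :temp5)  # :(k(temp5, 123, temp5))
```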
120124

121-
## Alternative one-argument syntax
125+
**Rule 3**
122126

123-
If your initial argument name is long and / or the chain's result is assigned to a long variable, it can look cleaner if the initial value is moved into the chain.
124-
Here is such a long expression:
127+
An expression that begins with `@aside` does not pass its result on to the following expression.
128+
Instead, the result of the previous expression will be passed on.
129+
This is meant for inspecting the state of the chain.
130+
The expression within `@aside` will not get the previous result auto-inserted; you can use
131+
underscores to reference it.
125132

126133
```julia
127-
a_long_result_variable_name = @chain a_long_input_variable_name begin
128-
do_something
129-
do_something_else(parameter)
130-
do_other_thing(parameter, _)
134+
@chain begin
135+
[1, 2, 3]
136+
filter(isodd, _)
137+
@aside @info "There are $(length(_)) elements after filtering"
138+
sum
131139
end
132140
```
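
Following the rules as stated, the `@aside` chain above should behave like the hand-written block below. The names `temp1` to `temp3` are placeholders chosen for this sketch, and the trailing `temp3` line is added only to make the block's value explicit.

```julia
# Hand-written equivalent of the `@aside` chain, with placeholder names.
result = begin
    local temp1 = [1, 2, 3]
    local temp2 = filter(isodd, temp1)
    # The @aside step sees temp2 in place of `_`; its own result is discarded.
    @info "There are $(length(temp2)) elements after filtering"
    # The step after @aside receives temp2, not the return value of @info.
    local temp3 = sum(temp2)
    temp3
end

result  # 4
```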
133141

134-
This is equivalent to the following expression:
142+
**Rule 4**
143+
144+
It is allowed to start an expression with a variable assignment.
145+
In this case, the usual insertion rules apply to the right-hand side of that assignment.
146+
This can be used to store intermediate results.
135147

136148
```julia
137-
a_long_result_variable_name = @chain begin
138-
a_long_input_variable_name
139-
do_something
140-
do_something_else(parameter)
141-
do_other_thing(parameter, _)
149+
@chain begin
150+
[1, 2, 3]
151+
filtered = filter(isodd, _)
152+
sum
142153
end
154+
155+
filtered == [1, 3]
143156
```
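
As a rough illustration of Rule 4, the chain above can be written out by hand as follows. The `temp` names are placeholders for this sketch, and the exact form of the code the macro generates may differ.

```julia
# Hand-written equivalent of the chain with an assignment step.
result = begin
    local temp1 = [1, 2, 3]
    # The insertion rule applies to the right-hand side of the assignment;
    # `filtered` stays available after the chain has finished.
    filtered = filter(isodd, temp1)
    local temp2 = filtered
    local temp3 = sum(temp2)
    temp3
end

filtered == [1, 3]  # true
result              # 4
```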
144157

145-
## One-liner syntax
158+
**Rule 5**
146159

147-
You can also use `@chain` as a one-liner, where no begin-end block is necessary.
148-
This works well for short sequences that are still easy to parse visually without being on separate lines.
160+
The `@.` macro may be used with a symbol to broadcast that function over the preceding result.
149161

150162
```julia
151-
@chain 1:10 filter(isodd, _) sum sqrt
163+
@chain begin
164+
[1, 2, 3]
165+
@. sqrt
166+
end
152167
```
153168

154-
## Variable assignments in the chain
155-
156-
You can prefix any of the expressions that Chain.jl can handle with a variable assignment.
157-
The previous value will be spliced into the right-hand-side expression and the result will be available afterwards under the chosen variable name.
169+
is equivalent to
158170

159171
```julia
160-
@chain 1:10 begin
161-
_ * 3
162-
filtered = filter(iseven, _)
163-
sum
172+
@chain begin
173+
[1, 2, 3]
174+
sqrt.(_)
164175
end
165-
166-
filtered == [6, 12, 18, 24, 30]
167176
```
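
For readers curious how such a rewrite can be expressed, a dotted call like `sqrt.(x)` is represented in Julia as `Expr(:., :sqrt, Expr(:tuple, :x))`. The helper `broadcast_call` below is a hypothetical name used only for this sketch.

```julia
# Hypothetical helper: build the broadcast call `f.(prev)` for a symbol `f`.
broadcast_call(f::Symbol, prev) = Expr(:., f, Expr(:tuple, prev))

ex = broadcast_call(:sqrt, :prev)
ex == :(sqrt.(prev))  # true
```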
168177

169-
## The `@aside` macro
170178

171-
For debugging, it's often useful to look at values in the middle of a pipeline.
172-
You can use the `@aside` macro to mark expressions that should not pass on their result.
173-
For these expressions there is no implicit first argument spliced in if there is no `_`, because that would be impractical for most purposes.
179+
## Motivation
174180

175-
If for example, we wanted to know how many groups were created after step 3, we could do this:
181+
- The implicit first argument insertion is useful for many data pipeline scenarios, like `groupby`, `transform` and `combine` in DataFrames.jl
182+
- The `_` syntax is there to either increase legibility or to use functions like `filter` or `map` which need the previous result as the second argument
183+
- There is no need to type `|>` over and over
184+
- Any line can be commented out or in without breaking syntax; there is no problem with dangling `|>` symbols
185+
- The state of the pipeline can easily be checked with the `@aside` macro
186+
- Flattening of `begin ... end` blocks allows you to split your chain over multiple lines
187+
- Because everything is just lines with separate expressions and not one huge function call, IDEs can show exactly in which line errors happened
188+
- Pipe is a name defined by Base Julia which can lead to conflicts
189+
190+
## Example
191+
192+
An example with a DataFrame:
176193

177194
```julia
195+
using DataFrames, Chain
196+
197+
df = DataFrame(group = [1, 2, 1, 2, missing], weight = [1, 3, 5, 7, missing])
198+
178199
result = @chain df begin
179200
dropmissing
180201
filter(r -> r.weight < 6, _)
181202
groupby(:group)
182-
@aside println("There are $(length(_)) groups after step 3.")
183203
combine(:weight => sum => :total_weight)
184204
end
185205
```
186206

187-
Which is again equivalent to this:
207+
The chain block is equivalent to this:
188208

189209
```julia
190210
result = begin
191211
local var"##1" = dropmissing(df)
192212
local var"##2" = filter(r -> r.weight < 6, var"##1")
193213
local var"##3" = groupby(var"##2", :group)
194-
println("There are $(length(var"##3")) groups after step 3.")
195214
local var"##4" = combine(var"##3", :weight => sum => :total_weight)
196215
end
197216
```
@@ -214,27 +233,3 @@ You can use this, for example, in combination with the `@aside` macro if you nee
214233
combine(:weight => sum => :total_weight)
215234
end
216235
```
217-
218-
## Rewriting Rules
219-
220-
Here is a list of equivalent expressions, where `_` is replaced by `prev` and the new variable is `next`.
221-
In reality, each new variable simply gets a new name via `gensym`, which is guaranteed not to conflict with anything else.
222-
223-
| **Before** | **After** | **Comment** |
224-
| :-- | :-- | :-- |
225-
| `sum` | `next = sum(prev)` | Symbol gets expanded into function call |
226-
| `sum()` | `next = sum(prev)` | First argument is inserted |
227-
| `sum(_)` | `next = sum(prev)` | Call expression gets `_` replaced |
228-
| `_ + 3` | `next = prev + 3` | Infix call expressions work the same way as other calls |
229-
| `+(3)` | `next = prev + 3` | Infix notation with _ would look better, but this is also possible |
230-
| `1 + 2` | `next = prev + 1 + 2` | This might feel weird, but `1 + 2` is a normal call expression |
231-
| `filter(isodd, _)` | `next = filter(isodd, prev)` | Underscore can go anywhere |
232-
| `@aside println(_)` | `println(prev)` | `println` without affecting the pipeline; using `_` |
233-
| `@aside println("hello")` | `println("hello")` | `println` without affecting the pipeline; no implicit first arg |
234-
| `@. sin` | `next = sin.(prev)` | Special-cased alternative to `sin.()` |
235-
| `sin.()` | `next = sin.(prev)` | First argument is prepended for broadcast calls as well |
236-
| `somefunc.(x)` | `next = somefunc.(prev, x)` | First argument is prepended for broadcast calls as well |
237-
| `@somemacro` | `next = @somemacro(prev)` | Macro calls without arguments get an argument spliced in |
238-
| `@somemacro(x)` | `next = @somemacro(prev, x)` | First argument splicing is the same as with functions |
239-
| `@somemacro(x, _)` | `next = @somemacro(x, prev)` | Also underscore behavior |
240-
