|
| 1 | +# Extended Mode |
| 2 | + |
| 3 | +Extended mode allows for more complex patterns and rules than the basic functionality. |
| 4 | +It adds support for multiline and partial-line rules. |
| 5 | + |
| 6 | +It can be configured to make modifications only before the first non-matching line of code, or throughout the code. |
| 7 | +In either scenario, it will return 10 lines of code. |
| 8 | + |
| 9 | +## Usage |
| 10 | + |
| 11 | +To enable these more advanced rules, add `!e` to the beginning of your language's config file. |
| 12 | + |
| 13 | +For example: |
| 14 | + |
| 15 | +```txt |
| 16 | +!e |
| 17 | + |
| 18 | +rule... |
| 19 | +rule.. |
| 20 | +``` |
| 21 | + |
| 22 | +### Flags |
| 23 | + |
| 24 | +You can pass extra flags at the beginning of the file to enable specific functionality. |
| 25 | + |
| 26 | +There is currently one extra flag: `stop_at_first_loc`. |
| 27 | + |
| 28 | +This flag forces the parser to immediately return the next 10 lines of code once it finds any line that is not stripped out. |
| 29 | +By default, the extended parser will try to remove all comments from your code, but for many languages this produces too many false strips and so is problematic. |
| 30 | + |
| 31 | +To use this, add `stop_at_first_loc` after the `!e`: |
| 32 | + |
| 33 | +```txt |
| 34 | +!e stop_at_first_loc |
| 35 | + |
| 36 | +rule... |
| 37 | +rule... |
| 38 | +``` |
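
As a rough illustration, the behaviour of `stop_at_first_loc` could be sketched in Python as follows. The function name `snippet_from_first_loc` and the `is_stripped` predicate are hypothetical stand-ins for the parser's rule matching, and the sketch assumes that the 10 returned lines start at the first line that is kept.

```python
from typing import Callable, List

SNIPPET_LENGTH = 10  # this document states that 10 lines of code are returned


def snippet_from_first_loc(lines: List[str], is_stripped: Callable[[str], bool]) -> List[str]:
    """Return up to SNIPPET_LENGTH lines, starting at the first line that is not stripped."""
    for index, line in enumerate(lines):
        if not is_stripped(line):
            # Stop applying rules here and return the following lines untouched.
            return lines[index:index + SNIPPET_LENGTH]
    return []  # every line was stripped out
```

With this approach a trailing comment on a later line is left in place, because no rule is applied after the first kept line.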
| 39 | + |
| 40 | +### Rules |
| 41 | + |
| 42 | +As with the basic version, one rule is specified per line. |
| 43 | +The extended version adds several modifiers that can be used to specify more advanced behaviour. |
| 44 | + |
| 45 | +Unlike the basic version, rules apply anywhere within the line, not just to characters at the start. |
| 46 | +For example, the rule `#` would change the following: |
| 47 | + |
| 48 | +```ruby |
| 49 | +# Some comment |
| 50 | +10 + 20 # Some other comment |
| 51 | +``` |
| 52 | + |
| 53 | +to: |
| 54 | + |
| 55 | +```ruby |
| 56 | +10 + 20 |
| 58 | +``` |
| 59 | + |
| 60 | +This can be problematic (consider that `#` is used for interpolation in Ruby), which is why you may want to use the `stop_at_first_loc` flag explained above. |
| 61 | + |
| 62 | +### Basic Modifiers |
| 63 | + |
| 64 | +The first set of modifiers handles how a token is found and dealt with. |
| 65 | + |
| 66 | +As seen above, the most basic rule is a string that you want to match. |
| 67 | +This will look for that string separated from others by whitespace or line delimiters, and remove the matching string and anything that follows. |
| 68 | + |
| 69 | +For example, given a config file such as... |
| 70 | + |
| 71 | +``` |
| 72 | +!e |
| 73 | +' |
| 74 | +``` |
| 75 | + |
| 76 | +... and a snippet such as this... |
| 77 | + |
| 78 | +``` |
| 79 | +' Delete me |
| 80 | +Please ' delete me |
| 81 | +I'm not deletable |
| 82 | +``` |
| 83 | + |
| 84 | +... we would get the following output: |
| 85 | + |
| 86 | +``` |
| 87 | +Please |
| 88 | +I'm not deletable |
| 89 | +``` |
| 90 | + |
| 91 | +To match without these whitespace restrictions we can use the `\p` modifier. |
| 92 | +Changing the config to... |
| 93 | + |
| 94 | +``` |
| 95 | +!e |
| 96 | +'\p |
| 97 | +``` |
| 98 | + |
| 99 | +would then result in: |
| 100 | + |
| 101 | +``` |
| 102 | +Please |
| 103 | +I |
| 104 | +``` |
| 105 | + |
| 106 | +We can also choose to only remove the offending chars/strings using the `\j` modifier. |
| 107 | +Changing the config to... |
| 108 | + |
| 109 | +``` |
| 110 | +!e |
| 111 | +'\j |
| 112 | +``` |
| 113 | + |
| 114 | +would result in: |
| 115 | + |
| 116 | +``` |
| 117 | +Delete me |
| 118 | +Please delete me |
| 119 | +I'm not deletable |
| 120 | +``` |
| 121 | + |
| 122 | +And we can chain those modifiers together with a config such as: |
| 123 | + |
| 124 | +``` |
| 125 | +!e |
| 126 | +'\pj |
| 127 | +``` |
| 128 | + |
| 129 | +Which would result in (note the change in the bottom line): |
| 130 | + |
| 131 | +``` |
| 132 | +Delete me |
| 133 | +Please delete me |
| 134 | +Im not deletable |
| 135 | +``` |
| 136 | + |
| 137 | +This table summarises the examples above: |
| 138 | + |
| 139 | +| Rule | Requires whitespace | Removes subsequent chars | |
| 140 | +| ------ | ------------------- | ------------------------ | |
| 141 | +| `'` | Yes | Yes | |
| 142 | +| `'\p` | No | Yes | |
| 143 | +| `'\j` | Yes | No | |
| 144 | +| `'\pj` | No | No | |
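
The table above can be approximated with a short Python sketch. This is not the extractor's actual implementation: the names `apply_rule` and `strip_snippet` are made up for illustration, and it assumes that only the first match on a line is acted on and that lines left empty are dropped from the output.

```python
import re
from typing import List


def apply_rule(line: str, token: str, partial: bool = False, just: bool = False) -> str:
    """Apply one simple rule to one line.

    partial -- mirrors the `\\p` modifier: match without whitespace boundaries.
    just    -- mirrors the `\\j` modifier: remove only the matched token.
    """
    pattern = re.escape(token)
    if not partial:
        # Whole-word match: the token must be bounded by whitespace or line ends.
        pattern = r"(?<!\S)" + pattern + r"(?!\S)"
    match = re.search(pattern, line)
    if match is None:
        return line
    if just:
        return line[:match.start()] + line[match.end():]
    return line[:match.start()]  # drop the token and everything after it


def strip_snippet(lines: List[str], token: str, partial: bool = False, just: bool = False) -> List[str]:
    """Apply the rule to every line and drop lines that end up blank (an assumption)."""
    cleaned = (apply_rule(line, token, partial, just) for line in lines)
    return [line for line in cleaned if line.strip()]
```

Calling `strip_snippet` on the three-line snippet above with `token="'"` and each combination of `partial` and `just` gives the four results shown in this section, up to differences in surrounding whitespace.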
| 145 | + |
| 146 | +### Repeating Modifier |
| 147 | + |
| 148 | +You can use a `+` after a character to mark it for `2..n` repetition. |
| 149 | +This is useful for comment rules where a variable number of one symbol is allowed before another. |
| 150 | + |
| 151 | +For example, if you want to match `/*****/` in Java, you could use the rule: |
| 152 | + |
| 153 | +```text |
| 154 | +!e |
| 155 | +/*+/ |
| 156 | +``` |
| 157 | + |
| 158 | +Or in Nim, if you wanted to match `###[...`, you could use: |
| 159 | + |
| 160 | +```text |
| 161 | +!e |
| 162 | +#+[\p |
| 163 | +``` |
| 164 | + |
| 165 | +These rules do **not** match the single version (they are `2..n`, not `1..n`) so please specify an extra explicit rule for the single scenario if needed. |
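
One way to picture the `2..n` semantics is as a translation of the token into a regular expression. The Python sketch below is only an illustration: `rule_to_regex` is a hypothetical helper, it ignores modifiers such as `\p`, and it treats every `+` that follows a character as the repeat marker.

```python
import re


def rule_to_regex(token: str) -> str:
    """Translate a token with `+` repeat markers into a regular expression."""
    if token.startswith("+"):
        # A repeat marker needs a preceding character to repeat.
        raise ValueError("nothing to repeat before '+'")
    parts = []
    i = 0
    while i < len(token):
        char = token[i]
        if i + 1 < len(token) and token[i + 1] == "+":
            parts.append(re.escape(char) + "{2,}")  # two or more repetitions
            i += 2
        else:
            parts.append(re.escape(char))
            i += 1
    return "".join(parts)
```

Under these assumptions, `rule_to_regex("/*+/")` yields a pattern that matches `/**/` and `/*****/` but not `/*/`, which is why the single case needs its own rule.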
| 166 | + |
| 167 | +### Multiline Magic Modifier |
| 168 | + |
| 169 | +The real power of all these rules comes when the multiline modifier is added. |
| 170 | + |
| 171 | +The multiline modifier is `-->>`. |
| 172 | +It can be added between two rules to mark the rule as multiline. |
| 173 | +All the text between the two rules will be skipped, plus all the text the end rule would skip normally. |
| 174 | + |
| 175 | +For example... |
| 176 | + |
| 177 | +``` |
| 178 | +!e |
| 179 | +/*\p-->>*/\p |
| 180 | +``` |
| 181 | + |
| 182 | +would remove: |
| 183 | + |
| 184 | +```csharp |
| 185 | +/* This is a nice |
| 186 | +multiline |
| 187 | +comment */ |
| 188 | +``` |
| 189 | + |
| 190 | +Combined with the `\j` modifier, this also works well for lines that have trailing code. |
| 191 | +For example, adding `\j` to the end rule above... |
| 192 | + |
| 193 | +``` |
| 194 | +!e |
| 195 | +/*\p-->>*/\pj |
| 196 | +``` |
| 197 | + |
| 198 | +...with this code... |
| 199 | + |
| 200 | +```javascript |
| 201 | +/* This is a nice |
| 202 | +multiline |
| 203 | +comment */ const n = 15; |
| 204 | +``` |
| 205 | + |
| 206 | +would give us: |
| 207 | + |
| 208 | +```javascript |
| 209 | +const n = 15; |
| 210 | +``` |
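
The multiline behaviour can be sketched as a small state machine that tracks whether the parser is currently inside a skipped region. The Python below is an illustrative approximation, not the real parser: `strip_multiline` is a hypothetical name, both tokens are matched as partial (`\p`) tokens, text before the start token on the same line is kept, and lines left blank are dropped.

```python
from typing import List


def strip_multiline(lines: List[str], start: str, end: str, end_just: bool = False) -> List[str]:
    """Skip everything between `start` and `end`.

    end_just -- mirrors `\\j` on the end rule: keep the text that follows the
                end token instead of dropping the rest of that line.
    """
    result = []
    inside = False  # True while we are between the start and end tokens
    for line in lines:
        kept = ""
        rest = line
        while True:
            if inside:
                idx = rest.find(end)
                if idx == -1:
                    break  # the remainder of this line is still being skipped
                inside = False
                if not end_just:
                    break  # the end rule also drops the rest of the line
                rest = rest[idx + len(end):]
            else:
                idx = rest.find(start)
                if idx == -1:
                    kept += rest  # no further start token: keep the remainder
                    break
                kept += rest[:idx]
                rest = rest[idx + len(start):]
                inside = True
        if kept.strip():
            result.append(kept)
    return result
```

Called with `start="/*"`, `end="*/"` and `end_just=True` on the JavaScript snippet above, this keeps only the `const n = 15;` part (with its leading space); with `end_just=False` nothing is kept.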
| 211 | + |
| 212 | +Rule precedence is determined by earliest match. |
| 213 | +This means that whole-word rules usually have higher precedence, because they start matching from the preceding space. |
| 214 | + |
| 215 | +## Examples |
| 216 | + |
| 217 | +Each example has three blocks: |
| 218 | + |
| 219 | +- Rules |
| 220 | +- Input |
| 221 | +- Output |
| 222 | + |
| 223 | +### Example 1 |
| 224 | + |
| 225 | +``` |
| 226 | +!e |
| 227 | +/*\p-->>*/\pj |
| 228 | +``` |
| 229 | + |
| 230 | +``` |
| 231 | +/*Some comment I wanted to add in case |
| 232 | +that someone wants to read it*/def solve(data): |
| 233 | +``` |
| 234 | + |
| 235 | +``` |
| 236 | +def solve(data): |
| 237 | +``` |
| 238 | + |
| 239 | +### Example 2 |
| 240 | + |
| 241 | +``` |
| 242 | +!e |
| 243 | +import-->>from |
| 244 | +``` |
| 245 | + |
| 246 | +``` |
| 247 | +import { |
| 248 | + a, |
| 249 | + b, |
| 250 | + other |
| 251 | +} |
| 252 | +from 'example'; |
| 253 | + |
| 254 | +class Foobar... |
| 255 | +``` |
| 256 | + |
| 257 | +``` |
| 258 | +class Foobar... |
| 259 | +``` |
| 260 | + |
| 261 | +### Example 3 |
| 262 | + |
| 263 | +``` |
| 264 | +!e |
| 265 | +#+[-->>]#+ |
| 266 | +``` |
| 267 | + |
| 268 | +``` |
| 269 | +#[ |
| 270 | + Doc |
| 271 | +]# |
| 272 | +###[More Doc]### |
| 273 | + |
| 274 | +class Foobar... |
| 275 | +``` |
| 276 | + |
| 277 | +``` |
| 278 | +class Foobar... |
| 279 | +``` |
| 280 | + |
| 281 | +## FAQs |
| 282 | + |
| 283 | +### Can I use a literal `\`? |
| 284 | + |
| 285 | +Yes. A trailing `\` with no modifiers after it is valid, so `'\` is the same rule as `'`. |
| 286 | +For example, the rule `\'\\` will match the string `\'\`. |
| 287 | + |
| 288 | +## Current limitations |
| 289 | + |
| 290 | +All of these are open to future improvements if a track needs it, until we have the representers to back this up. |
| 291 | + |
| 292 | +- Tabs are quirky. |
| 293 | + If you use a multiline comment which ends before a line which has actual code, the tabs will be ignored and the line will appear wrongly indented. |
| 294 | + This is hard to fix without cluttering the code a lot more, and it will not happen often enough to justify the work. |
| 295 | +- As a rule of thumb, any rule clash will not be allowed unless there is a way to be totally sure which one was intended. |
| 296 | +- Rules whose symbol (or start symbol, for multiline rules) coincide are allowed if: |
| 297 | + - Simple rules: they need to use the same action (skip the rest of the line, or skip just the match). |
| 298 | + - Multiline rules: they need to have the same start action. Their end rules will then be merged into the multiline end syntax tree. |
| 299 | +- A repeat character at the beginning of a token will throw an exception, as there is nothing to repeat. |
| 300 | + |
| 301 | +## Improvements |
| 302 | + |
| 303 | +- We can support arguments by supplying them after the `!e` on the first line. |
| 304 | + For example, not limiting ourselves to 10 lines. It might be useful to add extra meta functionality for specific tracks. |
| 305 | +- The repeat character rule is quirky. |
| 306 | + It is the most painful part of the code, but it is needed for some languages. |
| 307 | + Leave it as is? Disable it? Open to suggestions. |
| 308 | +- Should we rstrip all saved lines of code? |
| 309 | + Right now I wanted to be safe and keep the trailing whitespace. |
| 310 | + As a bonus, it let me correctly assess that the matches and skips were what I expected. |
| 311 | + On the site the trailing whitespace will be invisible, but I'm not sure. |
| 312 | + Opinions? |
| 313 | +- Adding optional token syntax. |
| 314 | + It should avoid having to deal with many options for declaring something. |
| 315 | + Something like `@moduledoc [''',"""]\pj-->>[''',"""]\pj` but finding suitable and not conflicting delimiters could be hard. |
| 316 | + Implementation would be simple enough: just explode the options into all possible rules and then flat map the result of the rule parser on it. |
| 317 | + Which characters would make it easy and not conflicting to implement this? |
| 318 | +- Add an option to specify an unmatchable rule that can be used to skip until the end of file in multiline rules. |
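
The "explode the options" idea from the optional token syntax item above could look roughly like this in Python. It is only a sketch of the proposal: `explode_rule` is a hypothetical helper, and it assumes the `[a,b]` delimiters from the example, which would clash with rules that contain a literal `[` or `]`.

```python
import itertools
import re
from typing import List

# Assumed syntax: a bracketed, comma-separated list of alternatives.
OPTION_GROUP = re.compile(r"\[([^\]]*)\]")


def explode_rule(rule: str) -> List[str]:
    """Expand every option group into one concrete rule per combination."""
    # With a capture group, split() alternates literal text and option lists.
    pieces = OPTION_GROUP.split(rule)
    literals = pieces[0::2]
    option_sets = [group.split(",") for group in pieces[1::2]]
    rules = []
    for combo in itertools.product(*option_sets):
        parts = [literals[0]]
        for choice, literal in zip(combo, literals[1:]):
            parts.append(choice)
            parts.append(literal)
        rules.append("".join(parts))
    return rules


if __name__ == "__main__":
    example = "@moduledoc [''',\"\"\"]\\pj-->>[''',\"\"\"]\\pj"
    for concrete_rule in explode_rule(example):
        print(concrete_rule)
```

Each concrete rule would then go through the existing rule parser unchanged, which is what keeps the proposal cheap to implement.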
| 319 | + |
| 320 | +## Closing thoughts |
| 321 | + |
| 322 | +If you find something missing, please open an issue so we can consider including it. |
| 323 | + |
| 324 | +**The representers will end up replacing these parsers in the final launch, so please think of this as a "best effort".** |