Commit 07c9fe0

Merge pull request #141 from alex-ball/patch-9
Thanks @alex-ball. This is great. Closes #92
2 parents 0366d2d + 06ec389 commit 07c9fe0

1 file changed

+28
-2
lines changed
_episodes/06-free-text.md

Lines changed: 28 additions & 2 deletions
@@ -147,11 +147,37 @@ The third part uses `uniq`, another new command, in combination with the `-c` fl
 The fourth and final part sorts the text again by the counts of duplicates generated in step three.
 
 > ## Challenge
-> There are still some remaining punctuation in the text. They are called 'smart' or 'curly' quotes.
+> There are still some remaining punctuation marks in the text. They are called 'smart' or 'curly' quotes.
 > Can you remove them using `sed`?
 >
+> Hint: These quote marks are not among the 128 characters of the ASCII standard,
+> so in the file they are encoded using a different standard, UTF-8.
+> While this is no problem for `sed`, the window you are typing into may not understand UTF-8.
+> If so you will need to use a Bash script; we encountered these at the end of episode 4,
+> 'Automating the tedious with loops'.
+>
+> As a reminder, use the text editor of your choice to write a file that looks like this:
+> > ```
+> > #!/bin/bash
+> > # This script removes quote marks from gulliver-clean.txt and saves the result as gulliver-noquotes.txt
+> > (replace this line with your solution)
+> > ```
+> > {: .bash}
+> Save the file as `remove-quotes.sh` and run it from the command line like this:
+> > ```
+> > bash remove-quotes.sh
+> > ```
+> > {: .bash}
+>
 > > ## Solution
-> > This allows us to do some bug fixing and search the internet for answers either using 'sed smart quotes' or 'sed curly quotes' as our keywords to start.
+> > ```
+> > #!/bin/bash
+> > # This script removes quote marks from gulliver-clean.txt and saves the result as gulliver-noquotes.txt
+> > sed -Ee 's/[“”‘’]//g' gulliver-clean.txt > gulliver-noquotes.txt
+> > ```
+> > {: .bash}
+> > If this doesn't work for you, you might need to check whether your text editor can
+> > save files using the UTF-8 encoding.
 > {: .solution}
 {: .challenge}
 
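For anyone reviewing this change, the `sed` expression from the new solution can be tried without `gulliver-clean.txt`. The following is a minimal sketch, not part of the commit: the `sample` line and the `cleaned` variable are made up for illustration, and it assumes the script is saved as UTF-8 so that the curly quotes survive.

```bash
#!/bin/bash
# Hypothetical sample text; NOT taken from gulliver-clean.txt.
sample='“I said ‘hello’ to him,” she wrote.'

# Same sed expression as in the solution, fed through a pipe instead of a file.
# The bracket expression lists the four curly quote characters to delete.
cleaned=$(printf '%s\n' "$sample" | sed -Ee 's/[“”‘’]//g')

# Show the line with the curly quotes stripped.
printf '%s\n' "$cleaned"
```

If the quotes are still present in the output, the likely culprits are the ones the lesson text already names: the editor did not save the script as UTF-8, or the terminal does not handle UTF-8 input.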
