File: _episodes/06-free-text.md
Lines changed: 28 additions & 2 deletions
Original file line number
Diff line number
Diff line change
@@ -147,11 +147,37 @@ The third part uses `uniq`, another new command, in combination with the `-c` fl
147
147
The fourth and final part sorts the text again by the counts of duplicates generated in step three.
148
148
149
149
> ## Challenge
150
-
> There are still some remaining punctuation in the text. They are called 'smart' or 'curly' quotes.
150
+
> There are still some remaining punctuation marks in the text. They are called 'smart' or 'curly' quotes.
151
151
> Can you remove them using `sed`?
152
152
>
153
+
> Hint: These quote marks are not among the 128 characters of the ASCII standard,
154
+
> so in the file they are encoded using a different standard, UTF-8.
155
+
> While this is no problem for `sed`, the window you are typing into may not understand UTF-8.
156
+
> If so, you will need to use a Bash script; we encountered these at the end of episode 4,
157
+
> 'Automating the tedious with loops'.
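To see what the hint means by "not among the 128 characters of the ASCII standard", the following is a minimal sketch that compares a straight quote with a curly one. It assumes the script file itself is saved as UTF-8; the variable names `ascii_bytes` and `curly_bytes` are illustrative only and are not part of the lesson.

```shell
#!/bin/bash
# Count the bytes used by one straight (ASCII) quote and one curly quote.
# Assumption: this file is saved with UTF-8 encoding.
ascii_bytes=$(printf '%s' '"' | wc -c | tr -d ' ')
curly_bytes=$(printf '%s' '“' | wc -c | tr -d ' ')
echo "straight quote: $ascii_bytes byte(s); curly quote: $curly_bytes byte(s)"

# Show the raw bytes of the curly quote in hexadecimal.
printf '%s' '“' | od -An -tx1
```

The straight quote occupies a single byte, while the left curly quote is stored as the three-byte sequence `e2 80 9c`. A terminal or editor that does not handle UTF-8 may display or accept those bytes incorrectly, which is why typing them directly at the prompt can fail.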
158
+
>
159
+
> As a reminder, use the text editor of your choice to write a file that looks like this:
160
+
> > ```
161
+
> > #!/bin/bash
162
+
> > # This script removes quote marks from gulliver-clean.txt and saves the result as gulliver-noquotes.txt
163
+
> > (replace this line with your solution)
164
+
> > ```
165
+
> > {: .bash}
166
+
> Save the file as `remove-quotes.sh` and run it from the command line like this:
167
+
> > ```
168
+
> > bash remove-quotes.sh
169
+
> > ```
170
+
> > {: .bash}
171
+
>
153
172
> > ## Solution
154
-
> > This allows us to do some bug fixing and search the internet for answers either using 'sed smart quotes' or 'sed curly quotes' as our keywords to start.
173
+
> > ```
174
+
> > #!/bin/bash
175
+
> > # This script removes quote marks from gulliver-clean.txt and saves the result as gulliver-noquotes.txt
176
+
> > sed -Ee 's/[“”‘’]//g' gulliver-clean.txt > gulliver-noquotes.txt
177
+
> > ```
178
+
> > {: .bash}
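Before running the substitution on the whole book, it may help to try it on a single line. This is a small sketch, not part of the lesson: the `sample` sentence is invented for illustration and is not taken from `gulliver-clean.txt`, and the script is assumed to be saved as UTF-8.

```shell
#!/bin/bash
# Hypothetical sample line containing all four curly quote characters.
sample='“I cannot,” said he, ‘it’s late.’'

# Apply the same substitution as the solution, reading from standard input
# instead of from gulliver-clean.txt.
cleaned=$(printf '%s\n' "$sample" | sed -E 's/[“”‘’]//g')

printf '%s\n' "$cleaned"
# prints: I cannot, said he, its late.
```

Note that the right single curly quote doubles as an apostrophe, so `it’s` becomes `its`. For word counting this is usually acceptable, but it is worth knowing that the expression removes apostrophes as well as quotation marks.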
179
+
> > If this doesn't work for you, you might need to check whether your text editor can