Commit 0a46c31

zkamvar authored and Carpentries Apprentice committed
[custom] fix lesson contents
1 parent 5cf8b91 commit 0a46c31

3 files changed

+7
-19
lines changed

episodes/02-match-extract-strings.md

+4-14
@@ -21,7 +21,7 @@ exercises: 30

 For this exercise, open a browser and go to [https://regex101.com](https://regex101.com). Regex101.com is a free regular expression debugger with real time explanation, error detection, and highlighting.

-Open the [swcCoC.md file](https://github.com/LibraryCarpentry/lc-data-intro/tree/gh-pages/data/swcCoC.md), copy the text, and paste that into the test string box.
+Open the [swcCoC.md file](https://github.com/LibraryCarpentry/lc-data-intro/tree/main/episodes/data/swcCoC.md), copy the text, and paste that into the test string box.

 For a quick test to see if it is working, type the string `community` into the regular expression box.

@@ -139,7 +139,7 @@ Find all of the words starting with Comm or comm that are plural.

 For this exercise, open a browser and go to [https://regex101.com](https://regex101.com).

-Open the [swcCoC.md file](https://github.com/LibraryCarpentry/lc-data-intro/tree/gh-pages/data/swcCoC.md), copy it, and paste it into the test string box.
+Open the [swcCoC.md file](https://github.com/LibraryCarpentry/lc-data-intro/tree/main/episodes/data/swcCoC.md), copy it, and paste it into the test string box.

 ::::::::::::::::::::::::::::::::::::::: challenge

@@ -253,8 +253,6 @@ Start with what we know, which is the most basic format of a phone number: three

 This expression should find three matches in the document.

-
-
 :::::::::::::::::::::::::

 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -285,8 +283,6 @@ Start with what we know, which is the most basic format of a phone number: three

 This expression should find one match in the document

-
-
 :::::::::::::::::::::::::

 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -319,8 +315,6 @@ See the previous exercise for the explanation of the rest of the expression.

 This expression should find two matches in the document.

-
-
 :::::::::::::::::::::::::

 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -351,8 +345,6 @@ See the previous exercise for the explanation of the rest of the expression.

 This expression should find one match in the document.

-
-
 :::::::::::::::::::::::::

 ::::::::::::::::::::::::::::::::::::::::::::::::::
@@ -361,7 +353,7 @@ This expression should find one match in the document.

 ### Using regular expressions when working with files and directories

-One of the reasons we stress the value of consistent and predictable directory and filenaming conventions is that working in this way enables you to use the computer to select files based on the characteristics of their file names. For example, if you have a bunch of files where the first four digits are the year and you only want to do something with files from '2017', then you can. Or if you have 'journal' somewhere in a filename when you have data about journals, you can use the computer to select just those files. Equally, using plain text formats means that you can go further and select files or elements of files based on characteristics of the data *within* those files. See Workshop Overview: [File Naming \& Formatting](https://librarycarpentry.org/lc-overview/06-file-naming-formatting/index.html) for further background.
+One of the reasons we stress the value of consistent and predictable directory and filenaming conventions is that working in this way enables you to use the computer to select files based on the characteristics of their file names. For example, if you have a bunch of files where the first four digits are the year and you only want to do something with files from '2017', then you can. Or if you have 'journal' somewhere in a filename when you have data about journals, you can use the computer to select just those files. Equally, using plain text formats means that you can go further and select files or elements of files based on characteristics of the data *within* those files. See Workshop Overview: [File Naming \& Formatting](https://librarycarpentry.org/lc-overview/06-file-naming-formatting) for further background.

 ::::::::::::::::::::::::::::::::::::::::::::::::::
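
Note on the hunk above: the paragraph about selecting files by name can be illustrated with a small sketch. This is not part of the commit or the lesson; the file names are invented for illustration, and Python's `re` module is assumed as the regex engine.

```python
import re

# Hypothetical file names, invented for this illustration only.
filenames = [
    "2017-01-journal-circulation.csv",
    "2017-02-ebook-loans.csv",
    "2016-12-journal-circulation.csv",
    "notes.txt",
]

# Files whose name starts with the four digits 2017.
starts_with_2017 = re.compile(r"^2017")
# Files with 'journal' anywhere in the name.
mentions_journal = re.compile(r"journal")

files_2017 = [name for name in filenames if starts_with_2017.search(name)]
journal_files = [name for name in filenames if mentions_journal.search(name)]

print(files_2017)
print(journal_files)
```

The two patterns only work because the names follow a predictable convention (year first, topic spelled the same way each time), which is the point the paragraph makes.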

@@ -371,7 +363,7 @@ One of the reasons we stress the value of consistent and predictable directory a

 ### Extracting a substring in Google Sheets using regex

-1. Export and unzip the [2017 Public Library Survey](https://github.com/LibraryCarpentry/lc-data-intro/blob/gh-pages/files/PLS_FY17.zip) (originally from the IMLS data site) as a CSV file.
+1. Export and unzip the [2017 Public Library Survey](https://github.com/LibraryCarpentry/lc-data-intro/blob/main/episodes/files/PLS_FY17.zip) (originally from the IMLS data site) as a CSV file.
 2. Upload the CSV file to Google Sheets and open as a Google Sheet if it does not do this by default.
 3. Look in the `ADDRESS` column and notice that the values contain the latitude and longitude in parenthesis after the library address.
 4. Construct a regular expression to match and extract the latitude and longitude into a new column named 'latlong'. HINT: Look up the function `REGEXEXTRACT` in Google Sheets. That function expects the first argument to be a string (a cell in `ADDRESS` column) and a quoted regular expression in the second.
@@ -388,8 +380,6 @@ This is one way to solve this challenge. You might have found others. Inside the

 Latitude and longitude are in decimal degree format and can be positive or negative, so we start with an optional dash for negative values then use `\d+` for a one or more digit match followed by a period `\.`. Note we had to escape the period using `\`. After the period we look for one or more digits `\d+` again followed by a literal comma `,`. We then have a literal space match followed by an optional dash `-` (there are few `0.0` latitude/longitudes that are probably errors, but we'd want to retain so we can deal with them). We then repeat our `\d+\.\d+` we used for the latitude match.

-
-
 :::::::::::::::::::::::::

 ::::::::::::::::::::::::::::::::::::::::::::::::::
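
Note on the hunk above: the context paragraph describes the latitude/longitude expression piece by piece. A minimal sketch of that description follows, assuming Python's `re` module instead of Google Sheets' `REGEXEXTRACT`. The pattern is reconstructed from the prose only, and the sample address is invented, not taken from the survey data.

```python
import re

# Pattern assembled from the prose description: optional dash, digits,
# escaped period, digits, literal comma and space, then the same again.
# This is a reconstruction, not a copy of the lesson's own solution.
LATLONG = re.compile(r"-?\d+\.\d+, -?\d+\.\d+")


def extract_latlong(address):
    """Return the first 'lat, long' substring in address, or None."""
    match = LATLONG.search(address)
    return match.group(0) if match else None


# Hypothetical ADDRESS value, invented for illustration.
sample = "123 EXAMPLE ST, SPRINGFIELD, XX 00000 (39.7817, -89.6501)"
print(extract_latlong(sample))
```

Because both dashes are optional, the same pattern accepts positive and negative values, including the `0.0` entries the paragraph mentions.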

index.md

+1-3
@@ -9,8 +9,7 @@ This Library Carpentry lesson introduces people with library- and information-re

 ## Teaching this lesson

-This lesson is taught either as a combination with the episodes [Jargon Busting](https://librarycarpentry.org/lc-overview/03-jargon-busting/index.html) and [A Computational Approach](https://librarycarpentry.org/lc-overview/04-computational-approach/index.html) in [Workshop Overview](https://librarycarpentry.org/lc-overview/) (with the possibility of adding optional episodes from Workshop Overview), as part of a self-organised mix-and-match training, or separately as an individual lesson.
-
+This lesson is taught either as a combination with the episodes [Jargon Busting](https://librarycarpentry.org/lc-overview/03-jargon-busting) and [A Computational Approach](https://librarycarpentry.org/lc-overview/04-computational-approach) in [Workshop Overview](https://librarycarpentry.org/lc-overview/) (with the possibility of adding optional episodes from Workshop Overview), as part of a self-organised mix-and-match training, or separately as an individual lesson.

 ::::::::::::::::::::::::::::::::::::::::::::::::::

@@ -20,7 +19,6 @@ This lesson is taught either as a combination with the episodes [Jargon Busting]

 This lesson has no prerequisites. You will need a laptop and an internet connection to complete Episode 2.

-
 ::::::::::::::::::::::::::::::::::::::::::::::::::


instructors/instructor-notes.md

+2-2
@@ -18,11 +18,11 @@ To make a handout for this lesson, adapt/print from [https://librarycarpentry.or

 To teach regular expressions, instructors have used:

-- [slides](https://github.com/LibraryCarpentry/lc-data-intro/blob/gh-pages/files/regexslides.pdf) to quiz the audience on examples.
+- [slides](https://github.com/LibraryCarpentry/lc-data-intro/blob/main/episodes/files/regexslides.pdf) to quiz the audience on examples.
 - Pen and paper, to work through exercises before using a tool and to explain that there can be multiple answers to the same question.
 - Whiteboard with text examples and quized participants on regex approaches.
 - Online tools such as: [Regxr](https://regexr.com/), [regex101](https://regex101.com/), [rexegper](https://regexper.com/), [myregexp](https://myregexp.com/), or whichever service you prefer.
-- Used quiz/exercise files in [https://github.com/LibraryCarpentry/lc-data-intro/tree/gh-pages/files](https://github.com/LibraryCarpentry/lc-data-intro/tree/gh-pages/files).
+- Used quiz/exercise files in [episodes/files](https://github.com/LibraryCarpentry/lc-data-intro/tree/main/episodes/files).

 General guidance:
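
Note on the link changes in this commit: most edited lines follow two patterns. Paths under `gh-pages/data` and `gh-pages/files` move to `main/episodes/`, and librarycarpentry.org links drop a trailing `/index.html`. The sketch below expresses those patterns as regular expression substitutions in Python. The rules are inferred from the changed lines and are an assumption; nothing here is claimed to be the tooling that produced the commit, and `rewrite_links` is a hypothetical helper name.

```python
import re

# Substitution rules inferred from the diff; treat them as assumptions.
RULES = [
    # .../tree|blob/gh-pages/data|files/...  ->  .../main/episodes/data|files/...
    (
        re.compile(r"(lc-data-intro/(?:tree|blob))/gh-pages/(data|files)\b"),
        r"\1/main/episodes/\2",
    ),
    # librarycarpentry.org/<path>/index.html  ->  librarycarpentry.org/<path>
    (
        re.compile(r"(librarycarpentry\.org/\S+?)/index\.html"),
        r"\1",
    ),
]


def rewrite_links(text):
    """Apply each inferred rule to text in order and return the result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


old = "https://github.com/LibraryCarpentry/lc-data-intro/blob/gh-pages/files/PLS_FY17.zip"
print(rewrite_links(old))
```

The sketch does not cover the one change of link text in the last hunk, where the visible label became `episodes/files`.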
