Commit 82389b9

Merge pull request #12 from ncclementi/master
Solutions of exercises 1, 2, 3 class 02/18/17
2 parents f1e9980 + 36e7542 commit 82389b9

3 files changed

+364
-0
lines changed

2017-02/18/exercise1_solution.py

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
1+
"""
2+
Introduction to Python
3+
DATE - 18 February 2017
4+
Instructor - Nathan Danielsen @nate_somewhere
5+
https://github.com/ndanielsen/beginning-python/
6+
"""
7+
8+
"""
9+
Exercise 1:
10+
11+
From the following header extract:
12+
1. Extract the title
13+
2. Extract the introduction
14+
3. Extract the author
15+
4. Extract the photographer
16+
17+
*Hint*: Google something called repr and check what `print(repr(header))`
18+
tells you.
19+
20+
Note: For a guided and step by step solution explanation in jupyter notebooks go to:
21+
https://github.com/ncclementi/dcpython_exercises/blob/master/02-18-17_application_real_data.ipynb
22+
23+
You will also find useful links and some extra tips to try in your script.
24+
"""
25+
26+
header = """
27+
My President Was Black
28+
A history of the first African American White House—and of what came next
29+
30+
By Ta-Nehisi Coates
31+
Photograph by Ian Allen
32+
"""
33+
34+
#Using the hint:
35+
#Uncomment the following line to see the output of repr(header)
36+
37+
#print(repr(header))
38+
39+
#Splitting and saving to a list.
40+
header_list = header.split('\n')
41+
42+
#Removing extra white spaces in each element of our list.
43+
for i in range(len(header_list)):
44+
    header_list[i] = header_list[i].strip()
45+
46+
#Removing empty ('') elements.
47+
header_list = list(filter(lambda item: item!='' , header_list))
48+
49+
#Getting title and introduction.
50+
title = header_list[0]
51+
intro = header_list[1]
52+
53+
#Getting author and photographer.
54+
#Here we slice off the leading label ('By ', 'Photograph by ') that we don't want.
55+
author = header_list[2][len('By '):]
56+
photographer = header_list[3][len('Photograph by '):]
57+
58+
#Print information required
59+
print('Title : ', title)
60+
print('Introduction: ', intro)
61+
print('Author : ', author)
62+
print('Photographer: ', photographer)
63+
64+
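
A note on removing a label such as 'By ' from the start of a line: `str.strip` called with an argument treats that argument as a set of characters to trim from both ends, not as a prefix. The sketch below is illustrative only. `drop_label` is a hypothetical helper that is not part of this commit, and the byline 'By Bobby' is made up to show the difference.

```python
def drop_label(line, label):
    """Return line without a leading label; leave it unchanged if the label is absent."""
    # Hypothetical helper for illustration: slicing removes exactly the prefix.
    if line.startswith(label):
        return line[len(label):]
    return line

# str.strip('By ') trims any of the characters 'B', 'y' and ' ' from both ends,
# so a name that begins or ends with one of those characters gets damaged.
damaged = 'By Bobby'.strip('By ')
intact = drop_label('By Bobby', 'By ')

print(repr(damaged))
print(repr(intact))
```

With the header used in this exercise both approaches happen to give the same result, because neither name starts or ends with a character from the label.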

2017-02/18/exercise2_solution.py

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
1+
"""
2+
Introduction to Python
3+
DATE - 18 February 2017
4+
Instructor - Nathan Danielsen @nate_somewhere
5+
https://github.com/ndanielsen/beginning-python/
6+
"""
7+
8+
"""
9+
Exercise 2:
10+
11+
Using the first paragraph:
12+
1. How many words are in the first paragraph?
13+
2. How many sentences are in the first paragraph?
14+
3. How many times does the word 'Obama' appear in the first paragraph?
15+
4. If you remove all of the punctuation and lower case all text, how many words?
16+
17+
18+
Note: For a guided and step by step solution explanation in jupyter notebooks go to:
19+
https://github.com/ncclementi/dcpython_exercises/blob/master/02-18-17_application_real_data.ipynb
20+
21+
You will also find useful links and some extra tips to try in your script.
22+
"""
23+
24+
first_paragraph = """
25+
In the waning days of President Barack Obama’s administration, he and his wife, Michelle, hosted a farewell party, the full import of which no one could then grasp. It was late October, Friday the 21st, and the president had spent many of the previous weeks, as he would spend the two subsequent weeks, campaigning for the Democratic presidential nominee, Hillary Clinton. Things were looking up. Polls in the crucial states of Virginia and Pennsylvania showed Clinton with solid advantages. The formidable GOP strongholds of Georgia and Texas were said to be under threat. The moment seemed to buoy Obama. He had been light on his feet in these last few weeks, cracking jokes at the expense of Republican opponents and laughing off hecklers. At a rally in Orlando on October 28, he greeted a student who would be introducing him by dancing toward her and then noting that the song playing over the loudspeakers—the Gap Band’s “Outstanding”—was older than she was. “This is classic!” he said. Then he flashed the smile that had launched America’s first black presidency, and started dancing again. Three months still remained before Inauguration Day, but staffers had already begun to count down the days. They did this with a mix of pride and longing—like college seniors in early May. They had no sense of the world they were graduating into. None of us did.
26+
"""
27+
28+
#Uncomment the following line to see the output of repr(first_paragraph)
29+
#print(repr(first_paragraph))
30+
31+
# Part 1: How many words are in the first paragraph?
32+
33+
#Splitting and saving to a list.
34+
paragraph_list = first_paragraph.split()
35+
36+
#Uncomment to print what our paragraph_list looks like.
37+
#print(paragraph_list)
38+
39+
40+
#We want to keep first_paragraph without changes, so we create a copy and work with that.
41+
revised_paragraph = first_paragraph
42+
43+
#Replace em dashes (Unicode U+2014, typed with Ctrl+Shift+u then 2014) with spaces.
44+
for element in revised_paragraph:
45+
    if element == '—':
46+
        revised_paragraph = revised_paragraph.replace(element, ' ')
47+
48+
#Uncomment to print what our revised_paragraph looks like.
49+
#print(revised_paragraph)
50+
51+
#Split into words and save to list.
52+
words_list = revised_paragraph.split()
53+
54+
#Count words using len()
55+
words = len(words_list)
56+
57+
#Uncomment the next print line to print the number of words here.
58+
#We will print all the information asked at the end too.
59+
60+
#print('The amount of words in first_paragraph is: ', words)
61+
62+
63+
# Part 2: How many sentences are in the first paragraph?
64+
65+
#We want to keep first_paragraph unchanged, so we work on a copy. In this
66+
#case the copy will have '?' replaced by '.' and '\n' replaced by ''.
67+
68+
#First replace ? by .
69+
sentence_paragraph = first_paragraph.replace('?', '.')
70+
#Now replace \n by ''
71+
sentence_paragraph = sentence_paragraph.replace('\n', '')
72+
73+
#Split the paragraph into sentences
74+
sentence_list = sentence_paragraph.split('.')
75+
76+
#Uncomment to print what our sentence_list looks like.
77+
#print(sentence_list)
78+
79+
#Let's remove the '' elements
80+
sentence_list = list(filter(lambda item: item!='' , sentence_list))
81+
82+
#Now sentence_list contains only the sentences, so let's use len() to count
83+
# them.
84+
sentences = len(sentence_list)
85+
86+
#Uncomment next print line to print here the amount of sentences.
87+
#We will print all the information asked at the end too.
88+
89+
#print('The amount of sentences in first_paragraph is: ', sentences)
90+
91+
#Part 3: How many times does the word 'Obama' appear in the first paragraph?
92+
#The following is a comment
93+
'''
94+
If you read the paragraph you might have noticed that in some places the word
95+
Obama appears as "Obama's". We want to count that case; therefore, we will look
96+
for the string 'Obama' in each word of words_list. If the string is in the
97+
word we will add 1 to the variable obama_count (initialized to 0).
98+
'''
99+
#code
100+
obama_count = 0
101+
for word in words_list:
102+
    if 'Obama' in word:
103+
        obama_count += 1
104+
105+
#Uncomment the next print line to print the number of times 'Obama' appears here.
106+
#We will print all the information asked at the end too.
107+
108+
#print('The word Obama in first_paragraph appears {} times.'.format(obama_count))
109+
110+
# Part 4: If you remove all of the punctuation and lower case all text, how many words?
111+
112+
#Lower case the whole paragraph and save it into lower_paragraph
113+
lower_paragraph = first_paragraph.lower()
114+
115+
#Remove punctuation
116+
117+
#First we import the string constants available in python.
118+
import string
119+
120+
#Let's print the string punctuation (uncomment next line to print).
121+
#print(string.punctuation)
122+
123+
#Loop over the characters of string.punctuation and replace each one that appears
124+
#in our lower_paragraph with a space.
125+
for character in string.punctuation:
126+
    lower_paragraph = lower_paragraph.replace(character, ' ')
127+
128+
#If you print the new lower_paragraph you'll notice that we missed some characters.
129+
#We will remove them too; for a detailed explanation go to the jupyter notebook link
130+
# located in the header at the top of this script.
131+
132+
more_punct = '’‘“”—\n'
133+
for character in more_punct:
134+
    lower_paragraph = lower_paragraph.replace(character, ' ')
135+
136+
#With all the possible punctuation removed, now let's count words.
137+
no_punctuation_list = lower_paragraph.split()
138+
words_no_punctuation = len(no_punctuation_list)
139+
140+
#PRINTING ALL THE INFO REQUIRED TOGETHER:
141+
142+
print('The amount of words in first_paragraph is: ', words)
143+
print('The amount of sentences in first_paragraph is: ', sentences)
144+
print('The word Obama in first_paragraph appears {} times.'.format(obama_count))
145+
print('The amount of words in the paragraph with no punctuation is: ', words_no_punctuation)
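
The punctuation removal in Part 4 can also be expressed with a translation table. This is a minimal sketch under stated assumptions, not the method used in this commit: `sample` is a made-up sentence rather than the article paragraph, and `count_words` is a hypothetical helper name.

```python
import string

# Made-up sample text; it contains the typographic characters that
# string.punctuation does not cover.
sample = "Obama’s smile—“classic!”—was bright. None of us knew."

# ASCII punctuation plus the curly quotes and the em dash.
all_punct = string.punctuation + '’‘“”—'

# Map every punctuation character to a space in a single pass.
table = str.maketrans(all_punct, ' ' * len(all_punct))

def count_words(text):
    """Lower-case the text, blank out punctuation, and count the remaining words."""
    return len(text.lower().translate(table).split())

word_total = count_words(sample)

# Count the words that contain 'Obama', so "Obama’s" is included.
obama_total = sum('Obama' in word for word in sample.split())

print(word_total, obama_total)
```

Replacing punctuation with a space rather than deleting it means a possessive such as "Obama’s" becomes two words, which matches the behaviour of the replace loop in the solution.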

2017-02/18/exercise3_solution.py

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
1+
"""
2+
Introduction to Python
3+
DATE - 18 February 2017
4+
Instructor - Nathan Danielsen @nate_somewhere
5+
https://github.com/ndanielsen/beginning-python/
6+
"""
7+
8+
"""
9+
## Exercise 3: Advanced
10+
11+
1. Open the article_part_one.txt with python.
12+
2. How many words are in part one?
13+
3. How many sentences are in part one?
14+
4. Which words follow 'black' and 'white' in the text? Which ones are used the most for each?
15+
16+
*Hint*: Google open file with python
17+
18+
Note: For a guided and step by step solution explanation in jupyter notebooks go to:
19+
https://github.com/ncclementi/dcpython_exercises/blob/master/02-18-17_application_real_data.ipynb
20+
21+
You will also find useful links and some extra tips to try in your script.
22+
23+
In this case, because it was the advanced exercise, I introduce new things that
24+
we didn't see in class. Hope you have fun!
25+
"""
26+
27+
#Open the file:
28+
29+
with open('article_part_one.txt', 'r') as file:
30+
    article = file.read()
31+
32+
#Uncomment following line to print article.
33+
#print(article)
34+
35+
36+
## How many words are in part one?
37+
38+
#First remove all punctuation to make the count easier.
39+
40+
import string
41+
42+
punctuation = string.punctuation
43+
#Some extra symbols that are in the text but don't appear in string.punctuation.
44+
#You can create them using unicode (check the notebook for this).
45+
extra_punct = '’‘“”—\n'
46+
47+
all_punct = punctuation + extra_punct
48+
49+
#Now let's remove this from the text.
50+
51+
#We will modify article_no_punct, but we want to keep article intact, so we work on a copy.
52+
article_no_punct = article
53+
54+
for char in all_punct:
55+
    if char in article_no_punct:
56+
        article_no_punct = article_no_punct.replace(char, ' ')
57+
58+
#Split and count words
59+
words_list = article_no_punct.split()
60+
words_total = len(words_list)
61+
62+
#Uncomment this to print at this point
63+
#print('The total amount of words is: {}'.format(words_total))
64+
65+
# For extra tips on counting how many times each word appears, check the notebook.
66+
67+
68+
## How many sentences are in part one?
69+
70+
#We will modify article_sentences, but we want to keep article intact, so we work on a copy.
71+
article_sentences = article
72+
73+
#If you take a look at the article you might have noticed that the header
74+
#sentences have no '.' but there are line breaks there, so we will replace
75+
#the '\n' with periods and then split.
76+
77+
article_sentences = article_sentences.replace('\n','.')
78+
79+
#Split on periods to count sentences
80+
sentences_article_list = article_sentences.split('.')
81+
82+
83+
#Let's omit the '' elements and the "sentences" that are only a few characters
84+
# long because they were part of an abbreviation. For example '— F' or '"'.
85+
#These elements are never longer than 3 characters, so let's keep only
86+
# the elements that are longer than that.
87+
88+
list_clean = list(filter(lambda item: len(item)>3 , sentences_article_list))
89+
90+
sentence_total = len(list_clean)
91+
92+
#Uncomment this to print at this point
93+
#print('The total amount of sentences is: {}'.format(sentence_total))
94+
95+
96+
## Which words follow 'black' and 'white' in the text?
97+
## Which ones are used the most for each?
98+
99+
# Here we will use some new things; go to the notebook for links and explanations.
100+
101+
#Set all lower case so we just look for white/black.
102+
103+
words_lower = []
104+
for word in words_list:
105+
    words_lower.append(word.lower())
106+
107+
#We want to know where in the list the words white and black appear. We look
108+
# for those indices as follows, skipping the last word because nothing follows it.
109+
110+
indx_white = [i for i, x in enumerate(words_lower[:-1]) if x == "white"]
111+
indx_black = [i for i, x in enumerate(words_lower[:-1]) if x == "black"]
112+
113+
#Looking for the words that follow white
114+
lst_white =[]
115+
for i in indx_white:
116+
    lst_white.append(words_lower[i+1])
117+
118+
#Looking for the words that follow black
119+
lst_black = []
120+
for i in indx_black:
121+
    lst_black.append(words_lower[i+1])
122+
123+
#Let's count for each list the repetitions of each word using dictionaries
124+
#and the get method.
125+
follows_white = {}
126+
for word in lst_white:
127+
    follows_white[word] = follows_white.get(word, 0) + 1
128+
129+
follows_black = {}
130+
for word in lst_black:
131+
    follows_black[word] = follows_black.get(word, 0) + 1
132+
133+
#Uncomment the following 2 lines to see what the follows_white and follows_black
134+
#dictionaries look like.
135+
136+
#print(follows_white)
137+
#print(follows_black)
138+
139+
#Let's get the word in each dictionary that has the biggest value.
140+
141+
most_follow_white = max(follows_white, key=follows_white.get)
142+
most_follow_black = max(follows_black, key=follows_black.get)
143+
144+
#Uncomment the next two lines to print the last answer
145+
#print("The most used word that follows 'white' is: ",most_follow_white)
146+
#print("The most used word that follows 'black' is: ",most_follow_black)
147+
148+
149+
## PRINTING ALL THE ANSWERS:
150+
151+
print('The total amount of words is: {}'.format(words_total))
152+
print('The total amount of sentences is: {}'.format(sentence_total))
153+
print("The most used word that follows 'white' is: ",most_follow_white)
154+
print("The most used word that follows 'black' is: ",most_follow_black)
155+
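
The "which words follow 'black' and 'white'" step can be sketched more compactly with `collections.Counter` and `zip`. This is an illustrative alternative under stated assumptions, not the approach taken in this commit: `followers` is a hypothetical helper name and the word list is made up rather than read from article_part_one.txt.

```python
from collections import Counter

def followers(words, target):
    """Count the words that come immediately after each occurrence of target."""
    # zip pairs every word with its successor and stops before the last word,
    # so a target at the very end of the list is skipped instead of raising IndexError.
    return Counter(nxt for cur, nxt in zip(words, words[1:]) if cur == target)

# Made-up, already lower-cased word list that ends with the target word.
words = "the white house and the white house near the black gate and the white".split()

follows_white = followers(words, "white")
follows_black = followers(words, "black")

# most_common(1) returns a list with the top (word, count) pair, or [] if there is none.
print(follows_white.most_common(1))
print(follows_black.most_common(1))
```

Unlike `max()` on an empty dictionary, `most_common(1)` returns an empty list when the target word never appears, so the case of a missing word does not raise an error.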
