Merge pull request #12 from ncclementi/master

ndanielsen · web-flow · commit 82389b998d57 · 2018-07-08T15:08:43.000-07:00
Solutions of exercises 1, 2, 3 class 02/18/17
diff --git a/2017-02/18/exercise1_solution.py b/2017-02/18/exercise1_solution.py
@@ -0,0 +1,64 @@
+"""
+Introduction to Python
+DATE - 18 February 2017
+Instructor - Nathan Danielsen @nate_somewhere
+https://github.com/ndanielsen/beginning-python/
+"""
+
+"""
+Exercise 1:
+
+From the following header, extract:
+1. The title
+2. The introduction
+3. The author
+4. The photographer
+
+*Hint*: Google something called repr and check what `print(repr(header))`
+tells you.
+
+Note: For a guided, step-by-step explanation of the solution in a Jupyter notebook, go to:
+https://github.com/ncclementi/dcpython_exercises/blob/master/02-18-17_application_real_data.ipynb
+
+You will also find useful links and some extra tips to try in your script.
+"""
+
+header = """
+    My President Was Black
+    A history of the first African American White House—and of what came next
+
+    By Ta-Nehisi Coates
+    Photograph by Ian Allen
+    """
+
+#Using the hint:
+#Uncomment the following line to see the output of repr(header)
+
+#print(repr(header))
+
+#Splitting and saving to a list.
+header_list = header.split('\n')
+
+#Removing extra white spaces in each element of our list.
+for i in range(len(header_list)):
+    header_list[i] = header_list[i].strip()
+
+#Removing empty ('') elements.
+header_list = list(filter(lambda item: item != '', header_list))
+
+#Getting title and introduction.
+title = header_list[0]
+intro = header_list[1]
+
+#Getting author and photographer. strip('By ') would remove any of the
+#characters 'B', 'y', ' ' from both ends, so slice off the prefix instead.
+author = header_list[2][len('By '):]
+photographer = header_list[3][len('Photograph by '):]
+
+#Print information required
+print('Title       : ', title)
+print('Introduction: ', intro)
+print('Author      : ', author)
+print('Photographer: ', photographer)
+
+
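Note on the prefix removal in exercise1_solution.py: the author and photographer are obtained by dropping a known prefix ('By ', 'Photograph by '). It is worth checking in isolation that `str.strip` with an argument treats it as a set of characters to remove from both ends, not as a prefix. The sketch below is not part of this commit; `remove_prefix` is a hypothetical helper name and 'By Bobby' is a made-up input chosen to show the difference.

```python
def remove_prefix(text, prefix):
    """Return text without prefix if it starts with it, else unchanged."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

# strip('By ') removes any of 'B', 'y', ' ' from both ends of the string.
stripped = 'By Bobby'.strip('By ')
# Slicing by the prefix length removes exactly the prefix and nothing else.
sliced = remove_prefix('By Bobby', 'By ')

print(repr(stripped))
print(repr(sliced))
```

With the header used in the exercise both approaches give the same names, because neither name starts or ends with a character from the prefix.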
diff --git a/2017-02/18/exercise2_solution.py b/2017-02/18/exercise2_solution.py
@@ -0,0 +1,145 @@
+"""
+Introduction to Python
+DATE - 18 February 2017
+Instructor - Nathan Danielsen @nate_somewhere
+https://github.com/ndanielsen/beginning-python/
+"""
+
+"""
+Exercise 2:
+
+Using the first paragraph:
+1. How many words are in the first paragraph?
+2. How many sentences are in the first paragraph?
+3. How many times does the word 'Obama' appear in the first paragraph?
+4. If you remove all of the punctuation and lower case all text, how many words?
+
+
+Note: For a guided, step-by-step explanation of the solution in a Jupyter notebook, go to:
+https://github.com/ncclementi/dcpython_exercises/blob/master/02-18-17_application_real_data.ipynb
+
+You will also find useful links and some extra tips to try in your script.
+"""
+
+first_paragraph = """
+In the waning days of President Barack Obama’s administration, he and his wife, Michelle, hosted a farewell party, the full import of which no one could then grasp. It was late October, Friday the 21st, and the president had spent many of the previous weeks, as he would spend the two subsequent weeks, campaigning for the Democratic presidential nominee, Hillary Clinton. Things were looking up. Polls in the crucial states of Virginia and Pennsylvania showed Clinton with solid advantages. The formidable GOP strongholds of Georgia and Texas were said to be under threat. The moment seemed to buoy Obama. He had been light on his feet in these last few weeks, cracking jokes at the expense of Republican opponents and laughing off hecklers. At a rally in Orlando on October 28, he greeted a student who would be introducing him by dancing toward her and then noting that the song playing over the loudspeakers—the Gap Band’s “Outstanding”—was older than she was. “This is classic!” he said. Then he flashed the smile that had launched America’s first black presidency, and started dancing again. Three months still remained before Inauguration Day, but staffers had already begun to count down the days. They did this with a mix of pride and longing—like college seniors in early May. They had no sense of the world they were graduating into. None of us did.
+"""
+
+#Uncomment the following line to see the output of repr(first_paragraph)
+#print(repr(first_paragraph))
+
+# Part 1: How many words are in the first paragraph?
+
+#Splitting and saving to a list.
+paragraph_list = first_paragraph.split()
+
+#Uncomment to see what our paragraph_list looks like.
+#print(paragraph_list)
+
+
+#We want to keep first_paragraph without changes, so we create a copy and work with that.
+revised_paragraph = first_paragraph
+
+#Remove "em dashes" (unicode: (Ctrl+Shift+u)+2014).
+#str.replace swaps every occurrence at once, so no loop over characters is
+#needed; it returns a new string, so first_paragraph is left untouched.
+revised_paragraph = revised_paragraph.replace('—', ' ')
+
+#Uncomment to see what revised_paragraph looks like.
+#print(revised_paragraph)
+
+#Split into words and save to list.
+words_list = revised_paragraph.split()
+
+#Count words using len()
+words = len(words_list)
+
+#Uncomment the next print line to print the number of words here.
+#We will also print all the requested information at the end.
+
+#print('The amount of words in first_paragraph is: ', words)
+
+
+# Part 2: How many sentences are in the first paragraph?
+
+#We want to keep first_paragraph without changes, so we work on a copy in which
+#? is replaced by . and \n by ''.
+
+#First replace ? by .
+sentence_paragraph = first_paragraph.replace('?', '.')
+#Now replace \n by ''
+sentence_paragraph = sentence_paragraph.replace('\n', '')
+
+#Split the paragraph into sentences
+sentence_list = sentence_paragraph.split('.')
+
+#Uncomment to see what our sentence_list looks like.
+#print(sentence_list)
+
+#Let's remove the '' elements
+sentence_list = list(filter(lambda item: item!='' , sentence_list))
+
+#Now sentence_list contains just the sentences, so let's use len() to count
+# the number of sentences.
+sentences = len(sentence_list)
+
+#Uncomment the next print line to print the number of sentences here.
+#We will also print all the requested information at the end.
+
+#print('The amount of sentences in first_paragraph is: ', sentences)
+
+#Part 3: How many times does the word 'Obama' appear in the first paragraph?
+#The following is a comment
+'''
+If you read the paragraph you might have noticed that in some places the word
+Obama appears as "Obama's". We want to count that case; therefore, we will look
+for the string 'Obama' in each word of words_list. If the string is in the
+word we will add 1 to the variable obama_count (initialized to 0).
+'''
+#code
+obama_count = 0
+for word in words_list:
+    if 'Obama' in word:
+        obama_count +=1   
+
+#Uncomment the next print line to print the number of times 'Obama' appears.
+#We will also print all the requested information at the end.
+
+#print('The word Obama in first_paragraph appears {} times.'.format(obama_count))
+
+# Part 4: If you remove all of the punctuation and lower case all text, how many words?
+
+#Lower case the whole paragraph and save it into lower_paragraph
+lower_paragraph = first_paragraph.lower()
+
+#Remove punctuation
+
+#First we import the string constants available in python.  
+import string
+
+#Let's print the string punctuation (uncomment next line to print).
+#print(string.punctuation)
+
+#Loop over the characters of string.punctuation and replace each one that appears
+#in our lower_paragraph with a space.
+for character in string.punctuation:
+    lower_paragraph = lower_paragraph.replace(character, ' ')
+
+#If you print the new lower_paragraph you'll notice that we missed some characters.
+#We will remove them too; for a detailed explanation go to the jupyter notebook link
+# located in the header at the top of this script.
+
+more_punct = '’‘“”—\n'
+for character in more_punct:
+    lower_paragraph = lower_paragraph.replace(character, ' ')
+
+#With all the possible punctuation removed, now let's count words.
+no_punctuation_list = lower_paragraph.split()
+words_no_punctuation = len(no_punctuation_list)
+
+#PRINTING ALL THE INFO REQUIRED TOGETHER:
+
+print('The amount of words in first_paragraph is: ', words)
+print('The amount of sentences in first_paragraph is: ', sentences)
+print('The word Obama in first_paragraph appears {} times.'.format(obama_count))
+print('The amount of words in the paragraph with no punctuation is: ', words_no_punctuation)  
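Note on Part 4 of exercise2_solution.py: the script replaces punctuation one character at a time with `str.replace`. A possible alternative, sketched here and not part of this commit, is to build one translation table with `str.maketrans` and apply it in a single pass. `count_words_no_punct` is a hypothetical name, and `sample` is a shortened fragment adapted from the paragraph, used only for illustration.

```python
import string

def count_words_no_punct(text, extra_punct='’‘“”—'):
    """Lower-case text, turn punctuation into spaces, and count the words."""
    # Map every punctuation character (ASCII plus the extra ones) to a space.
    table = str.maketrans({ch: ' ' for ch in string.punctuation + extra_punct})
    # split() with no argument also discards newlines and repeated spaces.
    return len(text.lower().translate(table).split())

sample = 'The Gap Band’s “Outstanding”—was older. “This is classic!” he said.'
print(count_words_no_punct(sample))
```

As in the script, replacing the apostrophe with a space turns a possessive such as "Band’s" into two tokens, so the count depends on that choice.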
diff --git a/2017-02/18/exercise3_solution.py b/2017-02/18/exercise3_solution.py
@@ -0,0 +1,155 @@
+"""
+Introduction to Python
+DATE - 18 February 2017
+Instructor - Nathan Danielsen @nate_somewhere
+https://github.com/ndanielsen/beginning-python/
+"""
+
+"""
+## Exercise 3: Advanced
+
+1. Open the article_part_one.txt with python.
+2. How many words are in part one?
+3. How many sentences are in part one?
+4. Which words follow 'black' and 'white' in the text? Which ones are used the most for each?
+
+*Hint*: Google open file with python
+
+Note: For a guided, step-by-step explanation of the solution in a Jupyter notebook, go to:
+https://github.com/ncclementi/dcpython_exercises/blob/master/02-18-17_application_real_data.ipynb
+
+You will also find useful links and some extra tips to try in your script.
+
+In this case, because it was the advanced exercise, I introduce new things that
+we didn't see in class. Hope you have fun!
+"""
+
+#Open the file:
+
+with open('article_part_one.txt', 'r') as file:
+    article = file.read()
+
+#Uncomment following line to print article.
+#print(article)
+
+
+## How many words are in part one?
+
+#First remove all punctuation to make the count easier. 
+
+import string
+
+punctuation = string.punctuation
+#Some extra symbols that are in the text but don't appear in string.punctuation.
+#You can create them using unicode (check the notebook for this).
+extra_punct = '’‘“”—\n'
+
+all_punct = punctuation + extra_punct
+
+#Now let's remove this from the text.
+
+#We will modify article_no_punct but want to keep article intact, so:
+article_no_punct = article
+
+for char in all_punct:
+    if char in article_no_punct:
+        article_no_punct = article_no_punct.replace(char, ' ') 
+
+#Split and count words
+words_list = article_no_punct.split()
+words_total = len(words_list)
+
+#Uncomment this to print at this point
+#print('The total amount of words is: {}'.format(words_total))
+
+# For extra tips on counting how many times each word appears, check the notebook.
+
+
+## How many sentences are in part one?
+
+#We will modify article_sentences but want to keep article intact, so:
+article_sentences = article
+
+#If you take a look at the article you might have noticed that the header
+#sentences have no '.' but are separated by line breaks, so we will replace
+#the '\n' with periods and then split.
+
+article_sentences = article_sentences.replace('\n','.')
+
+#Split on periods to count sentences
+sentences_article_list = article_sentences.split('.')
+
+
+#Let's omit the '' and the elements that are not real sentences but short
+# leftovers, such as part of an abbreviation. For example '— F' or '"'.
+#These elements have the particularity that their length is never greater
+# than 3, so let's keep only the longer ones.
+
+list_clean = list(filter(lambda item: len(item)>3 , sentences_article_list))
+
+sentence_total = len(list_clean)
+
+#Uncomment this to print at this point
+#print('The total amount of sentences is: {}'.format(sentence_total))
+
+
+## Which words follow 'black' and 'white' in the text? 
+## Which ones are used the most for each?
+
+# Here we will use some new things; go to the notebook for links and explanations.
+
+#Set all lower case so we just look for white/black.
+
+words_lower = []
+for word in words_list:
+    words_lower.append(word.lower())
+
+#We want to know where in the list the word white appears and where the word
+# black appears. We look for those indices doing the following.
+
+indx_white  = [i for i, x in enumerate(words_lower) if x == "white"]
+indx_black  = [i for i, x in enumerate(words_lower) if x == "black"]
+
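+#Looking for the words that follow white (skip a match on the very last
+#word of the text, which has nothing after it and would raise IndexError).
+lst_white = [words_lower[i+1] for i in indx_white
+             if i+1 < len(words_lower)]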
+
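+#Looking for the words that follow black (skip a match on the very last
+#word of the text, which has nothing after it and would raise IndexError).
+lst_black = [words_lower[i+1] for i in indx_black
+             if i+1 < len(words_lower)]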
+
+#Let's count for each list the repetitions of each word using dictionaries 
+#and the get method. 
+follows_white = {}
+for word in lst_white:
+    follows_white[word] = follows_white.get(word, 0) + 1
+
+follows_black = {}
+for word in lst_black:
+    follows_black[word] = follows_black.get(word, 0) + 1
+
+#Uncomment the following 2 lines to see what the follows_white and follows_black
+#dictionaries look like.
+
+#print(follows_white)
+#print(follows_black)
+
+#Let's get the word in each dictionary that has the biggest value. 
+
+most_follow_white = max(follows_white, key=follows_white.get)
+most_follow_black = max(follows_black, key=follows_black.get)
+
+#Uncomment the next two lines to print the last answer
+#print("The most used word that follows 'white' is: ",most_follow_white)
+#print("The most used word that follows 'black' is: ",most_follow_black)
+
+
+## PRINTING ALL THE ANSWERS:
+
+print('The total amount of words is: {}'.format(words_total))
+print('The total amount of sentences is: {}'.format(sentence_total))
+print("The most used word that follows 'white' is: ",most_follow_white)
+print("The most used word that follows 'black' is: ",most_follow_black)
+
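Note on the last part of exercise3_solution.py: the script collects indices, looks up the following word, and counts repetitions with a dictionary and `get`. A more compact variant, sketched here and not part of this commit, pairs each word with its successor using `zip` and counts with `collections.Counter`. `words_following` is a hypothetical helper name and the word list is a made-up sample, not the article.

```python
from collections import Counter

def words_following(words, target):
    """Count the words that come immediately after target in words."""
    # zip pairs each word with its successor and stops before the last word,
    # so a target in the final position is skipped without an IndexError.
    return Counter(nxt for cur, nxt in zip(words, words[1:]) if cur == target)

words = 'the white house and the white house staff saw a black car'.split()
follows_white = words_following(words, 'white')
follows_black = words_following(words, 'black')

# most_common(1) returns a list with the top (word, count) pair, or an empty
# list when the target never appears with a following word.
print(follows_white.most_common(1))
print(follows_black.most_common(1))
```

Unlike `max` over a dictionary, `most_common(1)` does not raise when nothing was found, which may be convenient if the target word is absent from the text.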