| 1 | +""" |
| 2 | +Introduction to Python |
| 3 | +DATE - 18 February 2017 |
| 4 | +Instructor - Nathan Danielsen @nate_somewhere |
| 5 | +https://github.com/ndanielsen/beginning-python/ |
| 6 | +""" |
| 7 | + |
| 8 | +""" |
| 9 | +Exercise 1: |
| 10 | + |
| 11 | +Using the first paragraph: |
| 12 | +1. How many words are in the first paragraph? |
| 13 | +2. How many sentences are in the first paragraph? |
| 14 | +3. How many times does the word 'Obama' appear in the first paragraph? |
| 15 | +4. If you remove all of the punctuation and lowercase all text, how many words are there? |
| 16 | + |
| 17 | + |
| 18 | +Note: For a guided, step-by-step explanation of the solution in a Jupyter notebook, go to: |
| 19 | +https://github.com/ncclementi/dcpython_exercises/blob/master/02-18-17_application_real_data.ipynb |
| 20 | + |
| 21 | +You will also find useful links and some extra tips to try in your script. |
| 22 | +""" |
| 23 | + |
| 24 | +first_paragraph = """ |
| 25 | +In the waning days of President Barack Obama’s administration, he and his wife, Michelle, hosted a farewell party, the full import of which no one could then grasp. It was late October, Friday the 21st, and the president had spent many of the previous weeks, as he would spend the two subsequent weeks, campaigning for the Democratic presidential nominee, Hillary Clinton. Things were looking up. Polls in the crucial states of Virginia and Pennsylvania showed Clinton with solid advantages. The formidable GOP strongholds of Georgia and Texas were said to be under threat. The moment seemed to buoy Obama. He had been light on his feet in these last few weeks, cracking jokes at the expense of Republican opponents and laughing off hecklers. At a rally in Orlando on October 28, he greeted a student who would be introducing him by dancing toward her and then noting that the song playing over the loudspeakers—the Gap Band’s “Outstanding”—was older than she was. “This is classic!” he said. Then he flashed the smile that had launched America’s first black presidency, and started dancing again. Three months still remained before Inauguration Day, but staffers had already begun to count down the days. They did this with a mix of pride and longing—like college seniors in early May. They had no sense of the world they were graduating into. None of us did. |
| 26 | +""" |
| 27 | + |
| 28 | +#Uncomment the following line to see the output of repr(first_paragraph) |
| 29 | +#print(repr(first_paragraph)) |
| 30 | + |
| 31 | +# Part 1: How many words are in the first paragraph? |
| 32 | + |
| 33 | +#Splitting and saving to a list. |
| 34 | +paragraph_list = first_paragraph.split() |
| 35 | + |
| 36 | +#Uncomment to see what our paragraph_list looks like |
| 37 | +#print(paragraph_list) |
| 38 | + |
| 39 | + |
| 40 | +#We want to keep first_paragraph without changes, so we create a copy and work with that. |
| 41 | +revised_paragraph = first_paragraph |
| 42 | + |
| 43 | +#Replace em dashes (Unicode U+2014; on Linux type Ctrl+Shift+u, then 2014) with a space |
| 44 | +for element in revised_paragraph: |
| 45 | +    if element == '—': |
| 46 | +        revised_paragraph = revised_paragraph.replace(element, ' ') |
| 47 | + |
| 48 | +#Uncomment to see what our revised_paragraph looks like. |
| 49 | +#print(revised_paragraph) |
| 50 | + |
| 51 | +#Split into words and save to list. |
| 52 | +words_list = revised_paragraph.split() |
| 53 | + |
| 54 | +#Count words using len() |
| 55 | +words = len(words_list) |
| 56 | + |
| 57 | +#Uncomment the next print line to print the number of words here. |
| 58 | +#We will also print all the requested information at the end. |
| 59 | + |
| 60 | +#print('The amount of words in first_paragraph is: ', words) |
| 61 | + |
| 62 | + |
| 63 | +# Part 2: How many sentences are in the first paragraph? |
| 64 | + |
| 65 | +#We want to keep first_paragraph without changes, so we create a copy; in this |
| 66 | +#case the copy will have ? replaced by . and \n replaced by '' . |
| 67 | + |
| 68 | +#First replace ? by . |
| 69 | +sentence_paragraph = first_paragraph.replace('?', '.') |
| 70 | +#Now replace \n by '' |
| 71 | +sentence_paragraph = sentence_paragraph.replace('\n', '') |
| 72 | + |
| 73 | +#Split the paragraph into sentences |
| 74 | +sentence_list = sentence_paragraph.split('.') |
| 75 | + |
| 76 | +#Uncomment to see what our sentence_list looks like |
| 77 | +#print(sentence_list) |
| 78 | + |
| 79 | +#Let's remove the '' elements |
| 80 | +sentence_list = list(filter(lambda item: item!='' , sentence_list)) |
| 81 | + |
| 82 | +#Now our sentence_list contains only the sentences, so let's use len() to count |
| 83 | +# the number of sentences. |
| 84 | +sentences = len(sentence_list) |
| 85 | + |
| 86 | +#Uncomment next print line to print here the amount of sentences. |
| 87 | +#We will print all the information asked at the end too. |
| 88 | + |
| 89 | +#print('The amount of sentences in first_paragraph is: ', sentences) |
| 90 | + |
| 91 | +#Part 3 : How many times does the word 'Obama' appear in the first paragraph? |
| 92 | +#The following triple-quoted string serves as a multi-line comment |
| 93 | +''' |
| 94 | +If you read the paragraph you might have noticed that in some parts the word |
| 95 | +Obama appears as "Obama's". We want to count that case; therefore, we will look |
| 96 | +for the string 'Obama' in each word of words_list. If the string is in the |
| 97 | +word we will add 1 to the variable obama_count (initialized to 0). |
| 98 | +''' |
| 99 | +#code |
| 100 | +obama_count = 0 |
| 101 | +for word in words_list: |
| 102 | +    if 'Obama' in word: |
| 103 | +        obama_count += 1 |
| 104 | + |
| 105 | +#Uncomment the next print line to print the number of times 'Obama' appears here. |
| 106 | +#We will also print all the requested information at the end. |
| 107 | + |
| 108 | +#print('The word Obama in first_paragraph appears {} times.'.format(obama_count)) |
| 109 | + |
| 110 | +# Part 4: If you remove all of the punctuation and lower case all text, how many words? |
| 111 | + |
| 112 | +#Lowercase the whole paragraph and save it into lower_paragraph |
| 113 | +lower_paragraph = first_paragraph.lower() |
| 114 | + |
| 115 | +#Remove punctuation |
| 116 | + |
| 117 | +#First we import the string constants available in python. |
| 118 | +import string |
| 119 | + |
| 120 | +#Let's print the string punctuation (uncomment next line to print). |
| 121 | +#print(string.punctuation) |
| 122 | + |
| 123 | +#Loop over the characters of string.punctuation and replace each one that appears |
| 124 | +#in our lower_paragraph with a space. |
| 125 | +for character in string.punctuation: |
| 126 | +    lower_paragraph = lower_paragraph.replace(character, ' ') |
| 127 | + |
| 128 | +#If you print the new lower_paragraph you'll notice that we missed some characters. |
| 129 | +#We will remove them too; for a detailed explanation go to the Jupyter notebook link |
| 130 | +# located in the header at the top of this script. |
| 131 | + |
| 132 | +more_punct = '’‘“”—\n' |
| 133 | +for character in more_punct: |
| 134 | +    lower_paragraph = lower_paragraph.replace(character, ' ') |
| 135 | + |
| 136 | +#With all the possible punctuation removed, now let's count words. |
| 137 | +no_punctuation_list = lower_paragraph.split() |
| 138 | +words_no_punctuation = len(no_punctuation_list) |
| 139 | + |
| 140 | +#PRINTING ALL THE INFO REQUIRED TOGETHER: |
| 141 | + |
| 142 | +print('The amount of words in first_paragraph is: ', words) |
| 143 | +print('The amount of sentences in first_paragraph is: ', sentences) |
| 144 | +print('The word Obama in first_paragraph appears {} times.'.format(obama_count)) |
| 145 | +print('The amount of words in the paragraph with no punctuation is: ', words_no_punctuation) |
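
The four counts computed in the script can also be gathered in one helper. The following is a minimal sketch under stated assumptions, not the instructor's solution: `paragraph_stats` and `sample` are hypothetical names, Python 3 is assumed, and the per-character `replace` loops are swapped for `str.translate`, with `re.split` used for sentence splitting. Like the script, it treats only `.` and `?` as sentence endings.

```python
import re
import string


def paragraph_stats(text, target="Obama"):
    """Return (words, sentences, target count, words without punctuation)."""
    # Part 1: treat em dashes as word separators, then split on whitespace.
    words = text.replace("\u2014", " ").split()

    # Part 2: split on '.' or '?' and drop empty or whitespace-only pieces.
    sentences = [s for s in re.split(r"[.?]+", text) if s.strip()]

    # Part 3: count words that contain the target, so "Obama's" also matches.
    target_count = sum(target in word for word in words)

    # Part 4: map ASCII punctuation, curly quotes, and the em dash to spaces
    # (hypothetical helper table, mirroring string.punctuation + more_punct).
    extra = "\u2019\u2018\u201c\u201d\u2014"
    table = str.maketrans({ch: " " for ch in string.punctuation + extra})
    no_punct_words = text.lower().translate(table).split()

    return len(words), len(sentences), target_count, len(no_punct_words)


# Small made-up sample, not the paragraph from the exercise.
sample = "Obama\u2019s team waited\u2014patiently. Obama smiled. Why not?"
print(paragraph_stats(sample))  # (8, 3, 2, 9)
```

Calling `paragraph_stats(first_paragraph)` is expected to give the same four numbers as the script, since it applies the same separators. Note that removing the curly apostrophe splits "Obama’s" into two words, which is why the punctuation-free count can be higher than the plain word count.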