This repository was archived by the owner on Nov 30, 2022. It is now read-only.

Commit 63360f9

namrun committed
Script to download Medium articles added
1 parent 6d59af5 commit 63360f9

1 file changed: +36 -0 lines changed

@@ -0,0 +1,36 @@
+#!/usr/bin/env python3
+
+#Imports and dependencies
+
+import requests
+from bs4 import BeautifulSoup
+
+#The content is written into a text file
+
+file = open("Medium_article_content.txt", "w")
+
+#The URL of the article is entered here
+page_url = input("Enter the URL of the Medium Article ")
+
+#Based on the response got from the URL, the content is loaded into response
+
+response = requests.get(page_url)
+
+#Beautiful soup is a library used for web scraping and parsing the contents of a web page
+#Here a html parser is used to parse through the content embedded in the html tags
+
+soup = BeautifulSoup(response.text,"html.parser")
+
+#The content of the article is stored in the <article> tag
+
+for line in soup.find('article').find('div'):
+
+    #All the content is essentially stored between <p> tags
+
+    for content in line.find_all('p'):
+
+        #contents are written into a file
+
+        file.write(content.text + '\n')
+
+file.close()
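
The extraction step in this script could be made somewhat more defensive. The following is a minimal sketch, not the committed code. The helper names `extract_paragraphs` and `save_paragraphs` and the sample markup are hypothetical. It assumes the article body is the set of `<p>` elements anywhere inside the first `<article>`, which is broader than the committed loop over the children of the first `<div>`. Iterating over a tag yields its direct children, which may include plain text nodes, and calling `find_all` on one of those would likely fail; searching from the `<article>` tag itself avoids that. The HTML string would come from `response.text` as in the script.

```python
from bs4 import BeautifulSoup


def extract_paragraphs(html: str) -> list[str]:
    """Return the text of every <p> inside the first <article>, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    article = soup.find("article")
    if article is None:
        # No <article> element: return nothing instead of raising AttributeError.
        return []
    # find_all searches all descendants, so no manual loop over child nodes is needed.
    return [p.get_text() for p in article.find_all("p")]


def save_paragraphs(paragraphs: list[str], path: str) -> None:
    """Write one paragraph per line; the with-block closes the file even on error."""
    with open(path, "w", encoding="utf-8") as out:
        for text in paragraphs:
            out.write(text + "\n")


# Hypothetical sample markup standing in for response.text.
sample_html = """
<html><body>
<p>Outside the article</p>
<article>
  <div>
    <section><p>First paragraph.</p></section>
    <section><p>Second <em>paragraph</em>.</p></section>
  </div>
</article>
</body></html>
"""

paragraphs = extract_paragraphs(sample_html)
print(paragraphs)
```

Keeping parsing separate from downloading and file output means the extraction rule can be checked against a small HTML string without a network request. In the script, `save_paragraphs(paragraphs, "Medium_article_content.txt")` would take the place of the explicit `open` and `close` calls.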

0 commit comments
