This repository was archived by the owner on Nov 30, 2022. It is now read-only.

Commit 4be0c24

Merge pull request #261 from rutujadhanawade/times_of_india
Times of India news scraper
2 parents e0627bb + 17e6ab6 commit 4be0c24

3 files changed

+45
-0
lines changed

Web-Scraping/Times_of_india/README.md

+8
@@ -0,0 +1,8 @@
+## Scraping Times of India
+
+Scrapes the top Times of India headlines in four sections: Flash news, News in Bulletin, Entertainment, and Latest News,
+using the Requests and Beautiful Soup modules.
+
+Link to the website: "http://timesofindia.indiatimes.com/"
+
+![output](TOI.png)

Web-Scraping/Times_of_india/TOI.png

50.1 KB
@@ -0,0 +1,37 @@
+import requests
+import datetime
+from bs4 import BeautifulSoup
+
+url = "http://timesofindia.indiatimes.com/"
+
+# Use the requests library to get the HTML from TOI's page
+response = requests.get(url)
+# Make the HTML soup object
+soup = BeautifulSoup(response.content, 'html.parser')
+
+print("\t!!!** The Times of India **!!!")
+today = datetime.date.today()
+print(today.strftime('\tThe date %d, %b %Y'))
+
+# Scraping Times of India in four domains:
+print("\n\t\t**** Flash news ****")
+for div in soup.find_all('div', attrs={'id': 'featuredstory'}):
+    for a in div.find_all('a'):
+        print(a.text)
+
+print("\n\t\t**** News in Bulletin ****")
+for div in soup.find_all('div', attrs={'class': 'top-story'}):
+    for li in div.find_all('li'):
+        print(li.text)
+
+
+print("\n\t\t**** Entertainment ****\t")
+for div in soup.find_all('div', attrs={'class': 'entrmnt-wdgt-outer'}):
+    for li in div.find_all('li'):
+        print(li.text)
+
+
+print("\n\t\t**** Latest News ****\t\n")
+for div in soup.find_all('div', attrs={'id': 'lateststories'}):
+    for li in div.find_all('li'):
+        print(li.text)
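
The four loops in the script follow the same pattern: find a container `div` by `id` or `class`, then print the text of each nested link or list item. As a rough sketch only, that pattern could be factored into one helper. The function name `section_headlines` and the `SAMPLE_HTML` markup below are hypothetical and are not part of this commit; the sample only imitates the `id` and `class` values the script looks for, and the live page layout may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page.
SAMPLE_HTML = """
<div id="featuredstory"><a href="/a">First headline</a><a href="/b"> Second headline </a></div>
<div class="top-story"><ul><li>Bulletin one</li><li></li><li>Bulletin two</li></ul></div>
"""


def section_headlines(soup, container_attrs, item_tag):
    """Return the stripped, non-empty text of each item_tag inside matching divs."""
    headlines = []
    for div in soup.find_all('div', attrs=container_attrs):
        for item in div.find_all(item_tag):
            text = item.get_text(strip=True)
            if text:  # skip empty items such as spacer <li> elements
                headlines.append(text)
    return headlines


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
flash = section_headlines(soup, {'id': 'featuredstory'}, 'a')
bulletin = section_headlines(soup, {'class': 'top-story'}, 'li')
print(flash)
print(bulletin)
```

Against the live site, the same helper would take a soup built from `response.content`. Returning a list instead of printing directly makes each section easy to check in isolation, and an empty list signals that a container was not found on the page.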
