Commit 2d01f82

added using proxies tutorial
1 parent bc79a68 commit 2d01f82

6 files changed: +103 -0 lines changed

Diff for: README.md (+1)

@@ -65,6 +65,7 @@ This is a repository of all the tutorials of [The Python Code](https://www.thepy
     - [How to Extract All Website Links in Python](https://www.thepythoncode.com/article/extract-all-website-links-python). ([code](web-scraping/link-extractor))
     - [How to Make an Email Extractor in Python](https://www.thepythoncode.com/article/extracting-email-addresses-from-web-pages-using-python). ([code](web-scraping/email-extractor))
     - [How to Convert HTML Tables into CSV Files in Python](https://www.thepythoncode.com/article/convert-html-tables-into-csv-files-in-python). ([code](web-scraping/html-table-extractor))
+    - [How to Use Proxies to Anonymize your Browsing and Scraping using Python](https://www.thepythoncode.com/article/using-proxies-using-requests-in-python). ([code](web-scraping/using-proxies))
 
 - ### [Python Standard Library](https://www.thepythoncode.com/topic/python-standard-library)
     - [How to Transfer Files in the Network using Sockets in Python](https://www.thepythoncode.com/article/send-receive-files-using-sockets-python). ([code](general/transfer-files/))

Diff for: web-scraping/using-proxies/README.md (+6)

@@ -0,0 +1,6 @@
# [How to Use Proxies to Anonymize your Browsing and Scraping using Python](https://www.thepythoncode.com/article/using-proxies-using-requests-in-python)
To run this:
- `pip3 install -r requirements.txt`
- If you want to use freely available proxies, use `free_proxies.py`.
- If you want to use the Tor network, make sure Tor is installed on your machine and the service is running, then use `tor_proxy.py`.
- If you want IP rotation on Tor, use `multiple_tor_proxies.py`.
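
Because `tor_proxy.py` and `multiple_tor_proxies.py` both assume a running Tor service, a quick pre-flight check can save a confusing connection error. The sketch below is not part of this commit: `is_port_open` is a hypothetical helper, and 9050 is simply the SOCKS port the scripts here use. It only confirms that something accepts TCP connections on that port, not that it is actually Tor.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: Tor's SOCKS listener is on localhost:9050, as in the scripts of this commit.
    print("SOCKS port reachable:", is_port_open("localhost", 9050))
```

If this prints `False`, starting the Tor service before running the Tor scripts is the likely fix.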

Diff for: web-scraping/using-proxies/free_proxies.py (+51)

@@ -0,0 +1,51 @@
import random
import requests
from bs4 import BeautifulSoup as bs

def get_free_proxies():
    url = "https://free-proxy-list.net/"
    # fetch the page and parse it into a soup object
    soup = bs(requests.get(url).content, "html.parser")
    proxies = []
    for row in soup.find("table", attrs={"id": "proxylisttable"}).find_all("tr")[1:]:
        tds = row.find_all("td")
        try:
            ip = tds[0].text.strip()
            port = tds[1].text.strip()
            host = f"{ip}:{port}"
            proxies.append(host)
        except IndexError:
            continue
    return proxies


def get_session(proxies):
    # construct an HTTP session
    session = requests.Session()
    # choose one proxy at random and use it for both schemes
    proxy = random.choice(proxies)
    session.proxies = {"http": proxy, "https": proxy}
    return session


if __name__ == "__main__":
    # uncomment to scrape a fresh list instead of using the hardcoded one
    # proxies = get_free_proxies()
    proxies = [
        '167.172.248.53:3128',
        '194.226.34.132:5555',
        '203.202.245.62:80',
        '141.0.70.211:8080',
        '118.69.50.155:80',
        '201.55.164.177:3128',
        '51.15.166.107:3128',
        '91.205.218.64:80',
        '128.199.237.57:8080',
    ]
    for i in range(5):
        s = get_session(proxies)
        try:
            print("Request page with IP:", s.get("http://icanhazip.com", timeout=1.5).text.strip())
        except requests.RequestException:
            # dead or slow proxy: move on to the next attempt
            continue
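
The main loop above picks a proxy with `random.choice` on every attempt, so the same dead proxy can be drawn more than once. One possible variation, sketched below and not part of this commit, shuffles the list once and tries each candidate at most once. The names `try_proxies` and `fetch` are hypothetical; `fetch` stands in for whatever performs the request through a given proxy, which keeps the helper free of network code.

```python
import random
from typing import Callable, Iterable, Optional, Tuple


def try_proxies(
    proxies: Iterable[str],
    fetch: Callable[[str], str],
    max_attempts: int = 5,
    rng: Optional[random.Random] = None,
) -> Optional[Tuple[str, str]]:
    """Try up to `max_attempts` distinct proxies in random order.

    Returns (proxy, result) for the first proxy where `fetch` succeeds,
    or None if every attempted proxy raised an exception.
    """
    rng = rng or random.Random()
    candidates = list(proxies)
    rng.shuffle(candidates)
    for proxy in candidates[:max_attempts]:
        try:
            return proxy, fetch(proxy)
        except Exception:
            # Free proxies fail often; skip to the next candidate.
            continue
    return None
```

In this setting, `fetch` could be a small function that builds a session with the given proxy and returns the text of `http://icanhazip.com`, as the script above does.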

Diff for: web-scraping/using-proxies/multiple_tor_proxies.py (+27)

@@ -0,0 +1,27 @@
import requests
from stem.control import Controller
from stem import Signal

def get_tor_session():
    # initialize a requests Session
    session = requests.Session()
    # route both http & https traffic through Tor's SOCKS proxy on localhost:9050
    # (the Tor service must be installed and running on your machine)
    session.proxies = {"http": "socks5://localhost:9050", "https": "socks5://localhost:9050"}
    return session

def renew_connection():
    with Controller.from_port(port=9051) as c:
        c.authenticate()
        # send the NEWNYM signal to request a new, clean circuit through the Tor network
        c.signal(Signal.NEWNYM)


if __name__ == "__main__":
    s = get_tor_session()
    ip = s.get("http://icanhazip.com").text.strip()
    print("IP:", ip)
    renew_connection()
    s = get_tor_session()
    ip = s.get("http://icanhazip.com").text.strip()
    print("IP:", ip)
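
Sending `NEWNYM` asks Tor for fresh circuits but does not guarantee a different exit IP, and Tor may rate-limit the signal, so the two IPs printed above can turn out identical. A minimal sketch of a retry loop follows; it is not part of this commit. `rotate_until_changed`, `get_ip`, and `renew` are hypothetical names, and the 10-second default delay is an assumption rather than a documented value.

```python
import time
from typing import Callable, Optional


def rotate_until_changed(
    get_ip: Callable[[], str],
    renew: Callable[[], None],
    max_attempts: int = 5,
    delay: float = 10.0,
) -> Optional[str]:
    """Call `renew` until `get_ip` reports a different address.

    Returns the new IP, or None if it never changed within `max_attempts`.
    """
    old_ip = get_ip()
    for _ in range(max_attempts):
        renew()
        # Give Tor time to build the new circuit before checking again.
        time.sleep(delay)
        new_ip = get_ip()
        if new_ip != old_ip:
            return new_ip
    return None
```

With the script above, `renew` could be `renew_connection`, and `get_ip` could be a small function that opens a new Tor session and returns the stripped response from `http://icanhazip.com`.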

Diff for: web-scraping/using-proxies/requirements.txt (+3)

@@ -0,0 +1,3 @@
bs4
requests[socks]
stem

Diff for: web-scraping/using-proxies/tor_proxy.py (+15)

@@ -0,0 +1,15 @@
import requests


def get_tor_session():
    # initialize a requests Session
    session = requests.Session()
    # requires a Tor service running on your machine and listening on port 9050 (the default)
    session.proxies = {"http": "socks5://localhost:9050", "https": "socks5://localhost:9050"}
    return session


if __name__ == "__main__":
    s = get_tor_session()
    ip = s.get("http://icanhazip.com").text.strip()
    print("IP:", ip)
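
One detail worth knowing about the `socks5://` scheme used above: with `requests`, it resolves hostnames on the client, while `socks5h://` hands DNS resolution to the proxy. For anonymity through Tor, the latter may be preferable. The sketch below is not part of this commit, and `tor_proxy_map` is a hypothetical helper that only builds the proxies dictionary.

```python
def tor_proxy_map(host: str = "localhost", port: int = 9050, remote_dns: bool = True) -> dict:
    """Build a requests-style proxies mapping for a SOCKS5 proxy.

    With remote_dns=True the `socks5h` scheme is used, so hostnames are
    resolved by the proxy instead of by the local machine.
    """
    scheme = "socks5h" if remote_dns else "socks5"
    url = f"{scheme}://{host}:{port}"
    return {"http": url, "https": url}
```

The result can be assigned to `session.proxies` in place of the hardcoded dictionary in `get_tor_session`.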
