@@ -45,10 +45,10 @@ This is a long and complex 176-page document with a lot of detail. If
 you find it interesting, feel free to read it all. But if you take a
 look around page 36 of RFC2616 you will find the syntax for the GET
 request. To request a document from a web server, we make a connection
-to the `www.py4inf.com` server on port 80, and then send a
+to the `www.pythonlearn.com` server on port 80, and then send a
 line of the form

-`GET http://www.py4inf.com/code/romeo.txt HTTP/1.0`
+`GET http://www.pythonlearn.com/code/romeo.txt HTTP/1.0`

 where the second parameter is the web page we are requesting, and then
 we also send a blank line. The web server will respond with some header
@@ -66,8 +66,8 @@ display what the server sends back.
 import socket

 mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-mysock.connect(('www.py4inf.com', 80))
-mysock.send('GET http://www.py4inf.com/code/romeo.txt HTTP/1.0\n\n')
+mysock.connect(('www.pythonlearn.com', 80))
+mysock.send('GET http://www.pythonlearn.com/code/romeo.txt HTTP/1.0\n\n')

 while True:
     data = mysock.recv(512)
@@ -78,7 +78,7 @@ display what the server sends back.
 mysock.close()

 First the program makes a connection to port 80 on the server
-[www.py4inf.com](www.py4inf.com). Since our program is playing the role
+[www.pythonlearn.com](www.pythonlearn.com). Since our program is playing the role
 of the "web browser", the HTTP protocol says we must send the GET
 command followed by a blank line.
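A note on the program in this hunk: in Python 3, `socket.send` accepts only `bytes`, so the request string shown above must be encoded before sending (and the received data decoded). A minimal sketch of building the request, assuming the host and path from the example:

```python
import socket

def make_request(host, path):
    """Build an HTTP/1.0 GET request as bytes.

    HTTP/1.0 expects the GET line followed by a blank line;
    socket.send() in Python 3 accepts bytes, not str.
    """
    request = 'GET http://%s%s HTTP/1.0\n\n' % (host, path)
    return request.encode()

req = make_request('www.pythonlearn.com', '/code/romeo.txt')
print(req)

# To actually send it (network code, not run here):
# mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# mysock.connect(('www.pythonlearn.com', 80))
# mysock.send(req)
```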
@@ -142,8 +142,8 @@ image data to a file as follows:
 import time

 mysock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
-mysock.connect(('www.py4inf.com', 80))
-mysock.send('GET http://www.py4inf.com/cover.jpg HTTP/1.0\n\n')
+mysock.connect(('www.pythonlearn.com', 80))
+mysock.send('GET http://www.pythonlearn.com/cover.jpg HTTP/1.0\n\n')

 count = 0
 picture = ""
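The image-retrieval program accumulates the server's response in `picture` and then has to separate the HTTP headers from the binary image data. One way to sketch that split, using fabricated response bytes rather than real server output, is to look for the blank line that ends the headers:

```python
# Fabricated example response; a real server's headers will differ.
response = b'HTTP/1.1 200 OK\r\nContent-Type: image/jpeg\r\n\r\n\xff\xd8\xff\xe0JPEG-DATA'

# In HTTP the headers end at the first blank line (CRLF CRLF).
pos = response.find(b'\r\n\r\n')
header = response[:pos].decode()
body = response[pos + 4:]

print(header.split('\r\n')[0])  # the status line
print(len(body))                # number of image bytes that follow
```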
@@ -268,7 +268,7 @@ using `urllib` is as follows:
 import urllib.request, urllib.parse, urllib.error

-fhand = urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
+fhand = urllib.request.urlopen('http://www.pythonlearn.com/code/romeo.txt')
 for line in fhand:
     print(line.strip())
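One detail worth flagging about the `urllib` loop in this hunk: in Python 3, the lines that `urlopen` yields are `bytes`, so code that wants ordinary strings must decode each line. A sketch of the same loop, with `io.BytesIO` standing in for the network handle so it runs offline:

```python
import io

# io.BytesIO stands in for the file-like object urlopen() returns;
# in Python 3 its lines are bytes, so we decode before stripping.
fhand = io.BytesIO(b'But soft what light through yonder window breaks\n')

lines = []
for line in fhand:
    lines.append(line.decode().strip())
print(lines[0])
```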
@@ -292,7 +292,7 @@ file as follows:
 import urllib.request, urllib.parse, urllib.error

 counts = dict()
-fhand = urllib.request.urlopen('http://www.py4inf.com/code/romeo.txt')
+fhand = urllib.request.urlopen('http://www.pythonlearn.com/code/romeo.txt')
 for line in fhand:
     words = line.split()
     for word in words:
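The hunk above cuts off before the body of the counting loop. The usual dictionary-counting idiom the book builds toward can be sketched self-contained on a sample of the text (the sample lines here are from *Romeo and Juliet*; the loop body is the standard `get()` pattern, not necessarily the book's exact line):

```python
counts = dict()
text = ('But soft what light through yonder window breaks\n'
        'It is the east and Juliet is the sun\n')

for line in text.splitlines():
    words = line.split()
    for word in words:
        # get() returns 0 for a word we have not seen yet
        counts[word] = counts.get(word, 0) + 1

print(counts['is'])   # 'is' appears twice in the sample
```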
@@ -381,12 +381,12 @@ When we run the program, we get the following output:
 http://www.dr-chuck.com/page2.htm

 python urlregex.py
-Enter - http://www.py4inf.com/book.htm
+Enter - http://www.pythonlearn.com/book.htm
 http://www.greenteapress.com/thinkpython/thinkpython.html
 http://allendowney.com/
-http://www.py4inf.com/code
+http://www.pythonlearn.com/code
 http://www.lib.umich.edu/espresso-book-machine
-http://www.py4inf.com/py4inf-slides.zip
+http://www.pythonlearn.com/py4inf-slides.zip

 Regular expressions work very nicely when your HTML is well formatted
 and predictable. But since there are a lot of "broken" HTML pages out
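The `urlregex.py` source itself is not part of this hunk, only its output. The kind of extraction it performs can be sketched as follows; the HTML snippet is made up and the regex shown is one common pattern for pulling `href` values, not necessarily the exact expression the program uses:

```python
import re

# Hypothetical HTML fragment standing in for a fetched page.
html = ('<a href="http://www.pythonlearn.com/code">Code</a> '
        '<a href="http://allendowney.com/">Allen</a>')

# Non-greedy match of whatever sits between href=" and the next quote.
links = re.findall('href="(http[s]?://.+?)"', html)
for link in links:
    print(link)
```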
@@ -453,12 +453,12 @@ When the program runs it looks as follows:
 http://www.dr-chuck.com/page2.htm

 python urllinks.py
-Enter - http://www.py4inf.com/book.htm
+Enter - http://www.pythonlearn.com/book.htm
 http://www.greenteapress.com/thinkpython/thinkpython.html
 http://allendowney.com/
 http://www.si502.com/
 http://www.lib.umich.edu/espresso-book-machine
-http://www.py4inf.com/code
+http://www.pythonlearn.com/code
 http://www.pythonlearn.com/

 You can use BeautifulSoup to pull out various parts of each tag as
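For readers without BeautifulSoup installed, the standard library's `html.parser` can do a rougher version of the same anchor-tag extraction. This is a stdlib stand-in for illustration, not the book's BeautifulSoup code:

```python
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collect the href attribute of every anchor tag seen."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            for name, value in attrs:
                if name == 'href':
                    self.links.append(value)

parser = LinkParser()
parser.feed('<a href="http://www.pythonlearn.com/">Home</a>')
print(parser.links)
```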
@@ -509,7 +509,7 @@ entire contents of the document into a string variable
 (`img`) then write that information to a local file as
 follows:

-img = urllib.request.urlopen('http://www.py4inf.com/cover.jpg').read()
+img = urllib.request.urlopen('http://www.pythonlearn.com/cover.jpg').read()
 fhand = open('cover.jpg', 'w')
 fhand.write(img)
 fhand.close()
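One caution about the snippet in this hunk: in Python 3, `urlopen(...).read()` returns `bytes`, and writing bytes requires opening the file in binary mode (`'wb'`, not the `'w'` shown). A small offline sketch with stand-in image bytes:

```python
import os
import tempfile

# Stand-in for urlopen('...cover.jpg').read(); real data would be a JPEG.
img = b'\xff\xd8\xff\xe0 fake jpeg bytes'

# 'wb' is required for bytes; 'w' raises TypeError in Python 3.
path = os.path.join(tempfile.mkdtemp(), 'cover.jpg')
fhand = open(path, 'wb')
fhand.write(img)
fhand.close()

print(os.path.getsize(path))
```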
@@ -529,7 +529,7 @@ using up all of the memory you have in your computer.
 import urllib.request, urllib.parse, urllib.error

-img = urllib.request.urlopen('http://www.py4inf.com/cover.jpg')
+img = urllib.request.urlopen('http://www.pythonlearn.com/cover.jpg')
 fhand = open('cover.jpg', 'w')
 size = 0
 while True:
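The buffered version started in this hunk reads a fixed-size block at a time, so only one block is ever held in memory regardless of the file's size. The shape of that loop can be sketched against an in-memory stand-in for the network stream:

```python
import io

# io.BytesIO stands in for the object urlopen() returns.
img = io.BytesIO(b'x' * 250000)
out = io.BytesIO()

size = 0
while True:
    info = img.read(100000)   # at most 100000 bytes per read
    if len(info) < 1:
        break                  # read() returns b'' at end of stream
    size = size + len(info)
    out.write(info)

print(size)
```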
@@ -556,11 +556,11 @@ follows:
 \index{curl}

-curl -O http://www.py4inf.com/cover.jpg
+curl -O http://www.pythonlearn.com/cover.jpg

 The command `curl` is short for "copy URL" and so these two
 examples are cleverly named `curl1.py` and
-`curl2.py` on [www.py4inf.com/code](www.py4inf.com/code) as
+`curl2.py` on [www.pythonlearn.com/code](www.pythonlearn.com/code) as
 they implement similar functionality to the `curl` command.
 There is also a `curl3.py` sample program that does this task
 a little more effectively, in case you actually want to use this pattern