Python 3 Crawler, Part 1: Opening a Remote URL with urllib.request.urlopen - Python Basics


Code 1:

import urllib.request
for urlcontent in urllib.request.urlopen('http://www.3qphp.com/python/Pythoncode/154.html'):
    urlcontent = urlcontent.decode('utf-8')
    print(urlcontent)

Code 2:

import urllib.request
response = urllib.request.urlopen('http://www.3qphp.com/python/Pythoncode/154.html')
response = response.read()  # the whole body as a single bytes object
response = response.decode('utf-8')
print(response)

Code 3:

import urllib.request
response = urllib.request.urlopen('http://www.3qphp.com/python/Pythoncode/154.html')
print(response)

# print() outputs: <http.client.HTTPResponse object at 0x02075370>

Notes:

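1. Code 1 and Code 2 print the same page content (Code 1 adds a blank line after each line, because each line keeps its trailing newline and print() appends another). They differ in how they fetch the content: one iterates over the response with a for loop, the other calls read(). Comment out urlcontent = urlcontent.decode('utf-8') and response = response.decode('utf-8') to see the difference in format: the loop yields one bytes object per line, while read() returns the whole body as a single bytes object.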
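
The difference described in note 1 can be sketched without touching the remote site. The example below is only an illustration: it assumes a throwaway local HTTP server (DemoHandler and BODY are hypothetical names, not part of the original code) and fetches the same two-line body both ways, leaving the bytes undecoded.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content served by the local demo server.
BODY = "line one\nline two\n".encode("utf-8")


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()
        self.wfile.write(BODY)

    def log_message(self, format, *args):
        pass  # silence the per-request log line


# Port 0 lets the OS pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

# Code 1 style: iterating the response yields one bytes object per line.
with urllib.request.urlopen(url) as response:
    lines = list(response)

# Code 2 style: read() returns the whole body as one bytes object.
with urllib.request.urlopen(url) as response:
    whole = response.read()

server.shutdown()
server.server_close()

print(lines)
print(whole)
```

Joining the per-line chunks gives back exactly what read() returns, so the two approaches differ only in how the body is chunked.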

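2. Code 3 shows that urllib.request.urlopen returns an http.client.HTTPResponse object (defined in the http.client module).

The response object carries the headers, the status code, and the body.

From the official documentation: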

For http and https urls, this function returns a http.client.HTTPResponse object which has the following HTTPResponse Objects methods.

The HTTPResponse.read() method reads and returns the body of the response.
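
A minimal sketch of inspecting the three parts named in note 2. It again assumes a throwaway local server (DemoHandler is a hypothetical helper for this example) so that it runs without the remote site; status, getheader() and read() are the standard http.client.HTTPResponse members.

```python
import http.client
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        payload = b"hello"  # hypothetical body
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # silence the per-request log line


server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_port

with urllib.request.urlopen(url) as response:
    is_http_response = isinstance(response, http.client.HTTPResponse)
    status = response.status                            # status code
    content_type = response.getheader("Content-Type")   # one header value
    body = response.read()                              # body, as bytes

server.shutdown()
server.server_close()

print(is_http_response, status, content_type, body)
```

Using the response in a with statement closes the connection once the block exits, which the original listings leave to the garbage collector.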

3. HTTPResponse.read() returns data of type bytes; call decode() on it to convert it to str.
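
The bytes-to-str step in note 3 can be tried on its own. In this small sketch the variable raw is a stand-in (an assumption for illustration) for whatever response.read() would return.

```python
# Stand-in for the bytes that response.read() would return (hypothetical content).
raw = "谷谷点程序 python3".encode("utf-8")

# bytes -> str, using the encoding the page was written in.
text = raw.decode("utf-8")
print(type(raw).__name__)
print(type(text).__name__)

# Decoding with the wrong codec raises an error instead of returning text.
try:
    raw.decode("ascii")
    wrong_codec_failed = False
except UnicodeDecodeError:
    wrong_codec_failed = True
print(wrong_codec_failed)
```

The listings above hard-code 'utf-8', which only works when the page really is UTF-8. With a real response, response.headers.get_content_charset() returns the charset declared in the Content-Type header, or None if the server does not declare one.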

Please credit the source when reposting: 谷谷点程序 » Python 3 Crawler, Part 1: Opening a Remote URL with urllib.request.urlopen