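I. Background
The goal is to scrape second-hand housing listings for Jianghan District, Wuhan, Hubei Province, collecting five fields: "title", "house_type", "area", "decoration", and "price". To keep my IP from being blocked, I routed the requests through kuaidaili proxies.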
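
As a rough sketch of the proxy setup: requests accepts a `proxies` mapping keyed by URL scheme. The helper name `build_proxies` and the host, port, and credentials below are placeholders, not real kuaidaili values. Substitute whatever your proxy provider gives you.

```python
def build_proxies(host, port, username=None, password=None):
    """Return a proxies mapping in the shape that requests expects."""
    # Credentials are optional; include them only when both are supplied.
    auth = f"{username}:{password}@" if username and password else ""
    proxy_url = f"http://{auth}{host}:{port}"
    # Route both plain and TLS traffic through the same proxy endpoint.
    return {"http": proxy_url, "https": proxy_url}


# Placeholder values, for illustration only.
proxies = build_proxies("proxy.example.com", 8080, "user", "secret")
```

The resulting mapping would then be passed as `requests.get(url, headers=headers, proxies=proxies, timeout=10)`.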
II. Hands-on
Hover over a listing title, right-click, and choose Inspect.
The other fields can be located the same way.
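
Once each field's element has been located in the inspector, extracting it with BeautifulSoup might look like the sketch below. The sample markup, the class names, and the `SELECTORS` mapping are all hypothetical stand-ins. Replace them with the tags and classes you actually see in the inspector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page structure will differ.
SAMPLE_HTML = """
<div class="listing">
  <h3 class="title">Sample listing</h3>
  <span class="house-type">3 rooms 2 halls</span>
  <span class="area">98.5</span>
  <span class="decoration">Fully decorated</span>
  <span class="price">150</span>
</div>
"""

# Hypothetical CSS selectors, one per field to collect.
SELECTORS = {
    "title": "h3.title",
    "house_type": "span.house-type",
    "area": "span.area",
    "decoration": "span.decoration",
    "price": "span.price",
}


def parse_listing(node):
    """Extract one row of fields from a single listing element."""
    row = {}
    for field, css in SELECTORS.items():
        element = node.select_one(css)
        # Store None when a field is missing rather than raising.
        row[field] = element.get_text(strip=True) if element else None
    return row


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = [parse_listing(node) for node in soup.select("div.listing")]
```

Keeping the selectors in a single mapping means a change in the page layout only requires updating that mapping.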

Click page 2, and the request details shown in the figure on the right appear.
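
Judging from the listing URL `https://wuhan.anjuke.com/sale/jianghana/p1/`, the page number appears to be the trailing `pN` segment. Assuming that pattern holds for later pages, the page URLs could be generated as in this sketch. `page_urls` is an illustrative helper name.

```python
# URL template inferred from the first-page address; the pN suffix is an assumption.
BASE_URL = "https://wuhan.anjuke.com/sale/jianghana/p{page}/"


def page_urls(first_page, last_page):
    """Return the listing URLs for an inclusive range of page numbers."""
    return [BASE_URL.format(page=n) for n in range(first_page, last_page + 1)]


urls = page_urls(1, 3)
```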

This gives us the request headers, which we send along with the request when fetching the page.

import pandas as pd
import requests
from bs4 import BeautifulSoup
import random
# Build the request headers for the URL so the request looks like it comes from a normal browser user
headers = {
'accept':'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7',
'accept-encoding': 'gzip, deflate, br',
'accept-language': 'zh-CN,zh;q=0.9',
'cache-control': 'no-cache',
'cookie': 'aQQ_ajkguid=70C0288A-42CB-4C56-B8EF-8E90F8077A8C; sessid=13C76F04-9178-4EE8-B8B0-F00FE21F4F50; ajk-appVersion=; ctid=22; fzq_h=d23302afd92c82b304657a734e3950aa_1697613588983_b645e9292cff4c148c0e3fb2ff31662e_3746354997; id58=CrIej2Uvhxc/D8k8IRI2Ag==; twe=2; fzq_js_anjuke_ershoufang_pc=8e86fa86290dbac07d5de51dd3b9db13_1697615100824_23; obtain_by=1; xxzl_cid=817f908b661647889fa49debaab80d9c; xxzl_deviceid=lrdQ4FRXrfXyN2Qj/gRhBw2SQpTZ81igKeOBCkzlfzjPwEG8whpE1uKNvVqIOvXQ',
'host': 'wuhan.anjuke.com',
'pragma': 'no-cache',
'referer': 'https://wuhan.anjuke.com/sale/jianghana/p1/',
'sec-ch-ua': '"Google Chrome";v="117", "Not;A=Brand";v="8", "Chromium";v="117"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform'
