Scraping Real-Estate News from Xueqiu

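This post presents a Python crawler for the Xueqiu web API. The program sends HTTP requests to fetch data from Xueqiu's public timeline, parses the response, and stores the result in a MySQL database. The two listings show how the request headers are set, how the JSON response is parsed, and how the program talks to the database.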
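As a minimal sketch of the request step, the query string for the timeline endpoint can be assembled from a dictionary instead of by string formatting. The endpoint and parameter names are taken from the xueqiu.py listing in this post; the helper name `build_timeline_url` and the constant `BASE_URL` are hypothetical and only serve the illustration.

```python
from urllib.parse import urlencode

# Endpoint as used in xueqiu.py
BASE_URL = 'https://xueqiu.com/v4/statuses/public_timeline_by_category.json'


def build_timeline_url(max_id=-1, count=10, category=111):
    """Build the timeline URL (hypothetical helper, not part of the original post)."""
    params = {
        'since_id': -1,
        'max_id': max_id,      # -1 requests the newest page
        'count': count,        # number of items per page
        'category': category,  # 111 is the category used in the post
    }
    return BASE_URL + '?' + urlencode(params)


print(build_timeline_url())
```

Keeping the parameters in a dictionary makes it harder to mix up their order when the page size or category changes.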


xueqiu.py

import requests
import json
import mysql_test

def xueqiu_urllib(num):
    """Fetch `num` pages of the public timeline and insert every item into MySQL."""
    max_id = -1
    count = 10
    mc = mysql_test.mysql_connect()
    for i in range(num):
        url = 'https://xueqiu.com/v4/statuses/public_timeline_by_category.json?since_id=-1&max_id=%d&count=%d&category=111' % (max_id, count)
        print(url)
        # The Cookie below was copied from a browser session; replace it with one from your own session
        headers = {
            'Cookie': 'aliyungf_tc=AQAAABkB71wfmgMAUhVFeQHZt5cWi5JS; xq_a_token=584d0cf8d5a5a9809761f2244d8d272bac729ed4; xq_a_token.sig=x0gT9jm6qnwd-ddLu66T3A8KiVA; xq_r_token=98f278457fc4e1e5eb0846e36a7296e642b8138a; xq_r_token.sig=2Uxv_DgYTcCjz7qx4j570JpNHIs; _ga=GA1.2.2047922625.1534296120; _gid=GA1.2.1377956894.1534296120; _gat_gtag_UA_16079156_4=1; Hm_lvt_1db88642e346389874251b5a1eded6e3=1534296121; Hm_lpvt_1db88642e346389874251b5a1eded6e3=1534296121; u=481534296121671; device_id=80dc1911da66774de0c40901630b0c6c',
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3377.1 Safari/537.36',
        }
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()  # stop on an HTTP error instead of parsing an error page
        res_dict = response.json()
        # Each entry carries its payload as a JSON string under 'data'
        for entry in res_dict['list']:
            item = json.loads(entry['data'])
            # Let pymysql quote the values so that a quote in a title cannot break the statement
            sql = mc.cursor.mogrify(
                'insert into xueqiu values (%s, %s, %s, %s)',
                (item['id'], item['title'], item['description'], item['target']),
            )
            mc.mysql_execute_modify(sql)
        max_id = res_dict['next_max_id']
        count = 15

if __name__ == '__main__':
    num = 3
    xueqiu_urllib(num)
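The pagination in xueqiu.py can be tried without network access. The sketch below, assuming the response shape used in the listing (a `list` of entries whose `data` field is a JSON string, plus a `next_max_id` cursor), follows the cursor across pages. `collect_items`, `fetch_page`, and `fake_fetch` are hypothetical names, and the sample data is made up.

```python
import json


def collect_items(fetch_page, num_pages):
    """Follow next_max_id across num_pages pages and return the decoded items.

    fetch_page is a hypothetical callable: it takes max_id and returns the
    decoded JSON response as a dict.
    """
    items = []
    max_id = -1  # -1 asks for the newest page
    for _ in range(num_pages):
        page = fetch_page(max_id)
        for entry in page['list']:
            # each entry carries its payload as a JSON string under 'data'
            items.append(json.loads(entry['data']))
        max_id = page['next_max_id']  # cursor for the following request
    return items


def fake_fetch(max_id):
    """Stand-in for the network call, returning made-up data."""
    payload = {'id': 1 if max_id == -1 else 2, 'title': 't',
               'description': 'd', 'target': '/x'}
    return {'list': [{'data': json.dumps(payload)}], 'next_max_id': 100}


print([item['id'] for item in collect_items(fake_fetch, 2)])  # prints [1, 2]
```

Passing the fetch function in as an argument keeps the cursor logic separate from the HTTP call, so the real `requests.get` can be swapped in later.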

mysql_test.py

import pymysql


class mysql_connect(object):
    def __init__(self):
        self.db = pymysql.connect(host='127.0.0.1', user='root', password='123456', database='py10')
        self.cursor = self.db.cursor()

    def mysql_execute_modify(self, sql, params=None):
        # Values passed through params are escaped by pymysql; params is optional
        self.cursor.execute(sql, params)
        self.db.commit()
        return self.cursor.fetchall()

    def __del__(self):
        self.cursor.close()
        self.db.close()


if __name__ == '__main__':
    mc = mysql_connect()
    sql = 'select * from xueqiu'  # the table that xueqiu.py inserts into
    res = mc.mysql_execute_modify(sql)
    print(res)
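Building SQL by string formatting breaks as soon as a title contains a quote character. A minimal sketch of the alternative, assuming the four-column `xueqiu` table implied by the insert statement in xueqiu.py, keeps the statement and its values separate and lets the driver do the quoting. `build_insert` and `INSERT_SQL` are hypothetical names; `%s` is the placeholder style pymysql accepts.

```python
INSERT_SQL = 'INSERT INTO xueqiu VALUES (%s, %s, %s, %s)'


def build_insert(item):
    """Return (sql, params) for one decoded timeline item (hypothetical helper)."""
    params = (item['id'], item['title'], item['description'], item['target'])
    return INSERT_SQL, params


sql, params = build_insert({
    'id': 1,
    'title': "It's a title",  # this quote would break a hand-formatted statement
    'description': 'desc',
    'target': '/1/2',
})
print(sql, params)
```

With an open pymysql cursor, the pair would be passed on as `cursor.execute(sql, params)`.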

 
