Scraping Table Data with Requests and Saving It to CSV

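This post shows how to use Python's requests and lxml libraries to scrape margin trading and securities lending (融资融券) data from the Sina Finance site, arrange the scraped values into a pandas DataFrame, and save the result as a CSV file. It also shows two ways to read the CSV file back: with pandas and with the standard csv module.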
import requests
from lxml import etree
import pandas as pd

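resp = requests.get('http://vip.stock.finance.sina.com.cn/q/go.php/vInvestConsult/kind/rzrq/index.phtml')
# print(resp.status_code)
# print(resp.text.encode('ISO-8859-1').decode('gbk'))
select = etree.HTML(resp.text)
# Link text in the second table: stock codes and stock names, alternating
tr_a = select.xpath('//table[2]/tr/td/a/text()')
# Plain cell text in the second table: the numeric columns, row after row
tr_num = select.xpath('//table[2]/tr/td/text()')
lists = []  # stock names
list1 = []  # stock codes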

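# This assumes requests decoded the response as ISO-8859-1 while the page is
# actually GBK, so each string is turned back into bytes and decoded as GBK.
for i in tr_a[1::2]:  # odd positions: stock names
    name = i.encode('ISO-8859-1').decode('gbk')
    lists.append(name)

for i in tr_a[0::2]:  # even positions: stock codes
    code = i.encode('ISO-8859-1').decode('gbk')
    list1.append(code)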

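# print(list1)
# print(lists)
# Each row holds 9 plain-text cells, so a stride of 9 selects one column.
# The offsets start at 10, presumably to skip the header cells.
num_xh = tr_num[10::9]    # 序号: row number
num_ye = tr_num[11::9]    # 余额: financing balance
num_mre = tr_num[12::9]   # 买入额: financing purchase amount
num_che = tr_num[13::9]   # 偿还额: financing repayment amount
num_ylje = tr_num[14::9]  # 余量金额: value of securities lent
num_yl = tr_num[15::9]    # 余量: volume of securities lent
num_mcl = tr_num[16::9]   # 卖出量: volume sold short
num_chl = tr_num[17::9]   # 偿还量: volume repaid
num_rqye = tr_num[18::9]  # 融券余额: securities lending balance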

data = pd.DataFrame({'序号':num_xh,'股票代码':list1,'股票名称':lists,'余额':num_ye,'买入额':num_mre,'偿还额':num_che,
                     '余量金额':num_ylje,'余量':num_yl,'卖出量':num_mcl,'偿还量':num_chl,'融券余额':num_rqye})
data.to_csv("demo.csv",index=False,sep=',')
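
The script leans on two tricks: re-encoding text that was decoded with the wrong codec, and stride slicing over a flat list of cells. The sketch below tries both offline. The helper names `fix_mojibake` and `split_columns` and all sample values are made up for illustration; nothing here is fetched from the real page.

```python
# Hypothetical helpers and sample data, not part of the original script.

def fix_mojibake(text):
    """Recover GBK text that was mistakenly decoded as ISO-8859-1."""
    return text.encode('ISO-8859-1').decode('gbk')


def split_columns(cells, n_cols, offset=0):
    """Turn a flat, row-major list of cells into n_cols column lists."""
    return [cells[offset + k::n_cols] for k in range(n_cols)]


# Simulate GBK bytes being decoded with the wrong codec, then undo it.
original = '浦发银行'
garbled = original.encode('gbk').decode('ISO-8859-1')
recovered = fix_mojibake(garbled)

# Two header cells followed by two rows of three cells each.
flat = ['h1', 'h2', '1', '10', '100', '2', '20', '200']
cols = split_columns(flat, n_cols=3, offset=2)

print(recovered)
print(cols)
```

If the page's encoding is known in advance, setting `resp.encoding = 'gbk'` before reading `resp.text` should make the per-string round trip unnecessary, though that is a different approach from the one used in the script.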

 

 

# Reading the CSV with pandas

import pandas as pd
data = pd.read_csv('demo.csv')
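
One thing to watch for when reading stock data back: by default `read_csv` infers numeric types, so a code such as 000001 loses its leading zeros. A minimal sketch, assuming a small in-memory sample in place of demo.csv (the sample values are invented for illustration):

```python
import io

import pandas as pd

# Hypothetical two-row sample with the same kind of columns as demo.csv.
sample = "股票代码,股票名称,余额\n000001,平安银行,100\n600000,浦发银行,200\n"

# Default type inference parses the code column as integers.
default = pd.read_csv(io.StringIO(sample))

# Forcing the column to str keeps the codes exactly as written.
as_text = pd.read_csv(io.StringIO(sample), dtype={'股票代码': str})

print(default['股票代码'].tolist())
print(as_text['股票代码'].tolist())
```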

 

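# The csv module can also be used
import csv

# Python 3: open in text mode with newline="".
# (Python 2 used "wb" here and also allowed file() in place of open().)
with open("tdemo.csv", "w", newline="", encoding="utf-8") as csvfile:
    writer = csv.writer(csvfile)

    # Header row: 序号 (No.), 姓名 (name), 性别 (gender)
    writer.writerow(["序号","姓名","性别"])

    writer.writerows([[0,'张三',''],[1,'李四',''],[2,'王五','']])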

 

# Reading a CSV file with csv.reader
import csv
with open("tdemo.csv", "r", newline="", encoding="utf-8") as csvfile:
    reader = csv.reader(csvfile)
    for line in reader:
        print(line)
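
When rows are easier to handle by column name than by position, the csv module's `DictWriter` and `DictReader` may be a better fit. The sketch below is an illustration only: it round-trips two of the sample rows through an in-memory buffer instead of tdemo.csv, and the variable names are made up here.

```python
import csv
import io

# Same kind of rows as the writer example, keyed by column name.
rows = [
    {'序号': 0, '姓名': '张三', '性别': ''},
    {'序号': 1, '姓名': '李四', '性别': ''},
]

# An in-memory text buffer stands in for the file on disk.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['序号', '姓名', '性别'])
writer.writeheader()
writer.writerows(rows)

# Rewind and read the rows back as dictionaries.
buffer.seek(0)
records = list(csv.DictReader(buffer))
print(records)
```

Note that everything comes back as a string, so the row number 0 is read as '0'; convert explicitly if numeric values are needed.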

 

Reposted from: https://www.cnblogs.com/zhouzhishuai/p/8245546.html
