Working with Elasticsearch from Python (elasticsearch-py)

This example shows how to page through the documents of one Elasticsearch index and bulk-write them into another Elasticsearch instance, with an optional path for publishing the documents to a Kafka topic via pykafka. It mainly covers the elasticsearch-py search and bulk helper APIs and the pykafka producer.
# -*- coding: utf-8 -*-

#http://www.cnblogs.com/letong/p/4749234.html
#http://elasticsearch-py.readthedocs.io/en/master/api.html#elasticsearch
#http://blog.youkuaiyun.com/xiaoxinwenziyao/article/details/49471977
#https://github.com/Parsely/pykafka

from datetime import datetime
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

# Kafka path: push the documents through Kafka for the regular (production) pipeline
from pykafka import KafkaClient
topicName = ""   # Kafka topic to publish to (left blank in the original post)
kafkaHosts = ""  # comma-separated broker list, e.g. "host1:9092,host2:9092"
client = KafkaClient(hosts=kafkaHosts)
topic = client.topics[topicName]
producer = topic.get_producer()
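
# Note (not from the original post): get_producer() returns pykafka's
# asynchronous producer, which buffers messages in a background thread.
# Flush and release it before the script exits, e.g.:
#
#     producer.stop()
#
# otherwise messages still sitting in the buffer can be lost.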

# Source and target clusters (hosts left blank in the original post).
# Note: timeouts in elasticsearch-py are given in seconds, so "timeout":15000
# means 15000 seconds and was probably intended to be 15.
es = Elasticsearch([{"host":"","port":9200,"timeout":15000}])
es_Test = Elasticsearch([{"host":"","port":9200}])
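
# A quick connectivity check before the copy starts (a sketch, not part of the
# original post):
#
#     print(es.info())
#     print(es_Test.info())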

# Kafka message format: {"method":"save","data":[{},{}]}
# def save4Kafka(result):
# 	DATAS=[]
# 	for rdata in result["hits"]["hits"]:
# 		source = rdata["_source"]
# 		DATAS.append(source)

# 	producer.produce(json.dumps({"method":"save","data":DATAS}))  # requires: import json
#
def save4ES(result):
	# Convert each hit of the search response into a bulk index action
	# and write the whole page to the target cluster in one bulk call.
	ACTIONS = []
	for rdata in result["hits"]["hits"]:
		source = rdata["_source"]
		action = {
			"_index": indexName,
			"_type": typeName,
			"_source": source
		}
		ACTIONS.append(action)

	# bulk() returns a (success_count, errors) tuple
	success = bulk(es_Test, ACTIONS, index=indexName, raise_on_error=True)

	# Progress output (Python 2 print statement): bulk result, current page, total pages
	print success, x, page
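
# An alternative sketch, not part of the original post: helpers.streaming_bulk
# yields an (ok, item) pair per action, which makes it easier to log individual
# failures when copying large indices. The function name below is hypothetical
# and it is never called in this script.
from elasticsearch.helpers import streaming_bulk

def save4ES_streaming(result):
	actions = []
	for hit in result["hits"]["hits"]:
		actions.append({"_index": indexName, "_type": typeName, "_source": hit["_source"]})
	for ok, item in streaming_bulk(es_Test, actions, raise_on_error=False):
		if not ok:
			print(item)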

indexName = ""  # source index to copy from (also used as the target index name)
typeName = ""   # document type

# Total number of documents in the source index
count = es.count(index=indexName)["count"]

# Documents per page
pageLine = 1000

# Number of pages (integer division; round up when there is a remainder)
page = count/pageLine if (count%pageLine) == 0 else count/pageLine+1
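
# Note (not from the original post): the line above relies on Python 2's
# integer division. A version-independent ceiling division would be:
#
#     page = -(-count // pageLine)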

# Fetch the data page by page using from/size pagination.
# The range starts at page 7233, apparently resuming an earlier run.
for x in xrange(7233,page):
	result = es.search(index=indexName,from_=x*pageLine,size=pageLine)
	# save4Kafka(result)
	save4ES(result)
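
# Note (not from the original post): depending on the Elasticsearch version,
# from/size pagination this deep may hit the index.max_result_window limit
# (10000 by default) unless that setting has been raised on the source index.
# The scroll-based scan helper avoids the limit. The function below is a
# hedged sketch (hypothetical name) and is not called here.
from elasticsearch.helpers import scan

def copy_with_scan():
	buf = []
	for hit in scan(es, index=indexName, query={"query": {"match_all": {}}}, size=pageLine):
		buf.append({"_index": indexName, "_type": typeName, "_source": hit["_source"]})
		if len(buf) >= pageLine:
			bulk(es_Test, buf, raise_on_error=True)
			buf = []
	if buf:
		bulk(es_Test, buf, raise_on_error=True)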