First, you need ZooKeeper and Kafka 0.10 clusters running on your servers or virtual machines. For detailed installation steps, see these posts:
ZooKeeper installation: https://blog.youkuaiyun.com/weixin_43866709/article/details/88416431
Kafka 0.10 installation: https://blog.youkuaiyun.com/weixin_43866709/article/details/89241045
Integrating Nginx with Kafka means connecting the Nginx server directly to Kafka, so that Nginx sends its data straight to Kafka. This removes Flume from the path and improves efficiency.
To connect Nginx directly to Kafka, you need to install an nginx-kafka plugin.
Installing the nginx-kafka plugin
In my setup, Kafka is installed on the three machines L3, L4, and L5, and Nginx is installed on L1.
1. Install git
yum install -y git
2. Switch to the /usr/local/src directory and clone the Kafka C client source code:
cd /usr/local/src
git clone https://github.com/edenhill/librdkafka
3. Enter librdkafka and compile it:
cd librdkafka
yum install -y gcc gcc-c++ pcre-devel zlib-devel
./configure
make && make install
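make install places librdkafka under /usr/local/lib by default. As a quick sanity check (this path is the library's default install location, not a step from the original write-up), you can verify the shared objects are there:
ls /usr/local/lib/librdkafka*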
4. Install the plugin that integrates nginx with kafka. Go to /usr/local/src and clone the plugin source code:
cd /usr/local/src
git clone https://github.com/brg-liuwei/ngx_kafka_module
5. Go to the nginx source directory (this compiles nginx and the plugin together):
cd /usr/local/src/nginx-1.12.2
./configure --add-module=/usr/local/src/ngx_kafka_module/
make
make install
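To confirm the module was actually compiled in, nginx -V prints the configure arguments, which should include the --add-module path used above. Note that this check may fail with the same missing-library error described in step 8 below until the shared library is registered in step 9:
/usr/local/nginx/sbin/nginx -V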
Cloning from GitHub with git can be very slow; for a workaround, see this post:
Fixing slow git clone: https://blog.youkuaiyun.com/weixin_43866709/article/details/89434546
6. Modify the nginx configuration file
cd /usr/local/nginx/conf
sudo vim nginx.conf
The modified configuration is as follows:
#user  nobody;
worker_processes  1;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;

#pid        logs/nginx.pid;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    #log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
    #                  '$status $body_bytes_sent "$http_referer" '
    #                  '"$http_user_agent" "$http_x_forwarded_for"';

    #access_log  logs/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    #keepalive_timeout  0;
    keepalive_timeout  65;

    #gzip  on;

    kafka;
    kafka_broker_list L3:9092 L4:9092 L5:9092;

    server {
        listen       80;
        server_name  L1;

        #charset koi8-r;

        #access_log  logs/host.access.log  main;

        location = /kafka/track {
            kafka_topic track;
        }

        location = /kafka/user {
            kafka_topic user;
        }

        #error_page  404              /404.html;

        # redirect server error pages to the static page /50x.html
        #
        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
    }
}
An explanation of the changed settings:
1. Under http, add:
kafka;
kafka_broker_list L3:9092 L4:9092 L5:9092; // followed by the hostnames of the machines in the Kafka cluster.
2. Under server, add server_name L1; // followed by the hostname of the machine running nginx.
3. Under server, add:
location = /kafka/track {
    kafka_topic track;
}
location = /kafka/user {
    kafka_topic user;
}
This specifies which Kafka topics the incoming data is written to.
Note: for every place above where a hostname is filled in, you must first add the hostname-to-IP mapping to the /etc/hosts file.
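For example, /etc/hosts on each machine would contain entries along these lines (the IP addresses below are placeholders for illustration; substitute your own):
192.168.1.101 L1
192.168.1.103 L3
192.168.1.104 L4
192.168.1.105 L5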
7. Start the zk and kafka clusters (and create the topics)
Start ZooKeeper: zookeeper-3.4.9/bin/zkServer.sh start
Start Kafka on each machine: /bigdata/kafka_2.11-0.10.2.1/bin/kafka-server-start.sh -daemon /bigdata/kafka_2.11-0.10.2.1/config/server.properties
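The two topics referenced in nginx.conf also need to exist. With Kafka 0.10 they can be created with kafka-topics.sh; a minimal sketch, assuming ZooKeeper listens on its default port 2181 on L3, L4, and L5 (adjust --partitions and --replication-factor to your cluster):
/bigdata/kafka_2.11-0.10.2.1/bin/kafka-topics.sh --create --zookeeper L3:2181,L4:2181,L5:2181 --replication-factor 3 --partitions 3 --topic track
/bigdata/kafka_2.11-0.10.2.1/bin/kafka-topics.sh --create --zookeeper L3:2181,L4:2181,L5:2181 --replication-factor 3 --partitions 3 --topic user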
8. Start nginx; it reports an error that the file librdkafka.so.1 cannot be found
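Since nginx was installed to the default prefix /usr/local/nginx, starting it looks like:
/usr/local/nginx/sbin/nginx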
error while loading shared libraries: librdkafka.so.1: cannot open shared object file: No such file or directory
9. Load the shared library
echo "/usr/local/lib" >> /etc/ld.so.conf
ldconfig
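After this, nginx should start normally. You can confirm the dynamic linker now sees the library by searching its cache:
ldconfig -p | grep librdkafka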
10. Test: write data to nginx, then check whether a Kafka consumer can consume it
curl localhost/kafka/track -d "message send to kafka topic"
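To watch the messages arrive, run the console consumer that ships with Kafka 0.10 on one of the Kafka machines (the --zookeeper form is the old-consumer syntax still valid in 0.10; --from-beginning replays earlier messages):
/bigdata/kafka_2.11-0.10.2.1/bin/kafka-console-consumer.sh --zookeeper L3:2181 --topic track --from-beginning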