Quick Install
A few commands are all it takes:
wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/tar/elasticsearch/2.3.3/elasticsearch-2.3.3.tar.gz
tar zxvf elasticsearch-2.3.3.tar.gz
cd elasticsearch-2.3.3
bin/elasticsearch
This starts a simple Elasticsearch cluster (a single node):
[admin@iZeb3hq3afvi09ubvk0e7bZ bin]$ ./elasticsearch
[2017-01-11 10:00:26,975][INFO ][node ] [Jade Dragon] version[2.3.3], pid[1258], build[218bdf1/2016-05-17T15:40:04Z]
[2017-01-11 10:00:26,976][INFO ][node ] [Jade Dragon] initializing ...
[2017-01-11 10:00:27,469][INFO ][plugins ] [Jade Dragon] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
[2017-01-11 10:00:27,491][INFO ][env ] [Jade Dragon] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [72.2gb], net total_space [78.6gb], spins? [unknown], types [rootfs]
[2017-01-11 10:00:27,491][INFO ][env ] [Jade Dragon] heap size [1007.3mb], compressed ordinary object pointers [true]
[2017-01-11 10:00:27,492][WARN ][env ] [Jade Dragon] max file descriptors [65535] for elasticsearch process likely too low, consider increasing to at least [65536]
[2017-01-11 10:00:29,288][INFO ][node ] [Jade Dragon] initialized
[2017-01-11 10:00:29,289][INFO ][node ] [Jade Dragon] starting ...
[2017-01-11 10:00:29,412][INFO ][transport ] [Jade Dragon] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}
[2017-01-11 10:00:29,417][INFO ][discovery ] [Jade Dragon] elasticsearch/T_uerJ6OT8mHM763GklX4A
[2017-01-11 10:00:32,471][INFO ][cluster.service ] [Jade Dragon] new_master {Jade Dragon}{T_uerJ6OT8mHM763GklX4A}{127.0.0.1}{127.0.0.1:9300}, reason: zen-disco-join(elected_as_master, [0] joins received)
[2017-01-11 10:00:32,535][INFO ][http ] [Jade Dragon] publish_address {127.0.0.1:9200}, bound_addresses {127.0.0.1:9200}
[2017-01-11 10:00:32,536][INFO ][node ] [Jade Dragon] started
[2017-01-11 10:00:32,547][INFO ][gateway ] [Jade Dragon] recovered [0] indices into cluster_state
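Once the log shows `started`, it can be handy to poll the HTTP port before scripting anything against the node. A small sketch, assuming the default 127.0.0.1:9200 binding shown in the log (the `wait_for_es` helper name is made up for this example):

```shell
# Poll the HTTP port until the node answers. Not part of the
# Elasticsearch distribution; defaults match the startup log above.
wait_for_es() {
  local url="${1:-http://127.0.0.1:9200}" tries="${2:-5}"
  local i=1
  while [ "$i" -le "$tries" ]; do
    if curl -s -o /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Once the node is up, for example:
#   wait_for_es && curl -s 'http://127.0.0.1:9200/_cluster/health?pretty'
```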
Production Configuration
config/elasticsearch.yml
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please see the documentation for further information on configuration options:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/setup-configuration.html>
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
# cluster.name: my-application
# Cluster name
cluster:
  name: es-quvideo
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
# node.name: node-1
# Node name; just use the HOSTNAME here
node:
  name: ${HOSTNAME}
# Add custom attributes to the node:
#
# node.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
# path.data: /path/to/data
# Put this on an SSD
path:
  data: /mnt/elasticsearch/data
  logs: /mnt/elasticsearch/logs
# Path to log files:
#
# path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
# bootstrap.mlockall: true
# Avoid swapping. Even if system swap is disabled, setting bootstrap.mlockall to true is still recommended
bootstrap.mlockall: true
# Make sure that the `ES_HEAP_SIZE` environment variable is set to about half the memory
# available on the system and that the owner of the process is allowed to use this limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
# network.host: 192.168.0.1
# Bind address: bind to the internal network only, never to a public interface
network.host: [_eth0_,_local_]
# Set a custom port for HTTP:
#
# http.port: 9200
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-network.html>
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# Not needed for now (single node)
# discovery.zen.ping.unicast.hosts: ["host1", "host2"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
#
# discovery.zen.minimum_master_nodes: 3
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-discovery.html>
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
# gateway.recover_after_nodes: 3
#
# For more information, see the documentation at:
# <http://www.elastic.co/guide/en/elasticsearch/reference/current/modules-gateway.html>
#
# ---------------------------------- Various -----------------------------------
#
# Disable starting multiple nodes on a single system:
#
# node.max_local_storage_nodes: 1
#
# Require explicit names when deleting indices:
#
# action.destructive_requires_name: true
# action.auto_create_index: false
#------------------------------ Large import -----------------------------------
# For a large one-off import; remove these settings once the job finishes.
# Use when you don't need real-time accuracy on search results:
# index.refresh_interval: -1
# With replicas enabled, each document is also sent to the replica nodes and indexed again there.
# If you instead enable replicas after the import finishes, the recovery process is a plain copy:
# a byte-for-byte network transfer, which is much cheaper than re-indexing on the replica.
index.number_of_replicas: 1
# If you are using spinning media instead of SSD
# index.merge.scheduler.max_thread_count: 1
# If you are doing a bulk import and don’t care about search at all, you can disable merge throttling
# entirely. This will allow indexing to run as fast as your disks will allow
# indices.store.throttle.type: "none"
# This allows larger segments to accumulate in the translog before a flush occurs. By letting larger
# segments build, you flush less often, and the larger segments merge less often. All of this adds up
# to less disk I/O overhead and better indexing rates. Of course, you will need the corresponding amount
# of heap memory free to accumulate the extra buffering space, so keep that in mind when adjusting this setting
index.translog.flush_threshold_size: 1g
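The split-brain formula quoted in the Discovery comments above (total master-eligible nodes / 2 + 1) is simple integer arithmetic; a quick sketch (the `quorum` helper name is made up for this example):

```shell
# Majority quorum for discovery.zen.minimum_master_nodes:
# floor(total master-eligible nodes / 2) + 1
quorum() {
  echo $(( $1 / 2 + 1 ))
}

quorum 3   # -> 2: a 3-node cluster tolerates losing one master-eligible node
```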
The data and log directories configured above must be created manually:
mkdir -p /mnt/elasticsearch/data
mkdir /mnt/elasticsearch/logs
sudo chown -R admin:admin /mnt/elasticsearch/
JVM Settings
Configure ES_HEAP_SIZE in bin/elasticsearch (or the environment); about half of the machine's total memory works best:
export ES_HEAP_SIZE=8g
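The half-of-memory rule can be sketched as a quick calculation. The helper name and the 31744 MB cap are this sketch's own choices; the cap keeps the heap below ~32 GB so the JVM retains compressed ordinary object pointers (visible in the startup log above):

```shell
# Suggest an ES_HEAP_SIZE value in MB: half of total RAM, capped
# just below 32 GB to keep compressed oops enabled.
suggest_heap_mb() {
  local total_mb="$1"
  local half=$(( total_mb / 2 ))
  local cap=31744   # just below 32 GB, in MB
  if [ "$half" -gt "$cap" ]; then
    echo "$cap"
  else
    echo "$half"
  fi
}

suggest_heap_mb 8192    # 8 GB machine -> 4096, i.e. ES_HEAP_SIZE=4g
```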
System Configuration
Hardware
Memory
Elasticsearch's memory consumption falls into two areas:
the JVM heap: sorting and aggregations are memory-hungry
Lucene: it relies on plenty of free memory for the filesystem cache
Split it roughly half and half, so on an 8 GB machine an ES_HEAP_SIZE of 4g is fine.
CPU
Two cores are enough for now; if CPU becomes a bottleneck, more cores beat faster cores.
Disk
Use SSDs.
Operating System
Disable swap:
sudo swapoff -a
Or disable it via vm.swappiness.
File Descriptors
Lucene uses a large number of files, and Elasticsearch also needs many sockets to talk to other nodes and to HTTP clients.
A limit of 32k or 64k is recommended.
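On Linux the limit is typically raised in /etc/security/limits.conf; a sketch, assuming the process runs as the admin user used elsewhere in these notes (the memlock lines matter only if bootstrap.mlockall is enabled as above):

```
# /etc/security/limits.conf
admin  soft  nofile   65536
admin  hard  nofile   65536
# allow bootstrap.mlockall to lock the heap
admin  soft  memlock  unlimited
admin  hard  memlock  unlimited
```

Log out and back in (or restart the session) for the new limits to take effect.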
Virtual Memory
The default mmap count limit is too low for Elasticsearch and needs to be raised:
sysctl -w vm.max_map_count=262144
Or make it permanent in /etc/sysctl.conf.
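The persistent equivalent in /etc/sysctl.conf would look like this; the vm.swappiness value follows the swap advice above and is a suggestion, not a requirement:

```
# /etc/sysctl.conf
vm.max_map_count = 262144
vm.swappiness = 1
```

Apply the file without rebooting via `sudo sysctl -p`.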
Running in the Background
Start in the background and write the process ID to a pid file:
bin/elasticsearch -d -p pid
To shut down safely, never use kill -9:
kill `cat pid`
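The safe shutdown can be wrapped in a small helper that sends SIGTERM and waits for the process to exit. A sketch: the function name and the 30-second timeout are this example's own choices, and the default pid-file path matches the `-p pid` flag above:

```shell
# Stop the process recorded in a pid file: SIGTERM, then wait for exit.
# Never escalate to -9; SIGKILL skips Elasticsearch's clean shutdown.
stop_es() {
  local pidfile="${1:-pid}"
  local pid
  pid=$(cat "$pidfile") || return 1
  kill "$pid" 2>/dev/null || return 0      # already stopped
  for _ in $(seq 1 30); do                 # wait up to 30 s
    kill -0 "$pid" 2>/dev/null || return 0
    sleep 1
  done
  echo "process $pid still running after 30s" >&2
  return 1
}
```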
Installing the IK Analysis Plugin
wget https://github.com/medcl/elasticsearch-analysis-ik/archive/master.zip
unzip master.zip
cd elasticsearch-analysis-ik-master
yum install maven -y
mvn package
mkdir -p /home/admin/elasticsearch-2.3.3/plugins/ik
cp target/releases/elasticsearch-analysis-ik-1.9.3.zip /home/admin/elasticsearch-2.3.3/plugins/ik/
cd /home/admin/elasticsearch-2.3.3/plugins/ik/
unzip elasticsearch-analysis-ik-1.9.3.zip
rm elasticsearch-analysis-ik-1.9.3.zip
cd /home/admin/
chown -R admin:admin /home/admin/
Installing Logstash
wget https://download.elastic.co/logstash/logstash/logstash-2.3.3.tar.gz
tar zxvf logstash-2.3.3.tar.gz
chown -R admin:admin /home/admin/
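To confirm the install, a minimal pipeline config could look like the following; a sketch assuming the single node above is listening on 127.0.0.1:9200 (the file name stdin-es.conf is made up):

```
# stdin-es.conf: read lines from stdin, index them into Elasticsearch,
# and echo each event to the console for inspection.
input {
  stdin { }
}
output {
  elasticsearch {
    hosts => ["127.0.0.1:9200"]
  }
  stdout { codec => rubydebug }
}
```

Run it with `bin/logstash -f stdin-es.conf`, type a line, and check that a `logstash-*` index appears on the node.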