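1. A Few Words Up Front
Recently at work I have been looking into building a one-click installer for a multi-node cluster (at least four nodes) of Greenplum, the distributed MPP database. While browsing the official Greenplum site I came across its Ansible-based documentation, so I dug into Ansible and ansible-playbook and ended up with a working one-click Greenplum deployment package.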
VMware Greenplum Documentation
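Next, a brief introduction to Ansible. If you only want to try Greenplum on a single machine, see this article: "A Roundup of One-Click Installation Methods for the Greenplum Distributed Database" (CSDN blog)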
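2. Ansible Overview
Ansible is an open-source configuration management tool that can automate tasks, deploy applications, and provision IT infrastructure. It is well suited to routine work such as initial server setup, security baseline configuration, system updates and patching, and package installation. Its architecture is relatively simple: it only needs an SSH connection to the managed hosts in order to run tasks.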
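An Ansible project usually follows a conventional directory layout; the playbooks in this article, for example, reference files under `vars/`, `gpnodes/`, and `template/`.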
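Its advantages can be summarized as follows:
- No agent required
Unlike Chef, Puppet, and SaltStack (which now also offers an agentless mode, salt-ssh), Ansible has no client-side agent, so tasks can run without installing or configuring anything on the managed hosts. Because Ansible installs no software and runs no listener on those hosts, much of the management overhead disappears: we can start managing servers with Ansible right away, and upgrading Ansible itself does not affect any managed host.
- Communication over SSH
By default, Ansible uses the SSH protocol to communicate between the control machine and the managed hosts. SFTP can be used for secure file transfer to the managed hosts.
- Parallel execution
Ansible talks to managed hosts in parallel, so automation tasks finish faster. The default value of forks is 5, and it can be raised in the configuration file as needed.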
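To make the agentless, SSH-based model concrete, here is a minimal sketch of an inventory in Ansible's YAML format for a four-node cluster. The file name, the group names, the host names other than `smdw`, and all IP addresses are assumptions made for illustration. The per-host `hostname` variable is my guess at where the `{{ hostname }}` value used by the playbooks later in this article comes from.

```yaml
# inventory.yml (hypothetical file name; addresses and groups are placeholders)
all:
  vars:
    ansible_user: root          # log in over SSH as root, as the playbooks do
    ansible_connection: ssh     # SSH transport, stated explicitly for clarity
  children:
    master:
      hosts:
        mdw:
          ansible_host: 192.168.10.11
          hostname: mdw         # assumed source of the playbooks' hostname variable
    standby:
      hosts:
        smdw:
          ansible_host: 192.168.10.12
          hostname: smdw
    segments:
      hosts:
        sdw1:
          ansible_host: 192.168.10.13
          hostname: sdw1
        sdw2:
          ansible_host: 192.168.10.14
          hostname: sdw2
```

Nothing has to be installed on the four hosts beyond an SSH server and Python. To run against more than five hosts at once, the fork count can be raised per run with `ansible-playbook -i inventory.yml -f 10 <playbook>`, or permanently with `forks = 10` under `[defaults]` in `ansible.cfg`.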
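3. Building a Greenplum Distributed Deployment Package with Ansible
Working through the official Greenplum installation guide, I sorted out the steps that have to be performed on each machine. Documentation: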
VMware Greenplum Documentation
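Most of the steps turn out to be essentially identical on every machine; only a few have to run on the master node. Roughly:
- [All nodes] Environment configuration on every Greenplum host
- [All nodes] Creating the Greenplum OS account
- [All nodes] Installing the Greenplum database RPM package
- [All nodes] Configuring the Greenplum data directories
- [All nodes] Setting up passwordless login between accounts
- [Master node] Initializing and starting the Greenplum database
Step 1: installation and configuration on all nodes: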
#!/usr/bin/env ansible-playbook
---
- hosts: all
  vars_files:
    - vars/gpdb.yml
  remote_user: root
  become: yes
  become_method: sudo
  connection: ssh
  gather_facts: yes
  tasks:
    - name: 01. stop and disable firewall service
      shell: '{{ item }}'
      with_items:
        - 'systemctl unmask firewalld'
        - 'systemctl start firewalld.service'
        - 'systemctl stop firewalld.service'
        - 'systemctl disable firewalld.service'
    - name: 02. be sure expect is installed
      yum: name=expect state=installed
    - name: 03. close selinux temporary
      shell: setenforce 0
      failed_when: false
    - name: 04. close selinux forever
      when: ansible_distribution == "CentOS" and ansible_distribution_major_version == "7"
      lineinfile:
        dest: /etc/selinux/config
        regexp: '^SELINUX='
        line: 'SELINUX=disabled'
    - name: 05. be sure ntp is installed
      yum: name=ntp state=installed
      tags: ntp
    - name: 06. configure sync time using aliyun server
      when: ansible_distribution == "CentOS" and ansible_distribution_major_version == "7"
      cron: name="sync time" minute='*/5' hour=* day=* month=* weekday=* job="/usr/sbin/ntpdate -u ntp1.aliyun.com >/dev/null 2>&1"
      ignore_errors: true
    - name: 07. update configure file for etc-hosts
      copy: src=gpnodes/hosts dest=/etc/hosts
    - name: 08. change host name to etc-hostname
      raw: 'echo {{hostname|quote}} > /etc/hostname'
    - name: 09. change host name by command hostname
      shell: hostname {{hostname|quote}}
    - name: 10. create greenplum admin user
      user:
        name: '{{ greenplum_admin_user }}'
        password: "{{ greenplum_admin_password | password_hash('sha512') }}"
    - name: 11. copy greenplum rpm package to host
      copy:
        src: '{{ package_path }}'
        dest: /tmp
    - name: 12. backing up sysctl
      copy:
        src: /etc/sysctl.conf
        remote_src: yes
        dest: /tmp/sysctl.conf.bak
        backup: yes
    - name: 13. get shmall
      shell: echo $(expr $(getconf _PHYS_PAGES) / 2)
      register: shmall
    - name: 14. get shmmax
      shell: echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
      register: shmmax
    - name: 15. get min_free_kbytes
      shell: awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print $2 * .03;}' /proc/meminfo
      register: min_free_kbytes
    - name: 16. set shmall
      sysctl:
        name: kernel.shmall
        value: '{{ shmall.stdout }}'
        reload: yes
    - name: 17. set shmmax
      sysctl:
        name: kernel.shmmax
        value: '{{ shmmax.stdout }}'
        reload: yes
    - name: 18. set min_free_kbytes
      sysctl:
        name: vm.min_free_kbytes
        value: '{{ min_free_kbytes.stdout }}'
        reload: yes
    - name: 19. configure sysctl
      sysctl:
        name: '{{ item.key }}'
        value: '{{ item.value }}'
        sysctl_set: yes
        state: present
        reload: yes
        ignoreerrors: yes
      with_dict:
        kernel.shmmni: 4096
        vm.overcommit_memory: 2
        vm.overcommit_ratio: 95
        net.ipv4.ip_local_port_range: 10000 65535
        kernel.sem: 500 2048000 200 40960
        kernel.sysrq: 1
        kernel.core_uses_pid: 1
        kernel.msgmnb: 65536
        kernel.msgmax: 65536
        kernel.msgmni: 2048
        net.ipv4.tcp_syncookies: 1
        net.ipv4.conf.default.accept_source_route: 0
        net.ipv4.tcp_max_syn_backlog: 4096
        net.ipv4.conf.all.arp_filter: 1
        net.core.netdev_max_backlog: 10000
        net.core.rmem_max: 2097152
        net.core.wmem_max: 2097152
        vm.swappiness: 0
        vm.zone_reclaim_mode: 0
        vm.dirty_expire_centisecs: 10
        vm.dirty_writeback_centisecs: 3
        vm.dirty_background_ratio: 10
        vm.dirty_ratio: 20
        vm.dirty_background_bytes: 0
        vm.dirty_bytes: 0
    - name: 20. state PAM limits
      pam_limits:
        domain: '*'
        limit_type: '-'
        limit_item: '{{ item.key }}'
        value: '{{ item.value }}'
      with_dict:
        nofile: 655360
        nproc: 655360
        memlock: unlimited
        core: unlimited
    - name: 21. install package
      yum:
        name: '/tmp/{{ package_path | basename }}'
        # installroot: '{{ greenplum_install_directory }}'
        state: present
    - name: 22. cleanup package file from host
      file:
        path: '/tmp/{{ package_path | basename }}'
        state: absent
    - name: 23. find install directory
      find:
        paths: '{{ greenplum_install_directory }}'
        patterns: 'greenplum-db*'
        file_type: directory
      register: installed_dir
    - name: 24. change install directory ownership
      file:
        path: '{{ item.path }}'
        owner: '{{ greenplum_admin_user }}'
        group: '{{ greenplum_admin_user }}'
        recurse: yes
      with_items: '{{ installed_dir.files }}'
    - name: 25. update pam_limits
      pam_limits:
        domain: '{{ greenplum_admin_user }}'
        limit_type: '-'
        limit_item: '{{ item.key }}'
        value: '{{ item.value }}'
      with_dict:
        nofile: 524288
        nproc: 131072
    - name: 26. find installed greenplum version
      shell: . '{{ greenplum_install_directory }}'/greenplum-db/greenplum_path.sh && '{{ greenplum_install_directory }}'/greenplum-db/bin/postgres --gp-version
      register: postgres_gp_version
    - name: 27. fail if the correct greenplum version is not installed
      fail:
        msg: "Expected greenplum version {{ version }}, but found '{{ postgres_gp_version.stdout }}'"
      when: "version is not defined or version not in postgres_gp_version.stdout"
    - name: 28. Create data directory if it does not exist
      file:
        path: '{{ item }}'
        state: directory
        mode: '0755'
      with_items:
        - '{{ greenplum_data_directory }}/'
        - '{{ greenplum_data_directory }}/master/'
        - '{{ greenplum_data_directory }}/primary/'
        - '{{ greenplum_data_directory }}/mirror/'
    - name: 29. copy greenplum node temporary files
      copy:
        src: '{{ item }}'
        dest: '/home/{{ greenplum_admin_user }}/'
        remote_src: no
      with_items:
        - gpnodes/all_hosts
        - gpnodes/all_ips
        - gpnodes/master_hosts
        - gpnodes/standby_hosts
        - gpnodes/segment_hosts
    - name: 30. change data directory ownership
      file:
        path: '{{ greenplum_data_directory }}'
        owner: '{{ greenplum_admin_user }}'
        group: '{{ greenplum_admin_user }}'
        recurse: yes
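The playbook above loads `vars/gpdb.yml`, whose contents are not shown in this article. Going by the variable names the tasks reference, it could look roughly like the sketch below. Every value here is an assumption for illustration: the package file name and version number are placeholders, `/usr/local` is assumed because the Greenplum RPM normally installs there with a `greenplum-db` symlink, and the data directory is inferred from the `pg_hba.conf` path used in step 2.

```yaml
# vars/gpdb.yml: illustrative values only, not the project's actual file
greenplum_admin_user: gpadmin
greenplum_admin_password: changeme      # placeholder; better kept in ansible-vault
package_path: files/greenplum-db-6.20.0-rhel7-x86_64.rpm   # hypothetical local path
greenplum_install_directory: /usr/local # assumed RPM install prefix
greenplum_data_directory: /home/gpadmin/data   # inferred from the pg_hba.conf path
version: '6.20.0'                       # quoted so YAML keeps it a string
```

Task 27 checks `version` as a substring of the `postgres --gp-version` output, so the value should match the version string of whatever RPM `package_path` points to.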
Step 2: configuration and database initialization on the master node:
#!/usr/bin/env ansible-playbook
---
- hosts: all
  vars_files:
    - vars/gpdb.yml
  remote_user: root
  become: yes
  become_method: sudo
  connection: ssh
  gather_facts: yes
  tasks:
    - name: 31. copy files for initialize greenplum master
      copy:
        src: '{{ item }}'
        dest: '/home/{{ greenplum_admin_user }}/'
        remote_src: no
      with_items:
        - gpnodes/gpadmin_hosts
        - template/gpadmin_auto_ssh.sh
        - template/initdb_gpdb.sql
    - name: 32. replace greenplum admin user environment bash file
      template: src=template/gpadmin_bashrc.j2 dest=/home/{{ greenplum_admin_user }}/.bashrc
    - name: 33. copy and configure gpinitsystem config file
      template: src=template/gpinitsystem_config.j2 dest=/home/{{ greenplum_admin_user }}/gpinitsystem_config
    - name: 34. change home directory ownership
      file:
        path: '/home/{{ greenplum_admin_user }}/'
        owner: '{{ greenplum_admin_user }}'
        group: '{{ greenplum_admin_user }}'
        recurse: yes
    - name: 35. configure greenplum admin user auto login
      command: sh /home/{{ greenplum_admin_user }}/gpadmin_auto_ssh.sh /home/{{ greenplum_admin_user }}/gpadmin_hosts
      become: yes
      become_user: '{{ greenplum_admin_user }}'
    - name: 36. initialize greenplum master database
      shell: '{{ item }}'
      become: yes
      become_method: su
      become_flags: '-'
      become_user: '{{ greenplum_admin_user }}'
      with_items:
        - "gpinitsystem -a -c /home/{{ greenplum_admin_user }}/gpinitsystem_config -h /home/{{ greenplum_admin_user }}/segment_hosts -s smdw"
        - "psql -d postgres -U gpadmin -f /home/{{ greenplum_admin_user }}/initdb_gpdb.sql"
        - "echo \"host all all 0.0.0.0/0 password\" >> /home/{{ greenplum_admin_user }}/data/master/gpseg-1/pg_hba.conf"
        - "gpstop -u"
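After initialization it can be handy to confirm that the cluster actually came up. The following is a rough sketch of a follow-up play, not part of the original installer. It assumes an inventory group named `master` (hypothetical) that contains only the master host, and it reuses the same `su -` settings as task 36 so that the admin user's login environment is loaded.

```yaml
---
- hosts: master                 # hypothetical group holding only the master host
  vars_files:
    - vars/gpdb.yml
  remote_user: root
  tasks:
    - name: show detailed cluster status
      shell: gpstate -s         # Greenplum utility; -s prints detailed status
      become: yes
      become_method: su
      become_flags: '-'
      become_user: '{{ greenplum_admin_user }}'
      register: gp_state
      changed_when: false       # read-only check, never reports a change

    - name: print status output
      debug:
        var: gp_state.stdout_lines
```

If `gpstate` exits non-zero the play fails at the first task, which gives a simple pass/fail signal for the whole deployment.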
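Once these two steps are complete, a distributed Greenplum MPP database is up and running. With only a few small auxiliary shell scripts, this automates work that would take roughly half a day by hand.
The complete source code of the installer project is available here:
Project: greenplum_installer: a one-click installer for the Greenplum 6 distributed database on CentOS 7
Installation docs: https://gitee.com/inrgihc/greenplum_installer
4. Summary
There is a lot more to Ansible, and so far I have only picked up the most basic usage. Further reading: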
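If the two playbooks are saved as separate files, one way to run them back to back is a small wrapper playbook that uses `import_playbook`. This is only a sketch: all three file names are hypothetical, and the actual project may chain the steps differently (for example from a shell script).

```yaml
---
# site.yml (hypothetical name): runs both steps in order
- import_playbook: install_all_nodes.yml   # hypothetical file containing step 1
- import_playbook: init_master.yml         # hypothetical file containing step 2
```

It would then be started with a single command such as `ansible-playbook -i <inventory> site.yml`. If step 1 fails on any host, that host is dropped from the remainder of the run.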
- (1) https://blog.csdn.net/workwithwebis3w/article/details/94617764
- (2) http://www.ansible.com.cn/index.html
- (3) https://docs.ansible.com/