Griffin 0.4.0 Installation
Installation Steps
Apache Griffin is an open-source data quality solution for big data that supports both batch and streaming modes. It provides a unified process for measuring data quality from different perspectives, helping you build trusted data assets and increasing confidence in your business.
Installation Dependencies
Dependency preparation (see the official quickstart guide: http://griffin.apache.org/docs/quickstart-cn.html)
JDK (1.8 or later versions)
MySQL (version 5.6 or later)
Hadoop (2.6.0 or later)
Hive (version 2.x)
Spark (version 2.2.1)
Livy(livy-0.5.0-incubating)
ElasticSearch (5.0 or later versions)
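Before installing Griffin itself, it is worth confirming that each dependency is running and on the expected version. A minimal sketch of such checks (the hostname master and the ports 9200/8998 are assumptions; adjust to your environment):
[hadoop@master ~]$ java -version            # expect 1.8.x
[hadoop@master ~]$ hadoop version           # expect 2.6.0 or later
[hadoop@master ~]$ hive --version           # expect 2.x
[hadoop@master ~]$ spark-submit --version   # expect 2.2.1
[hadoop@master ~]$ mysql --version          # expect 5.6 or later
[hadoop@master ~]$ curl http://master:9200          # ElasticSearch cluster info, expect 5.0 or later
[hadoop@master ~]$ curl http://master:8998/sessions # Livy REST API should answer with a session list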
Component Overview
Apache Hadoop: batch data source; stores metric data
Apache Hive: Hive Metastore
Apache Spark: computes batch and streaming metrics
Apache Livy: provides a RESTful API through which the service calls Apache Spark
MySQL: service metadata
ElasticSearch: stores metric data
Maven: build/project-management tool used to package the Griffin project into jars, which is how Griffin is run. For a production environment, build the jars with a locally installed Maven and then copy them to the platform, because Maven downloads many components during the build and therefore needs Internet access (a build sketch is shown after the unpack step below).
Related links:
Hadoop installation: https://blog.youkuaiyun.com/genus_yang/article/details/87917853
Hive installation: https://blog.youkuaiyun.com/genus_yang/article/details/87938796
Spark installation: https://blog.youkuaiyun.com/genus_yang/article/details/88018392
Livy installation: https://blog.youkuaiyun.com/genus_yang/article/details/88027799
MySQL installation: https://blog.youkuaiyun.com/genus_yang/article/details/87939556
ElasticSearch installation: https://blog.youkuaiyun.com/genus_yang/article/details/88051980
Maven download: http://maven.apache.org/download.cgi
Connecting a VM to the Internet via NAT: https://blog.youkuaiyun.com/qq_40612124/article/details/79084276
Unpack the Griffin source archive
[hadoop@master ~]$ unzip griffin-0.4.0-source-release.zip
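After unpacking, the project can be packaged with Maven as described in the component overview above. A minimal sketch, assuming Maven is installed locally and the machine has Internet access (the first build downloads many dependencies, so it can take a while):
[hadoop@master ~]$ cd griffin-0.4.0
[hadoop@master griffin-0.4.0]$ mvn clean install -DskipTests
# The service and measure jars are typically produced under service/target/ and measure/target/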
Create the griffin user in MySQL
[root@master ~]# mysql -u root -p123
mysql> create user 'griffin' identified by '123';
mysql> grant all privileges on *.* to 'griffin'@'%' with grant option;
mysql> grant all privileges on *.* to 'griffin'@'master' identified by '123';
mysql> flush privileges;
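To confirm that the new account and grants work, a quick login test can be run (a sketch, assuming the statements above succeeded):
[root@master ~]# mysql -h master -u griffin -p123 -e "select current_user();"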
Create Griffin's dependency tables
Griffin uses the Quartz scheduler to schedule jobs, so the tables Quartz depends on must be created in MySQL.
[root@master ~]# mysql -h master -u griffin -p123 -e "create database quartz "
[root@master ~]# mysql -h master -u griffin -p123 quartz < /home/hadoop/griffin-0.4.0/service/src/main/resources/Init_quartz_mysql_innodb.sql
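If the script ran cleanly, the quartz database should now contain the Quartz scheduler tables; a quick check (a sketch, table names may vary slightly by Quartz version):
[root@master ~]# mysql -h master -u griffin -p123 quartz -e "show tables;"
# Expect tables such as QRTZ_JOB_DETAILS, QRTZ_TRIGGERS and QRTZ_LOCKS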
Hadoop and Hive
# Create the /home/spark_conf directory on HDFS
[hadoop@master ~]$ hadoop fs -mkdir -p /home/spark_conf
# Upload Hive's configuration file hive-site.xml
[hadoop@master ~]$ hadoop fs -put /home/hadoop/hive-3.1.1/conf/hive-site.xml /home/spark_conf/
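A quick listing can confirm the file landed on HDFS (a sketch, assuming the commands above succeeded):
[hadoop@master ~]$ hadoop fs -ls /home/spark_conf/
# Should show hive-site.xml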
Livy Configuration
Update the livy.conf configuration file under livy/conf
[hadoop@master ~]$ cd livy-0.5.0/conf/
[hadoop@master conf]$ vi livy.conf
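The exact values depend on the cluster. Below is a minimal sketch of the settings commonly adjusted for Griffin, assuming Livy runs on host master with its default port 8998 and Spark runs on YARN; verify the key names against the livy.conf template shipped with your Livy version:
# livy.conf (excerpt)
livy.server.host = master
livy.server.port = 8998
livy.spark.master = yarn
livy.spark.deploy-mode = cluster
livy.repl.enable-hive-context = true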