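1. Installing Kylin
1.1. Installation guide
For installation, see the official installation guide: http://kylin.apache.org/cn/docs/install/index.html
Kylin depends on Hadoop, Hive, Spark, HBase and other components; see the official site for details.
Once Kylin is running, the web UI is available at http://hadoop2:7070/kylin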
The initial username is ADMIN and the password is KYLIN.
Kylin is started with kylin.sh:
[root@hadoop2 bin]# ./kylin.sh start
Retrieving hadoop conf dir...
KYLIN_HOME is set to /usr/local/apache-kylin-2.6.6-bin-hbase1x
Retrieving hive dependency...
Retrieving hbase dependency...
Retrieving hadoop conf dir...
Retrieving kafka dependency...
Retrieving Spark dependency...
Start to check whether we need to migrate acl tables
Retrieving hive dependency...
Retrieving hbase dependency...
Retrieving hadoop conf dir...
Retrieving kafka dependency...
Retrieving Spark dependency...
SLF4J: Class path contains multiple SLF4J bindings.
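Once the server is up, the login can also be checked from the command line rather than the browser. The following is a minimal sketch, assuming the default ADMIN/KYLIN account and the hadoop2:7070 address used in this post; the function names kylin_basic_auth and kylin_login are made up for illustration. It builds an HTTP Basic credential and posts it to Kylin's /kylin/api/user/authentication REST endpoint.

```shell
#!/usr/bin/env bash

# Build the HTTP Basic credential: base64 of "user:password".
kylin_basic_auth() {
  printf '%s:%s' "$1" "$2" | base64
}

# Call the authentication endpoint with that credential.
# Host and port are the ones from this post; adjust them for your cluster.
kylin_login() {
  curl -s -X POST \
    -H "Authorization: Basic $(kylin_basic_auth "$1" "$2")" \
    -H 'Content-Type: application/json' \
    "http://hadoop2:7070/kylin/api/user/authentication"
}
```

Running kylin_login ADMIN KYLIN against a live instance should return a JSON description of the user if the credentials are accepted.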
2. Quick start
2.1. Preparing the data
dept.txt:
10 ACCOUNTING 1700
20 RESEARCH 1800
30 SALES 1900
40 OPERATIONS 1700
The corresponding dept table:
create table if not exists kylin_exercise01.dept(
deptno int,
dname string,
loc int
)
row format delimited fields terminated by '\t';
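emp.txt (fields are tab-separated; the comm field is empty for some rows), followed by the corresponding emp table: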
7369 SMITH CLERK 7902 1980-12-17 800.00 20
7499 ALLEN SALESMAN 7698 1981-2-20 1600.00 300.00 30
7521 WARD SALESMAN 7698 1981-2-22 1250.00 500.00 30
7566 JONES MANAGER 7839 1981-4-2 2975.00 20
7654 MARTIN SALESMAN 7698 1981-9-28 1250.00 1400.00 30
7698 BLAKE MANAGER 7839 1981-5-1 2850.00 30
7782 CLARK MANAGER 7839 1981-6-9 2450.00 10
7788 SCOTT ANALYST 7566 1987-4-19 3000.00 20
7839 KING PRESIDENT 7566 1981-11-17 5000.00 10
7844 TURNER SALESMAN 7698 1981-9-8 1500.00 0.00 30
7876 ADAMS CLERK 7788 1987-5-23 1100.00 20
7900 JAMES CLERK 7698 1981-12-3 950.00 30
7902 FORD ANALYST 7566 1981-12-3 3000.00 20
7934 MILLER CLERK 7782 1982-1-23 1300.00 10
create external table if not exists kylin_exercise01.emp(
empno int,
ename string,
job string,
mgr int,
hiredate string,
sal double,
comm double,
deptno int
)
row format delimited fields terminated by '\t';
Loading the data:
load data local inpath "/usr/local/dept.txt" into table kylin_exercise01.dept;
load data local inpath "/usr/local/emp.txt" into table kylin_exercise01.emp;
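Both tables are declared with fields terminated by '\t', so a line that uses spaces instead of tabs, or that drops an empty column, still loads but shifts values or yields NULLs. Before running the load statements it can help to check the field count of each file. A minimal sketch follows; check_fields is a hypothetical helper, not part of Hive or Kylin.

```shell
#!/usr/bin/env bash

# Print every line whose tab-separated field count differs from the expected
# one, and return a non-zero status if any such line exists.
# Usage: check_fields FILE EXPECTED_COUNT
check_fields() {
  awk -F'\t' -v n="$2" '
    NF != n { printf "line %d: %d fields\n", NR, NF; bad = 1 }
    END     { exit bad + 0 }
  ' "$1"
}
```

With the files from this post, that would be check_fields /usr/local/dept.txt 3 and check_fields /usr/local/emp.txt 8; an empty comm value still counts as a field as long as both surrounding tabs are present.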
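2.2. Creating the project
The detailed steps are omitted here. In outline:
Create a new Project --> Load Table (pick the source data from Hive; separate multiple tables with commas) --> New Model --> New Table. The end result is shown in the figure below.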

Choose Build to start building right away; the job's progress can be followed on the Monitor page.

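3. Integrating Kylin with other BI tools
Many visualization tools can be used with Kylin, for example:
- ODBC: integrates with Tableau, Excel, Power BI and similar tools
- JDBC: integrates with Saiku, BIRT and other Java tools
- REST API: integrates with JavaScript and web pages
The Kylin development team also contributed a Zeppelin plugin, so Zeppelin can be used to access Kylin as well. See the official site for more.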
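As an illustration of the REST route, a SQL query can be sent to Kylin's /kylin/api/query endpoint as a JSON body with sql and project fields. The sketch below assumes the default ADMIN/KYLIN account and the hadoop2:7070 address from this post; the helper names and the project name my_project are placeholders, and the body builder does no JSON escaping, so it only suits queries without double quotes or backslashes.

```shell
#!/usr/bin/env bash

# Build the JSON body for POST /kylin/api/query.
# No escaping is done: the SQL and project name must not contain " or \.
kylin_query_body() {
  printf '{"sql":"%s","project":"%s"}' "$1" "$2"
}

# Send the query with HTTP Basic authentication.
kylin_query() {
  curl -s -X POST \
    -u 'ADMIN:KYLIN' \
    -H 'Content-Type: application/json' \
    -d "$(kylin_query_body "$1" "$2")" \
    "http://hadoop2:7070/kylin/api/query"
}
```

For example, kylin_query 'select count(*) from emp' my_project would run the query against a cube built in that (hypothetical) project.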
Troubleshooting
1. Exception: java.net.UnknownHostException: hadoop1:2181: Name or service not known
2021-06-19 09:36:45,186 INFO [main] zookeeper.ZooKeeper:100 : Client environment:java.library.path=/usr/local/hadoop/lib/native
2021-06-19 09:36:45,187 INFO [main] zookeeper.ZooKeeper:100 : Client environment:java.io.tmpdir=/tmp
2021-06-19 09:36:45,189 INFO [main] zookeeper.ZooKeeper:100 : Client environment:java.compiler=<NA>
2021-06-19 09:36:45,190 INFO [main] zookeeper.ZooKeeper:100 : Client environment:os.name=Linux
2021-06-19 09:36:45,190 INFO [main] zookeeper.ZooKeeper:100 : Client environment:os.arch=amd64
2021-06-19 09:36:45,190 INFO [main] zookeeper.ZooKeeper:100 : Client environment:os.version=3.10.0-1127.el7.x86_64
2021-06-19 09:36:45,190 INFO [main] zookeeper.ZooKeeper:100 : Client environment:user.name=root
2021-06-19 09:36:45,190 INFO [main] zookeeper.ZooKeeper:100 : Client environment:user.home=/root
2021-06-19 09:36:45,191 INFO [main] zookeeper.ZooKeeper:100 : Client environment:user.dir=/usr/local/apache-kylin-2.6.6-bin-hbase1x/bin
2021-06-19 09:36:45,192 INFO [main] zookeeper.ZooKeeper:438 : Initiating client connection, connectString=hadoop1:2181,hadoop2:2181,hadoop3:2181 sessionTimeout=90000 watcher=hconnection-0x4b3a45f10x0, quorum=hadoop1:2181,hadoop2:2181,hadoop3:2181, baseZNode=/hbase
2021-06-19 09:36:45,278 INFO [main-SendThread(hadoop1:2181)] zookeeper.ClientCnxn:975 : Opening socket connection to server hadoop1/192.168.80.201:2181. Will not attempt to authenticate using SASL (unknown error)
2021-06-19 09:36:45,291 INFO [main-SendThread(hadoop1:2181)] zookeeper.ClientCnxn:852 : Socket connection established to hadoop1/192.168.80.201:2181, initiating session
2021-06-19 09:36:45,343 INFO [main-SendThread(hadoop1:2181)] zookeeper.ClientCnxn:1235 : Session establishment complete on server hadoop1/192.168.80.201:2181, sessionid = 0x17a228970a60006, negotiated timeout = 40000
2021-06-19 09:36:46,431 DEBUG [main] hbase.HBaseConnection:180 : Using the working dir FS for HBase: hdfs://hadoop1:9000
2021-06-19 09:36:46,641 INFO [main] imps.CuratorFrameworkImpl:224 : Starting
2021-06-19 09:36:46,650 INFO [main] zookeeper.ZooKeeper:438 : Initiating client connection, connectString=hadoop1:2181:2181,hadoop2:2181:2181,hadoop3:2181:2181 sessionTimeout=120000 watcher=org.apache.curator.ConnectionState@46c670a6
2021-06-19 09:36:46,657 ERROR [main] imps.CuratorFrameworkImpl:546 : Background exception was not retry-able or retry gave up
java.net.UnknownHostException: hadoop1:2181: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
at java.net.InetAddress.getAllByName0(InetAddress.java:1276)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:61)
at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
at org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:29)
at org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:146)
at org.apache.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:94)
at org.apache.curator.HandleHolder.getZooKeeper(HandleHolder.java:55)
at org.apache.curator.ConnectionState.reset(ConnectionState.java:218)
at org.apache.curator.ConnectionState.start(ConnectionState.java:102)
at org.apache.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:189)
at org.apache.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:248)
at org.apache.kylin.storage.hbase.util.ZookeeperDistributedLock$Factory.getZKClient(ZookeeperDistributedLock.java:85)
at org.apache.kylin.storage.hbase.util.ZookeeperDistributedLock$Factory.<init>(ZookeeperDistributedLock.java:109)
at org.apache.kylin.storage.hbase.util.ZookeeperDistributedLock$Factory.<init>(ZookeeperDistributedLock.java:105)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.kylin.common.util.ClassUtil.newInstance(ClassUtil.java:88)
at org.apache.kylin.common.KylinConfigBase.getDistributedLockFactory(KylinConfigBase.java:458)
at org.apache.kylin.storage.hbase.HBaseConnection.createHTableIfNeeded(HBaseConnection.java:336)
at org.apache.kylin.storage.hbase.HBaseResourceStore.createHTableIfNeeded(HBaseResourceStore.java:114)
at org.apache.kylin.storage.hbase.HBaseResourceStore.<init>(HBaseResourceStore.java:88)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.kylin.common.persistence.ResourceStore.createResourceStore(ResourceStore.java:92)
at org.apache.kylin.common.persistence.ResourceStore.getStore(ResourceStore.java:111)
at org.apache.kylin.rest.service.AclTableMigrationTool.checkIfNeedMigrate(AclTableMigrationTool.java:99)
at org.apache.kylin.tool.AclTableMigrationCLI.main(AclTableMigrationCLI.java:43)
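The host name should be hadoop1, but it was taken to be hadoop1:2181: the second connectString in the log above (hadoop1:2181:2181,hadoop2:2181:2181,hadoop3:2181:2181) shows the port being appended twice. There are two ways to fix this:
- Add the following to kylin.properties (replace master with your own ZooKeeper host):
kylin.env.zookeeper-connect-string=master:2181
- Edit hbase-site.xml so that hbase.zookeeper.quorum lists host names only, without ports.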
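The double port can be reproduced outside Kylin. The sketch below is only an illustration of what the log suggests is happening, not Kylin's actual code: append_port mimics a client that adds the client port to every quorum entry, and strip_ports produces the port-free value that the second fix puts into hbase.zookeeper.quorum. Both function names are hypothetical.

```shell
#!/usr/bin/env bash

# Mimic a client that appends ":PORT" to every entry of a comma-separated quorum.
append_port() {
  local quorum=$1 port=$2 out= host
  local IFS=,
  for host in $quorum; do
    out="${out:+$out,}$host:$port"
  done
  printf '%s\n' "$out"
}

# Drop everything from the first ":" onwards in each quorum entry.
strip_ports() {
  local quorum=$1 out= host
  local IFS=,
  for host in $quorum; do
    out="${out:+$out,}${host%%:*}"
  done
  printf '%s\n' "$out"
}
```

Feeding the quorum from the log, hadoop1:2181,hadoop2:2181,hadoop3:2181, through append_port with port 2181 yields the broken hadoop1:2181:2181,... string, while stripping the ports first gives a well-formed connect string.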
2. Exception: connection refused on port 10020 during a Kylin build
The error is as follows (拒绝连接 is the localized form of "Connection refused"):
org.apache.kylin.engine.mr.exception.MapReduceException: Exception: java.net.ConnectException: Call From hadoop2/192.168.80.202 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
java.net.ConnectException: Call From hadoop2/192.168.80.202 to 0.0.0.0:10020 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.kylin.engine.mr.common.MapReduceExecutable.doWork(MapReduceExecutable.java:173)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:164)
at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:70)
at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:164)
at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

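Kylin needs the MapReduce JobHistory Server to be running; 10020 is the port MapReduce clients use to connect to it.
Start it with: $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
Run jps to check that a JobHistoryServer process is present.
Then choose Resume under Actions so that the interrupted job runs again.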
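The check-then-start steps can be wrapped in a small script. This is a sketch under the assumption that jps and $HADOOP_HOME are available on the node; has_history_server and ensure_history_server are hypothetical helper names. Note that 0.0.0.0:10020 in the error is the default value of mapreduce.jobhistory.address, so if the history server runs on a different node than Kylin, that property probably needs to point at the right host as well.

```shell
#!/usr/bin/env bash

# Succeed if the given jps output lists a JobHistoryServer process.
has_history_server() {
  printf '%s\n' "$1" | grep -qw 'JobHistoryServer'
}

# Start the history server only if it is not already running.
ensure_history_server() {
  if has_history_server "$(jps)"; then
    echo "JobHistoryServer is already running"
  else
    "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
  fi
}
```

After ensure_history_server reports a running server, the failed job can be resumed from the Monitor page.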