Installation
1. Download the Livy distribution, e.g. livy-0.5.0-incubating-bin.zip
2. Upload it to /opt/livy
3. Unpack it: unzip livy-0.5.0-incubating-bin.zip
4. Set the Spark-related environment variables Livy needs:
export SPARK_HOME=/usr/lib/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
5. Start Livy:
cd /opt/livy/livy-0.5.0-incubating-bin/bin
./livy-server start
Livy listens on port 8998 by default; the web UI is available at http://<IP>:8998/ui
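If port 8998 is already taken, the listen address and port can be changed before starting the server. A minimal sketch, assuming the stock configuration layout: copy conf/livy.conf.template to conf/livy.conf and set the keys below (names follow the shipped template):

```
# conf/livy.conf
livy.server.host = 0.0.0.0
livy.server.port = 8998
```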
Testing (driving an interactive Spark shell through Livy)
1. Create a PySpark session
[root@quickstart ~]# curl -X POST --data '{"kind": "pyspark"}' -H "Content-Type:application/json" localhost:8998/sessions
{"id":1,"appId":null,"owner":null,"proxyUser":null,"state":"starting","kind":"pyspark","appInfo":{"driverLogUrl":null,"sparkUiUrl":null},"log":["stdout: ","\nstderr: "]}
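The same call can be made from Python with just the standard library. A sketch, assuming Livy is reachable at localhost:8998 (LIVY_URL, session_payload, and create_session are illustrative names, not part of any Livy client library):

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # default Livy port; adjust for your host

def session_payload(kind="pyspark"):
    """Build the JSON body for POST /sessions; "kind" selects the interpreter."""
    return json.dumps({"kind": kind}).encode("utf-8")

def create_session(kind="pyspark", url=LIVY_URL):
    """Create a new interactive session and return the parsed response."""
    req = urllib.request.Request(
        f"{url}/sessions",
        data=session_payload(kind),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The returned dict mirrors the curl response above: the session begins in state "starting" and carries the id used by all later calls.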
2. Check the session state
[root@quickstart ~]# curl localhost:8998/sessions/1 | python -m json.tool
{
"appId": null,
"appInfo": {
"driverLogUrl": null,
"sparkUiUrl": null
},
"id": 1,
"kind": "pyspark",
"log": [
"18/11/25 22:54:38 INFO spark.SparkContext: Added JAR file:/opt/livy/livy-0.5.0-incubating-bin/repl_2.10-jars/livy-repl_2.10-0.5.0-incubating.jar at spark://192.168.64.154:55570/jars/livy-repl_2.10-0.5.0-incubating.jar with timestamp 1543215278500",
"18/11/25 22:54:38 INFO executor.Executor: Starting executor ID driver on host localhost",
"18/11/25 22:54:38 INFO executor.Executor: Using REPL class URI: spark://192.168.64.154:55570/classes",
"18/11/25 22:54:38 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 51955.",
"18/11/25 22:54:38 INFO netty.NettyBlockTransferService: Server created on 51955",
"18/11/25 22:54:38 INFO storage.BlockManagerMaster: Trying to register BlockManager",
"18/11/25 22:54:38 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:51955 with 534.5 MB RAM, BlockManagerId(driver, localhost, 51955)",
"18/11/25 22:54:38 INFO storage.BlockManagerMaster: Registered BlockManager",
"18/11/25 22:54:38 INFO driver.SparkEntries: Spark context finished initialization in 6592ms",
"18/11/25 22:54:41 INFO driver.SparkEntries: Created SQLContext."
],
"owner": null,
"proxyUser": null,
"state": "idle"
}
A state of idle means the session is alive and ready; statements can now be submitted to it.
3. Submit a statement
[root@quickstart ~]# curl localhost:8998/sessions/1/statements -X POST -H 'Content-Type: application/json' -d '{"code":"1 + 1"}'
{"id":0,"code":"1 + 1","state":"waiting","output":null,"progress":0.0}
The response shows that a statement with id 0 was created; we can now query its state and result.
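Submitting a statement from Python follows the same pattern as creating the session. A sketch under the same assumptions (statement_payload and submit_code are illustrative names):

```python
import json
import urllib.request

LIVY_URL = "http://localhost:8998"  # adjust for your host

def statement_payload(code):
    """Build the JSON body for POST /sessions/{id}/statements."""
    return json.dumps({"code": code}).encode("utf-8")

def submit_code(session_id, code, url=LIVY_URL):
    """Run a code snippet in the session; returns the new statement dict."""
    req = urllib.request.Request(
        f"{url}/sessions/{session_id}/statements",
        data=statement_payload(code),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

As the curl output shows, the statement may still be in state "waiting" when the call returns, so the result has to be fetched separately.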
4. Check the result
[root@quickstart ~]# curl localhost:8998/sessions/1/statements/0
{"id":0,"code":"1 + 1","state":"available","output":{"status":"ok","execution_count":0,"data":{"text/plain":"2"}},"progress":1.0}
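Once a statement reaches "available", the computed value sits under output["data"]["text/plain"]. A small sketch that parses the response captured above (extract_result is an illustrative helper):

```python
import json

def extract_result(statement):
    """Return the plain-text result of a finished statement, or raise."""
    if statement["state"] != "available":
        raise ValueError("statement has not finished yet")
    output = statement["output"]
    if output["status"] != "ok":
        raise RuntimeError(f"statement failed: {output}")
    return output["data"]["text/plain"]

# The response captured above for statement 0:
raw = ('{"id":0,"code":"1 + 1","state":"available",'
       '"output":{"status":"ok","execution_count":0,'
       '"data":{"text/plain":"2"}},"progress":1.0}')
result = extract_result(json.loads(raw))  # "2"
```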
5. Submit another statement
[root@quickstart ~]# curl localhost:8998/sessions/1/statements -X POST -H 'Content-Type: application/json' -d '{"code":"a = 10"}'
{"id":1,"code":"a = 10","state":"available","output":{"status":"ok","execution_count":1,"data":{"text/plain":""}},"progress":1.0}
The response shows that a statement with id 1 was created.
6. Evaluate a + 1
[root@quickstart ~]# curl localhost:8998/sessions/1/statements -X POST -H 'Content-Type: application/json' -d '{"code":"a + 1"}'
{"id":2,"code":"a + 1","state":"available","output":{"status":"ok","execution_count":2,"data":{"text/plain":"11"}},"progress":1.0}
The response shows a statement with id 2 that ran the code a + 1 and returned 11; variables defined in earlier statements persist for the lifetime of the session.
The result can also be fetched afterwards through the statement endpoint:
[root@quickstart bin]# curl localhost:8998/sessions/1/statements/2
{"id":2,"code":"a + 1","state":"available","output":{"status":"ok","execution_count":2,"data":{"text/plain":"11"}},"progress":1.0}
Other
1. Livy can also submit compiled Spark code, run Python files, and more; see https://blog.youkuaiyun.com/dockj/article/details/53328800 for details.
2. View the session list in the Livy web UI
3. View the details of session 1
4. Query the REST API from Postman