1. Prerequisites: install the JDK, Maven, and Python locally first.
(Note: Python 2.x is recommended here, because running DataX with Python 3.x causes a problem that is not covered in this post.)
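Before continuing, it can help to confirm that the three tools are actually reachable from the command line. The following is a minimal sketch, not part of DataX: the helper names find_on_path and check_prerequisites are made up for illustration, and the executable names java, mvn, and python are assumed to be how the JDK, Maven, and Python appear on your PATH. It only checks that a matching file exists and does not verify versions.

```python
import os


def find_on_path(name, path=None, pathext=None):
    """Return the full path of a file called `name` on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    if pathext is None:
        # On Windows, PATHEXT lists executable extensions such as .EXE or .CMD.
        # On other systems it is usually unset, so only the bare name is tried.
        pathext = os.environ.get("PATHEXT", "")
    extensions = [""] + [ext for ext in pathext.split(os.pathsep) if ext]
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            if os.path.isfile(candidate):
                return candidate
    return None


def check_prerequisites(tools=("java", "mvn", "python")):
    """Map each tool name to its resolved path (None when it is missing)."""
    return {tool: find_on_path(tool) for tool in tools}


if __name__ == "__main__":
    for tool, location in sorted(check_prerequisites().items()):
        print("%-6s %s" % (tool, location or "NOT FOUND"))
```

Any tool reported as NOT FOUND needs to be installed or added to PATH before moving on to the next step.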
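2. Download the DataX archive from the official DataX site and extract it to a folder of your choice.
3. Press Win + R and enter cmd to open a command prompt.
Then change to the bin directory of the DataX installation and run python datax.py …/job/json/job.json. (Note: I added a json folder under the job folder; it does not exist by default. Use whatever path matches your own file layout.)
4. If the output is garbled, enter CHCP 65001 in the command prompt to switch the console code page to UTF-8, then run python datax.py …/job/json/job.json again.
The job runs successfully!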
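Steps 3 and 4 can also be wrapped in a small script so the launcher path and the job file do not have to be retyped. This is only a sketch under stated assumptions: build_datax_command and run_job are hypothetical helper names, and the install location E:\DataX\datax with a job/json/job.json file is an example layout that you should replace with your own. By default the helper uses the interpreter that runs the script, so pass python_exe explicitly if DataX needs a different Python.

```python
import os
import subprocess
import sys


def build_datax_command(datax_home, job_file, python_exe=None):
    """Return the argument list that runs bin/datax.py against job_file."""
    launcher = os.path.join(datax_home, "bin", "datax.py")
    return [python_exe or sys.executable, launcher, job_file]


def run_job(datax_home, job_file, python_exe=None):
    """Run the job and return the exit code of the launched process."""
    command = build_datax_command(datax_home, job_file, python_exe)
    return subprocess.call(command)


if __name__ == "__main__":
    # Example locations only (assumed layout); adjust to your installation.
    home = r"E:\DataX\datax"
    job = os.path.join(home, "job", "json", "job.json")
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_datax_command(home, job)))
```

Calling run_job(home, job) instead of printing would start the job; a nonzero return value indicates that the launched process reported a failure.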
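5. Basic usage
5.1 Read data from a stream and print it to the console
First, look at the official JSON configuration template.
// View the streamreader --> streamwriter template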
E:\DataX\datax\bin>datax.py -r streamreader -w streamwriter
// The template is as follows:
DataX (DATAX-OPENSOURCE-3.0), From Alibaba !
Copyright (C) 2010-2017, Alibaba Group. All Rights Reserved.
Please refer to the streamreader document:
https://github.com/alibaba/DataX/blob/master/streamreader/doc/streamreader.md
Please refer to the streamwriter document:
https://github.com/alibaba/DataX/blob/master/streamwriter/doc/streamwriter.md
Please save the following configuration as a json file and use
python {DATAX_HOME}/bin/datax.py {JSON_FILE_NAME}.json
to run the job.
{
    "job": {
        "content": [
            {
                "reader": {
                    "name": "streamreader",
                    "parameter": {
                        "column": [],
                        "sliceRecordCount": ""
                    }
                },
                "writer": {
                    "name": "streamwriter",
                    "parameter": {
                        "encoding": "",
                        "print": true
                    }
                }
            }
        ],
        "setting": {
            "speed": {
                "channel"