DataHub Installation and Startup:

I. Install via Docker images (independent of the local project)

The DataHub environment is complex, so you do not need to install every component locally.

* Note: pin the Python version explicitly; the Docker-based setup may not work with Python 3.10.

/opt/homebrew/anaconda3/bin/python3.12 -m pip install --upgrade pip wheel setuptools
/opt/homebrew/anaconda3/bin/python3.12 -m pip install --upgrade acryl-datahub
/opt/homebrew/anaconda3/bin/python3.12 -m datahub version
/opt/homebrew/anaconda3/bin/python3.12 -m datahub docker quickstart
/opt/homebrew/anaconda3/bin/python3.12 -m datahub docker ingest-sample-data
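The commands above repeat the full interpreter path on every line. A minimal sketch of one way to pin it once, assuming a shell variable named DATAHUB_PYTHON (a hypothetical name chosen for this example, not a DataHub setting):

```shell
# Assumption: DATAHUB_PYTHON is a made-up variable name for this sketch.
# It defaults to the interpreter path used in the commands above.
DATAHUB_PYTHON="${DATAHUB_PYTHON:-/opt/homebrew/anaconda3/bin/python3.12}"

# Print the command line that would be run, which is handy for a dry run.
datahub_cmd() {
    printf '%s -m datahub %s\n' "$DATAHUB_PYTHON" "$*"
}

# Run a datahub subcommand with the pinned interpreter.
dh() {
    "$DATAHUB_PYTHON" -m datahub "$@"
}

# Usage:
#   dh version
#   dh docker quickstart
#   dh docker ingest-sample-data
```

Switching to a different interpreter then means changing one variable instead of five commands.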

II. Local installation:

Main reference:

Local Development | DataHub

1.1 Prerequisites

# Install Java
brew install openjdk@17

# Install Python (critical: with the wrong version, the later build and compile steps hit many problems and fail; the Python entry in PATH must be set up correctly)
brew install python@3.10  # you may need to add this to your PATH
# alternatively, you can use pyenv to manage your python versions

# Install docker and docker compose
brew install --cask docker
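Since the Python version matters so much for the build, it can help to check it before starting. A minimal sketch, assuming a helper named python_version_matches (hypothetical, written for this example) that compares the output of `python3 --version` against the wanted major.minor version:

```shell
# Succeed when a version string such as "Python 3.10.16" has the wanted
# major.minor version (for example 3.10); fail otherwise.
python_version_matches() {
    wanted="$1"      # e.g. 3.10
    reported="$2"    # e.g. the output of: python3 --version
    case "$reported" in
        "Python $wanted"|"Python $wanted."*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   python_version_matches 3.10 "$(python3 --version)" \
#       || echo "python3 on PATH is not 3.10; fix PATH before building" >&2
```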

1.2 Clone the code from Git

Main branch:

git clone https://github.com/datahub-project/datahub.git;

Fork with internationalization (i18n) support:

git clone https://github.com/luizhsalazar/datahub.git

Output:

(base) phoenix@phoenixdeMacBook-Pro ~ % git clone https://github.com/datahub-project/datahub.git;
Cloning into 'datahub'...
remote: Enumerating objects: 4477163, done.
remote: Counting objects: 100% (15782/15782), done.
remote: Compressing objects: 100% (1209/1209), done.
remote: Total 4477163 (delta 14145), reused 14814 (delta 13277), pack-reused 4461381 (from 2)
Receiving objects: 100% (4477163/4477163), 6.39 GiB | 17.23 MiB/s, done.
Resolving deltas: 100% (2102360/2102360), done.
Updating files: 100% (8581/8581), done.

1.3 Build the project:

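From the root directory of the repository, build the whole project with the Gradle wrapper:

cd datahub
./gradlew build

Note that this also runs the tests and a number of validations, which makes the process considerably slower.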

We suggest partially compiling DataHub according to your needs:

  • Build DataHub's backend GMS (Generalized Metadata Service):

    ./gradlew :metadata-service:war:build

  • Build DataHub's frontend:

    ./gradlew :datahub-frontend:dist -x yarnTest -x yarnLint

  • Build DataHub's command-line tool:

    ./gradlew :metadata-ingestion:installDev

  • Build DataHub's documentation:

    ./gradlew :docs-website:yarnLintFix :docs-website:build -x :metadata-ingestion:runPreFlightScript
    # To preview the documentation
    ./gradlew :docs-website:serve
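The partial build targets listed above are long to type. A small sketch of a wrapper that maps a short component name to its Gradle arguments; the short names (gms, frontend, cli, docs) and both function names are hypothetical labels invented for this example, while the Gradle tasks are the ones listed above:

```shell
# Map a short component name (hypothetical labels) to the Gradle arguments above.
gradle_args_for() {
    case "$1" in
        gms)      echo ":metadata-service:war:build" ;;
        frontend) echo ":datahub-frontend:dist -x yarnTest -x yarnLint" ;;
        cli)      echo ":metadata-ingestion:installDev" ;;
        docs)     echo ":docs-website:yarnLintFix :docs-website:build -x :metadata-ingestion:runPreFlightScript" ;;
        *)        echo "unknown component: $1" >&2; return 1 ;;
    esac
}

# Build one component; run this from the repository root.
build_component() {
    args=$(gradle_args_for "$1") || return 1
    # xargs splits the argument string on whitespace, which works the same
    # way in bash and zsh (zsh does not word-split unquoted variables).
    printf '%s\n' "$args" | xargs ./gradlew
}

# Usage:
#   cd datahub && build_component frontend
```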

This post explains how to install plugins:

DataHub安装配置详细过程_datahub部署-CSDN博客 (a detailed DataHub installation and configuration walkthrough, CSDN blog)

2. Install Python 3.10 as DataHub's Python environment:

2.1 Install:

brew install python@3.10   # Important: when a newer Python is also installed, ./gradlew build picks up Python 3.12/3.13 by itself; see the $PATH setup in 2.2.

2.2 Set the default Python environment on macOS

open -e ~/.bash_profile   # open the startup file

open -e ~/.zshrc   # open the startup file

brew list python@3.10   # show where Python 3.10 is installed

Make both files above contain the same content:


# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/opt/homebrew/anaconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/opt/homebrew/anaconda3/etc/profile.d/conda.sh" ]; then
        . "/opt/homebrew/anaconda3/etc/profile.d/conda.sh"
    else
        # export PATH="/opt/homebrew/anaconda3/bin:$PATH"
        export PATH="/opt/homebrew/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<


# python3.10
alias python3='/opt/homebrew/Cellar/python@3.10/3.10.16/bin/python3.10'
alias python=python3
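An alias only applies to the interactive shell where it is defined; child processes such as a Gradle build do not see it and resolve python3 through PATH instead. A minimal sketch of a PATH-based alternative follows. The helper name path_prepend is hypothetical, and the directory in the usage comment assumes the usual Homebrew location of the unversioned python symlinks, so check it against the `brew list python@3.10` output first:

```shell
# Print a PATH value with directory $1 placed first.
# If $1 is already the first entry, the value is printed unchanged.
path_prepend() {
    dir="$1"
    current="$2"
    case "$current" in
        "$dir"|"$dir:"*) printf '%s\n' "$current" ;;        # already first
        "")              printf '%s\n' "$dir" ;;            # empty PATH
        *)               printf '%s\n' "$dir:$current" ;;
    esac
}

# Usage in ~/.zshrc or ~/.bash_profile (assumed Homebrew path):
#   export PATH="$(path_prepend /opt/homebrew/opt/python@3.10/libexec/bin "$PATH")"
```

Checking for an existing leading entry keeps PATH from growing each time the startup file is sourced again.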

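1.x Clean the build output. The Gradle task is named clean (./gradlew clean); running ./gradlew clear fails, as the output below shows: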

Starting a Gradle Daemon, 1 busy Daemon could not be reused, use --status for details
Configuration on demand is an incubating feature.
<-------------> 1% CONFIGURING [1m 43s]
<-------------> 1% CONFIGURING [2m 42s]
<-------------> 1% CONFIGURING [2m 49s]f classpath > gradle-node-plugin-7.0.2.pom
<-------------> 1% CONFIGURING [2m 50s]f classpath > gradle-nexus-staging-plugin-0.30.0.pom
> root project > Resolve dependencies of classpath > gradle-nexus-staging-plugin-0.30.0.pom


<-------------> 1% CONFIGURING [2m 51s]


> Configure project :datahub-frontend
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :datahub-upgrade
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :docker
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :smoke-test
Root directory:  /Users/phoenix/datahub

> Configure project :docker:datahub-ingestion
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :docker:datahub-ingestion-base
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :docker:elasticsearch-setup
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :docker:kafka-setup
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :docker:mysql-setup
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :docker:postgres-setup
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :metadata-jobs:mae-consumer-job
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :metadata-jobs:mce-consumer-job
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :metadata-service:configuration
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT

> Configure project :metadata-service:war
fullVersion=v0.15.0rc3-62-g5946558
cliMajorVersion=0.15.0rc3
version=0.15.0rc3-SNAPSHOT
[Incubating] Problems report is available at: file:///Users/phoenix/datahub/build/reports/problems/problems-report.html

FAILURE: Build failed with an exception.

* What went wrong:
Task 'clear' not found in root project 'datahub' and its subprojects. Some candidates are: 'clean'.

* Try:
> Run gradlew tasks to get a list of available tasks.
> For more on name expansion, please refer to https://docs.gradle.org/8.11.1/userguide/command_line_interface.html#sec:name_abbreviation in the Gradle documentation.
> Run with --stacktrace option to get the stack trace.
> Run with --info or --debug option to get more log output.
> Run with --scan to get full insights.
> Get more help at https://help.gradle.org.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.11.1/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD FAILED in 8m 4s
2 actionable tasks: 2 up-to-date
(base) phoenix@phoenixdeMacBook-Pro datahub % 

Starting the frontend locally (the other components can run in Docker):

If you run into problems, enter the corresponding virtual environment:

python3 -m venv venv
source venv/bin/activate

python3 -m pip install --upgrade pip wheel setuptools

Install the required package on its own: python3 -m pip install pyarrow==11.0.0

/opt/homebrew/opt/python@3.13/bin/python3.13 -m venv /Users/phoenix/datahub/metadata-ingestion/venv
/Users/phoenix/datahub/metadata-ingestion/venv/bin/python -m pip install pyarrow==11.0.0

Downloading the i18n branch from Git:

Source code:

datahub/metadata-service at feature/ing-623 · datahub-project/datahub · GitHub

Documentation:

GitHub - luizhsalazar/datahub at feature/i18n-support

https://blog.datahubproject.io/how-we-implemented-internationalization-in-datahub-d3e9f6349a6a

Start the frontend locally:

cd datahub-frontend/run && ./run-local-frontend

Install from source:

DataHub CLI | DataHub

III. Plugin installation

Install plugins:

Check the DataHub plugins: python3 -m datahub check plugins

Install a plugin: python3 -m pip install 'acryl-datahub[postgres]'

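(Note: this installs the plugin into the local environment. The Docker containers also install it when they run, but network problems often cause timeout errors; rerun the install until it succeeds.)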
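Rerunning by hand after each timeout gets tedious. A minimal sketch of a retry helper; the function name retry and its argument order (attempts, delay in seconds, then the command) are choices made for this example:

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Usage:
#   retry 5 10 python3 -m pip install 'acryl-datahub[postgres]'
```

A fixed delay keeps the sketch short; a growing delay would be gentler on a flaky mirror.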

/opt/homebrew/anaconda3/bin/python3.12 -m datahub check plugins;

/opt/homebrew/anaconda3/bin/python3.12 -m pip install 'acryl-datahub[postgres]'

References:

Datahub部署 | Datahub中文社区 (DataHub deployment, DataHub Chinese community)

Plugin installation reference:

安装datahub - 编程好6博客 (installing DataHub, blog post)
