Atlas开发环境部署和生产环境部署
一、Atlas使用演示
1.1 概述
Atlas功能:提供数据管理和治理功能,用以构建其数据资产目录,对这些资产进行分类管理,形成数据字典。
特点:
1、提供强大的索引,能够对不同源不同库下的表进行快速的检索
2、提供血缘关系图,可以清晰地看到表与表之间的来龙去脉,血缘关系做到了字段级别
3、提供分类、术语划分等对元数据再一次进行分类分组管理
4、对单个表的信息展示十分详细
5、初始化导入元数据后,后面无需二次操作便可自动增量同步元数据信息到Atlas
1.2 使用演示
1.3 架构概述

hive、sqoop、hbase等为不同的数据源,目前支持Hive、HBase、Sqoop、Storm、Kafka数据源
kafka为消息中间件,用来将数据源的元数据增量同步到Atlas中
atlas核心架构有三大模块:
ingest/export:导入导出元数据信息。导入指将其他数据源的元数据导入到Atlas;导出指将Atlas中的元数据导出到HBase,Atlas本身不存储数据。导入时是增量导入,依赖kafka做消息队列。以hive-->atlas为例,通过部署hive-hook,由hive的hook程序将其元数据信息导入到atlas中
type system:类型系统模块,对元数据进行分类,包括按数据源、库、表、操作语句、路径等维度进行分类
graph engine:图引擎,主要维护表与表之间、字段与字段之间的血缘依赖图。这部分数据通过图数据库JanusGraph转换成k-v形式,最终存储到HBase中
HBase为Atlas的导出目标(存储后端),存储元数据信息
solr类似于ElasticSearch,负责建索引,提供全局检索功能,称为索引库
对外提供http协议、rest方式的api接口,方便做二次开发
1.4 源码各模块作用
各模块作用:
二、addons
扩展组件源代码,主要是Atlas接入各种Hadoop生态元数据数据源的桥接代码,对应Atlas架构图中的数据源接入部分。
比如:hive-bridge
hive桥接扩展模块,通过bin目录下的import-hive.sh脚本导入hive元数据到Atlas系统,脚本调用了桥接代码类HiveMetaStoreBridge
三、authorization
Atlas鉴权模块,支持Simple鉴权和Ranger鉴权两种方式,这个模块的详细介绍说明和使用说明见官方文档:Apache Atlas – Data Governance and Metadata framework for Hadoop
五、client
客户端API代码
client-v2包:V2版本客户端API代码,客户端调用Atlas API接口时可以直接调用这里封装的API接口方法,减轻代码开发工作量(调用示例见本节末尾)
七、dashboardv3
Atlas管理台UI前端应用,对应架构图中的Admin UI
九、distro
Atlas发行包(distribution)打包与部署相关的配置文件
pom.xml中提供了一些图数据库存储Hbase和图数据库索引检索组件solr的默认配置
bin目录下提供atlas基本安装部署的Python脚本文件,比如启动、停止atlas服务等
conf下提供Atlas配置文件
另外主要有:Atlas应用配置文件atlas-application.properties、Atlas环境变量配置文件atlas-env.sh、日志配置文件atlas-log4j.xml,鉴权策略配置文件atlas-simple-authz-policy.json、用户认证配置文件users-credentials.properties
main/assemblies目录下是打包相关的描述符配置文件
十九、webapp
Atlas Web应用模块
其中Atlas类:Atlas单机部署启动服务驱动类
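针对上面client模块提到的client-v2,给出一个最简的调用示意(假设Atlas运行在http://localhost:21000,账号密码为admin/admin,GUID为占位值,仅演示封装好的API接口方法的用法):
import org.apache.atlas.AtlasClientV2;
import org.apache.atlas.AtlasServiceException;
import org.apache.atlas.model.SearchFilter;
import org.apache.atlas.model.instance.AtlasEntity.AtlasEntityWithExtInfo;
import org.apache.atlas.model.typedef.AtlasTypesDef;

public class AtlasClientV2Demo {
    public static void main(String[] args) throws AtlasServiceException {
        //连接Atlas(地址、账号密码按实际环境修改)
        AtlasClientV2 client = new AtlasClientV2(
                new String[]{"http://localhost:21000"},
                new String[]{"admin", "admin"});

        //查询全部类型定义
        AtlasTypesDef typesDef = client.getAllTypeDefs(new SearchFilter());
        System.out.println("entityDef数量:" + typesDef.getEntityDefs().size());

        //按GUID查询实体(GUID为占位值,替换为实际实体的GUID)
        AtlasEntityWithExtInfo entity = client.getEntityByGuid("<entity-guid>");
        System.out.println(entity.getEntity().getTypeName());
    }
}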
二、Atlas 本地编译部署
1、下载源码
https://atlas.apache.org/#/Downloads (选择不同版本下载source源码,本次是2.2.0最新稳定版本)
2、编译源码
1、配置MAVEN_OPTS环境变量,增大maven可用内存,否则在编译webapp模块时会因内存不足而报错
MAVEN_OPTS=-Xms512m -Xmx1024m
2、在主pom(apache-atlas根pom.xml)中,如果以下版本校验配置报错,则改为指定本机实际安装的版本:
<requireMavenVersion>
    <version>3.8.4</version> <!-- 指定具体的版本 -->
</requireMavenVersion>
<requireJavaVersion>
    <level>ERROR</level>
    <version>1.8.0_45</version> <!-- 指定具体的版本 -->
</requireJavaVersion>
<requireJavaVersion>
    <level>WARN</level>
    <version>1.8.0_45</version> <!-- 指定具体的版本 -->
</requireJavaVersion>
同时将atlas-buildtools依赖改为2.2.0版本:
<dependency>
    <groupId>org.apache.atlas</groupId>
    <artifactId>atlas-buildtools</artifactId>
    <version>2.2.0</version> <!-- 改为2.2.0版本 -->
</dependency>
并注释掉以下配置:
<!-- <configuration>-->
<!--     <deployAtEnd>true</deployAtEnd>-->
<!-- </configuration>-->
3、编译storm-bridge和storm-bridge-shim模块时,可能因为需要下载外部资源而失败。这两个只是数据源接入模块,可以跳过不编译,不影响服务的正常运行;同时要把其他地方引用它们的位置注释掉:
对于distro模块的pom.xml(atlas-distro):
<!-- <dependencies>-->
<!-- <dependency>-->
<!-- <groupId>org.apache.atlas</groupId>-->
<!-- <artifactId>storm-bridge</artifactId>--> #注掉
<!-- </dependency>-->
<!-- </dependencies>-->
以及
<artifactId>maven-assembly-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>single</goal>
</goals>
<phase>package</phase>
<configuration>
<skipAssembly>false</skipAssembly>
<descriptors>
<!--<descriptor>src/main/assemblies/atlas-storm-hook-package.xml</descriptor>--> #注掉
对于apache-atlas根pom.xml:
<!-- <module>addons/storm-bridge-shim</module>--> #注掉
<!-- <module>addons/storm-bridge</module>-->
以及
<!-- <dependency>--> #注掉
<!-- <groupId>org.apache.atlas</groupId>-->
<!-- <artifactId>storm-bridge</artifactId>-->
<!-- <version>${project.version}</version>-->
<!-- </dependency>-->
以及
<!-- <dependency>--> #注掉
<!-- <groupId>org.apache.atlas</groupId>-->
<!-- <artifactId>storm-bridge-shim</artifactId>-->
<!-- <version>${project.version}</version>-->
<!-- </dependency>-->
最后执行编译命令:
使用外部模式编译(hbase和solr使用外部,本次编译使用外部方式)
mvn clean -DskipTests package -Pdist
编译成功后:
[INFO] Apache Atlas Server Build Tools .................... SUCCESS [ 1.066 s]
[INFO] apache-atlas ....................................... SUCCESS [ 4.884 s]
[INFO] Apache Atlas Integration ........................... SUCCESS [ 9.549 s]
[INFO] Apache Atlas Test Utility Tools .................... SUCCESS [ 3.712 s]
[INFO] Apache Atlas Common ................................ SUCCESS [ 4.260 s]
[INFO] Apache Atlas Client ................................ SUCCESS [ 0.483 s]
[INFO] atlas-client-common ................................ SUCCESS [ 1.448 s]
[INFO] atlas-client-v1 .................................... SUCCESS [ 2.249 s]
[INFO] Apache Atlas Server API ............................ SUCCESS [ 2.715 s]
[INFO] Apache Atlas Notification .......................... SUCCESS [ 5.100 s]
[INFO] atlas-client-v2 .................................... SUCCESS [ 1.530 s]
[INFO] Apache Atlas Graph Database Projects ............... SUCCESS [ 0.470 s]
[INFO] Apache Atlas Graph Database API .................... SUCCESS [ 1.720 s]
[INFO] Graph Database Common Code ......................... SUCCESS [ 1.989 s]
[INFO] Apache Atlas JanusGraph-HBase2 Module .............. SUCCESS [ 1.608 s]
[INFO] Apache Atlas JanusGraph DB Impl .................... SUCCESS [ 7.949 s]
[INFO] Apache Atlas Graph DB Dependencies ................. SUCCESS [ 1.929 s]
[INFO] Apache Atlas Authorization ......................... SUCCESS [ 3.133 s]
[INFO] Apache Atlas Repository ............................ SUCCESS [ 20.009 s]
[INFO] Apache Atlas UI .................................... SUCCESS [01:53 min]
[INFO] Apache Atlas New UI ................................ SUCCESS [01:48 min]
[INFO] Apache Atlas Web Application ....................... SUCCESS [04:45 min]
[INFO] Apache Atlas Documentation ......................... SUCCESS [ 4.515 s]
[INFO] Apache Atlas FileSystem Model ...................... SUCCESS [ 5.743 s]
[INFO] Apache Atlas Plugin Classloader .................... SUCCESS [ 3.340 s]
[INFO] Apache Atlas Hive Bridge Shim ...................... SUCCESS [ 5.995 s]
[INFO] Apache Atlas Hive Bridge ........................... SUCCESS [ 20.976 s]
[INFO] Apache Atlas Falcon Bridge Shim .................... SUCCESS [ 2.760 s]
[INFO] Apache Atlas Falcon Bridge ......................... SUCCESS [ 9.204 s]
[INFO] Apache Atlas Sqoop Bridge Shim ..................... SUCCESS [ 0.859 s]
[INFO] Apache Atlas Sqoop Bridge .......................... SUCCESS [ 15.889 s]
[INFO] Apache Atlas Hbase Bridge Shim ..................... SUCCESS [ 5.268 s]
[INFO] Apache Atlas Hbase Bridge .......................... SUCCESS [ 16.576 s]
[INFO] Apache HBase - Testing Util ........................ SUCCESS [ 11.497 s]
[INFO] Apache Atlas Kafka Bridge .......................... SUCCESS [ 7.014 s]
[INFO] Apache Atlas classification updater ................ SUCCESS [ 3.903 s]
[INFO] Apache Atlas index repair tool ..................... SUCCESS [ 5.946 s]
[INFO] Apache Atlas Impala Hook API ....................... SUCCESS [ 0.860 s]
[INFO] Apache Atlas Impala Bridge Shim .................... SUCCESS [ 1.138 s]
[INFO] Apache Atlas Impala Bridge ......................... SUCCESS [ 14.205 s]
[INFO] Apache Atlas Distribution .......................... SUCCESS [01:23 min]
[INFO] atlas-examples ..................................... SUCCESS [ 0.452 s]
[INFO] sample-app ......................................... SUCCESS [ 1.933 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 13:25 min
[INFO] Finished at: 2022-06-10T11:05:05+08:00
[INFO] ------------------------------------------------------------------------
使用内嵌hbase和solr模式编译:mvn clean -DskipTests package -Pdist,embedded-hbase-solr
注:1.内嵌模式会从apache官网下载hbase和solr,如果网络比较慢,可以将对应版本的安装包直接放到对应目录下
2.使用内嵌模式可能会因为windows下无法启动hadoop,导致无法启动hbase,最后服务启动失败
3、开发环境下Atlas配置
编译成功之后,会将编译生成的文件放到\apache-atlas-sources-2.2.0\distro 目录下
任意位置新建目录,如:D:\ext.wangwentao5\all_paojects\atlas-deploy2
在这个目录下新建目录结构如下:
atlas-deploy2:
conf
data
hbase
logs
models
solr
webapp
然后:
将\apache-atlas-sources-2.2.0\distro\target\conf 下的文件全部复制到atlas-deploy2\conf中
将\apache-atlas-sources-2.2.0\webapp\target\web-app2.2.0下的文件复制到\atlas-deploy2\webapp\atlas下
配置Atlas:
注:由于windows环境下安装部署大数据组件Hadoop、Hbase等有诸多限制,所以使用外部hbase和solr的方式,在linux平台启动这些组件
在源码中apache-atlas-sources-2.2.0\distro\src\conf 和自建文件夹\atlas-deploy2\conf中均进行一样的修改:
修改atlas-application.properties
(注:192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181为外部虚拟机中的zookeeper地址)
-1-
atlas.graph.storage.hostname=192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181
-2-
atlas.graph.index.search.solr.zookeeper-url=192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181
-3-
atlas.kafka.data=/opt/apps/kafka_2.11-2.4.1/data ##linux中的Kafka数据存储路径
atlas.kafka.zookeeper.connect=192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181/kafka
atlas.kafka.bootstrap.servers=192.168.182.10:9092,192.168.182.20:9092,192.168.182.30:9092
-4-端口21000
atlas.server.http.port=21000
-5-ip地址
atlas.rest.address=http://localhost:21000
-6-关闭每次启动时初始化atlas
atlas.server.run.setup.on.start=false
-7-hbase元数据对应的zk地址
atlas.audit.hbase.zookeeper.quorum=192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181
修改atlas-env.sh
export MANAGE_LOCAL_HBASE=false ##关闭内嵌hbase模式
export MANAGE_LOCAL_SOLR=false ##关闭内嵌solr模式
export MANAGE_EMBEDDED_CASSANDRA=false ##指示 cassandra 是否是 Atlas 的嵌入式后端
export MANAGE_LOCAL_ELASTICSEARCH=false ##指示是否应该为 Atlas 启动 Elasticsearch 的本地实例
export HBASE_CONF_DIR=/opt/apps/hbase-2.0.5/conf ##linux中hbase的conf地址
atlas-application.properties 文件:
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
######### Graph Database Configs #########
# Graph Database
#Configures the graph database to use. Defaults to JanusGraph
#atlas.graphdb.backend=org.apache.atlas.repository.graphdb.janus.AtlasJanusGraphDatabase
# Graph Storage
# Set atlas.graph.storage.backend to the correct value for your desired storage
# backend. Possible values:
#
# hbase
# cassandra
# embeddedcassandra - Should only be set by building Atlas with -Pdist,embedded-cassandra-solr
# berkeleyje
#
# See the configuration documentation for more information about configuring the various storage backends.
#
atlas.graph.storage.backend=hbase2
atlas.graph.storage.hbase.table=apache_atlas_janus
#Hbase
#For standalone mode , specify localhost
#for distributed mode, specify zookeeper quorum here
atlas.graph.storage.hostname=192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181
atlas.graph.storage.hbase.regions-per-server=1
# Gremlin Query Optimizer
#
# Enables rewriting gremlin queries to maximize performance. This flag is provided as
# a possible way to work around any defects that are found in the optimizer until they
# are resolved.
#atlas.query.gremlinOptimizerEnabled=true
# Delete handler
#
# This allows the default behavior of doing "soft" deletes to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.store.graph.v1.SoftDeleteHandlerV1 - all deletes are "soft" deletes
# org.apache.atlas.repository.store.graph.v1.HardDeleteHandlerV1 - all deletes are "hard" deletes
#
#atlas.DeleteHandlerV1.impl=org.apache.atlas.repository.store.graph.v1.SoftDeleteHandlerV1
# Entity audit repository
#
# This allows the default behavior of logging entity changes to hbase to be changed.
#
# Allowed Values:
# org.apache.atlas.repository.audit.HBaseBasedAuditRepository - log entity changes to hbase
# org.apache.atlas.repository.audit.CassandraBasedAuditRepository - log entity changes to cassandra
# org.apache.atlas.repository.audit.NoopEntityAuditRepository - disable the audit repository
#
atlas.EntityAuditRepository.impl=org.apache.atlas.repository.audit.HBaseBasedAuditRepository
# if Cassandra is used as a backend for audit from the above property, uncomment and set the following
# properties appropriately. If using the embedded cassandra profile, these properties can remain
# commented out.
# atlas.EntityAuditRepository.keyspace=atlas_audit
# atlas.EntityAuditRepository.replicationFactor=1
# Graph Search Index
atlas.graph.index.search.backend=solr
#Solr
#Solr cloud mode properties
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181
atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
atlas.graph.index.search.solr.zookeeper-session-timeout=60000
atlas.graph.index.search.solr.wait-searcher=false
#Solr http mode properties
#atlas.graph.index.search.solr.mode=http
#atlas.graph.index.search.solr.http-urls=http://localhost:8983/solr
# Solr-specific configuration property
atlas.graph.index.search.max-result-set-size=150
######### Import Configs #########
#atlas.import.temp.directory=/temp/import
######### Notification Configs #########
atlas.notification.embedded=false
atlas.kafka.data=/opt/apps/kafka_2.11-2.4.1/data
atlas.kafka.zookeeper.connect=192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181/kafka
atlas.kafka.bootstrap.servers=192.168.182.10:9092,192.168.182.20:9092,192.168.182.30:9092
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.connection.timeout.ms=200
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.hook.group.id=atlas
atlas.kafka.enable.auto.commit=false
atlas.kafka.auto.offset.reset=earliest
atlas.kafka.session.timeout.ms=30000
atlas.kafka.offsets.topic.replication.factor=1
atlas.kafka.poll.timeout.ms=1000
atlas.notification.create.topics=true
atlas.notification.replicas=1
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
atlas.notification.log.failed.messages=true
atlas.notification.consumer.retry.interval=500
atlas.notification.hook.retry.interval=1000
# Enable for Kerberized Kafka clusters
#atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
#atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab
## Server port configuration
#atlas.server.http.port=21000
#atlas.server.https.port=21443
######### Security Properties #########
# SSL config
atlas.enableTLS=false
#truststore.file=/path/to/truststore.jks
#cert.stores.credential.provider.path=jceks://file/path/to/credentialstore.jceks
#following only required for 2-way SSL
#keystore.file=/path/to/keystore.jks
# Authentication config
atlas.authentication.method.kerberos=false
atlas.authentication.method.file=true
#### ldap.type= LDAP or AD
atlas.authentication.method.ldap.type=none
#### user credentials file
atlas.authentication.method.file.filename=${sys:atlas.home}/conf/users-credentials.properties
### groups from UGI
#atlas.authentication.method.ldap.ugi-groups=true
######## LDAP properties #########
#atlas.authentication.method.ldap.url=ldap://<ldap server url>:389
#atlas.authentication.method.ldap.userDNpattern=uid={0},ou=People,dc=example,dc=com
#atlas.authentication.method.ldap.groupSearchBase=dc=example,dc=com
#atlas.authentication.method.ldap.groupSearchFilter=(member=uid={0},ou=Users,dc=example,dc=com)
#atlas.authentication.method.ldap.groupRoleAttribute=cn
#atlas.authentication.method.ldap.base.dn=dc=example,dc=com
#atlas.authentication.method.ldap.bind.dn=cn=Manager,dc=example,dc=com
#atlas.authentication.method.ldap.bind.password=<password>
#atlas.authentication.method.ldap.referral=ignore
#atlas.authentication.method.ldap.user.searchfilter=(uid={0})
#atlas.authentication.method.ldap.default.role=<default role>
######### Active directory properties #######
#atlas.authentication.method.ldap.ad.domain=example.com
#atlas.authentication.method.ldap.ad.url=ldap://<AD server url>:389
#atlas.authentication.method.ldap.ad.base.dn=(sAMAccountName={0})
#atlas.authentication.method.ldap.ad.bind.dn=CN=team,CN=Users,DC=example,DC=com
#atlas.authentication.method.ldap.ad.bind.password=<password>
#atlas.authentication.method.ldap.ad.referral=ignore
#atlas.authentication.method.ldap.ad.user.searchfilter=(sAMAccountName={0})
#atlas.authentication.method.ldap.ad.default.role=<default role>
######### JAAS Configuration ########
#atlas.jaas.KafkaClient.loginModuleName = com.sun.security.auth.module.Krb5LoginModule
#atlas.jaas.KafkaClient.loginModuleControlFlag = required
#atlas.jaas.KafkaClient.option.useKeyTab = true
#atlas.jaas.KafkaClient.option.storeKey = true
#atlas.jaas.KafkaClient.option.serviceName = kafka
#atlas.jaas.KafkaClient.option.keyTab = /etc/security/keytabs/atlas.service.keytab
#atlas.jaas.KafkaClient.option.principal = atlas/_HOST@EXAMPLE.COM
######### Server Properties #########
atlas.rest.address=http://192.168.182.10:21000
# If enabled and set to true, this will run setup steps when the server starts
atlas.server.run.setup.on.start=false
######### Entity Audit Configs #########
atlas.audit.hbase.tablename=apache_atlas_entity_audit
atlas.audit.zookeeper.session.timeout.ms=1000
atlas.audit.hbase.zookeeper.quorum=192.168.182.10:2181,192.168.182.20:2181,192.168.182.30:2181
######### High Availability Configuration ########
atlas.server.ha.enabled=false
#### Enabled the configs below as per need if HA is enabled #####
#atlas.server.ids=id1
#atlas.server.address.id1=localhost:21000
#atlas.server.ha.zookeeper.connect=localhost:2181
#atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
#atlas.server.ha.zookeeper.num.retries=3
#atlas.server.ha.zookeeper.session.timeout.ms=20000
## if ACLs need to be set on the created nodes, uncomment these lines and set the values ##
#atlas.server.ha.zookeeper.acl=<scheme>:<id>
#atlas.server.ha.zookeeper.auth=<scheme>:<authinfo>
######### Atlas Authorization #########
atlas.authorizer.impl=simple
atlas.authorizer.simple.authz.policy.file=atlas-simple-authz-policy.json
######### Type Cache Implementation ########
# A type cache class which implements
# org.apache.atlas.typesystem.types.cache.TypeCache.
# The default implementation is org.apache.atlas.typesystem.types.cache.DefaultTypeCache which is a local in-memory type cache.
#atlas.TypeCache.impl=
######### Performance Configs #########
#atlas.graph.storage.lock.retries=10
#atlas.graph.storage.cache.db-cache-time=120000
######### CSRF Configs #########
atlas.rest-csrf.enabled=true
atlas.rest-csrf.browser-useragents-regex=^Mozilla.*,^Opera.*,^Chrome.*
atlas.rest-csrf.methods-to-ignore=GET,OPTIONS,HEAD,TRACE
atlas.rest-csrf.custom-header=X-XSRF-HEADER
############ KNOX Configs ################
#atlas.sso.knox.browser.useragent=Mozilla,Chrome,Opera
#atlas.sso.knox.enabled=true
#atlas.sso.knox.providerurl=https://<knox gateway ip>:8443/gateway/knoxsso/api/v1/websso
#atlas.sso.knox.publicKey=
############ Atlas Metric/Stats configs ################
# Format: atlas.metric.query.<key>.<name>
atlas.metric.query.cache.ttlInSecs=900
#atlas.metric.query.general.typeCount=
#atlas.metric.query.general.typeUnusedCount=
#atlas.metric.query.general.entityCount=
#atlas.metric.query.general.tagCount=
#atlas.metric.query.general.entityDeleted=
#
#atlas.metric.query.entity.typeEntities=
#atlas.metric.query.entity.entityTagged=
#
#atlas.metric.query.tags.entityTags=
######### Compiled Query Cache Configuration #########
# The size of the compiled query cache. Older queries will be evicted from the cache
# when we reach the capacity.
#atlas.CompiledQueryCache.capacity=1000
# Allows notifications when items are evicted from the compiled query
# cache because it has become full. A warning will be issued when
# the specified number of evictions have occurred. If the eviction
# warning threshold <= 0, no eviction warnings will be issued.
#atlas.CompiledQueryCache.evictionWarningThrottle=0
######### Full Text Search Configuration #########
#Set to false to disable full text search.
#atlas.search.fulltext.enable=true
######### Gremlin Search Configuration #########
#Set to false to disable gremlin search.
atlas.search.gremlin.enable=false
########## Add http headers ###########
#atlas.headers.Access-Control-Allow-Origin=*
#atlas.headers.Access-Control-Allow-Methods=GET,OPTIONS,HEAD,PUT,POST
#atlas.headers.<headerName>=<headerValue>
######### UI Configuration ########
atlas.ui.default.version=v1
atlas-env.sh 文件:
#!/usr/bin/env bash
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# The java implementation to use. If JAVA_HOME is not found we expect java and jar to be in path
#export JAVA_HOME=
# any additional java opts you want to set. This will apply to both client and server operations
#export ATLAS_OPTS=
# any additional java opts that you want to set for client only
#export ATLAS_CLIENT_OPTS=
# java heap size we want to set for the client. Default is 1024MB
#export ATLAS_CLIENT_HEAP=
# any additional opts you want to set for atlas service.
#export ATLAS_SERVER_OPTS=
# indicative values for large number of metadata entities (equal or more than 10,000s)
#export ATLAS_SERVER_OPTS="-server -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dumps/atlas_server.hprof -Xloggc:logs/gc-worker.log -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps"
# java heap size we want to set for the atlas server. Default is 1024MB
#export ATLAS_SERVER_HEAP=
# indicative values for large number of metadata entities (equal or more than 10,000s) for JDK 8
#export ATLAS_SERVER_HEAP="-Xms15360m -Xmx15360m -XX:MaxNewSize=5120m -XX:MetaspaceSize=100M -XX:MaxMetaspaceSize=512m"
# What is is considered as atlas home dir. Default is the base locaion of the installed software
#export ATLAS_HOME_DIR=
# Where log files are stored. Defatult is logs directory under the base install location
#export ATLAS_LOG_DIR=
# Where pid files are stored. Defatult is logs directory under the base install location
#export ATLAS_PID_DIR=
# where the atlas titan db data is stored. Defatult is logs/data directory under the base install location
#export ATLAS_DATA_DIR=
# Where do you want to expand the war file. By Default it is in /server/webapp dir under the base install dir.
#export ATLAS_EXPANDED_WEBAPP_DIR=
# indicates whether or not a local instance of HBase should be started for Atlas
export MANAGE_LOCAL_HBASE=false
# indicates whether or not a local instance of Solr should be started for Atlas
export MANAGE_LOCAL_SOLR=false
# indicates whether or not cassandra is the embedded backend for Atlas
export MANAGE_EMBEDDED_CASSANDRA=false
# indicates whether or not a local instance of Elasticsearch should be started for Atlas
export MANAGE_LOCAL_ELASTICSEARCH=false
4、启动Atlas所需组件
注:要提前在机器中安装配置好所需组件,包括CentOS环境、jdk、hadoop、mysql、zookeeper、kafka、hbase、solr等
启动 hadoop
启动yarn
启动zk
启动kafka
启动hbase
启动solr
查看端口验证服务是否全都正常开启:
=============== linux01 ===============
27842 NameNode
30035 HRegionServer
28646 NodeManager
29115 QuorumPeerMain
29531 Kafka
29851 HMaster
28510 ResourceManager
30591 jar
=============== linux02 ===============
12321 HRegionServer
12165 Kafka
11766 QuorumPeerMain
12679 jar
11080 SecondaryNameNode
11003 DataNode
11435 NodeManager
=============== linux03 ===============
5008 NodeManager
4833 DataNode
5601 Kafka
5766 HRegionServer
5209 QuorumPeerMain
6124 jar
5、本地源码启动Atlas
服务驱动类:webapp模块下:org.apache.atlas.Atlas
配置VM参数:
-Datlas.home=D:\ext.wangwentao5\all_paojects\atlas-deploy2
-Datlas.conf=D:\ext.wangwentao5\all_paojects\atlas-deploy2\conf
-Datlas.data=D:\ext.wangwentao5\all_paojects\atlas-deploy2\data
-Datlas.log.dir=D:\ext.wangwentao5\all_paojects\atlas-deploy2\logs
-Dlog4j.configuration=atlas-log4j.xml
-Djava.net.preferIPv4Stack=true
配置args参数:
--port 22000
--app
D:\ext.wangwentao5\all_paojects\atlas-deploy2\webapp\atlas
启动驱动服务类...................
web页面访问:localhost:21000
账号密码都是admin
至此开发环境部署完毕....................
6、接口调用演示
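这里给出一个接口调用的最简示意(假设Atlas地址为http://localhost:21000,账号密码为admin/admin):用JDK自带的HttpURLConnection,以Basic认证方式请求GET /api/atlas/v2/types/typedefs查询全部类型定义。
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AtlasRestDemo {
    public static void main(String[] args) throws Exception {
        //查询全部类型定义的REST接口
        URL url = new URL("http://localhost:21000/api/atlas/v2/types/typedefs");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();

        //Basic认证(账号密码按实际环境修改)
        String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("Accept", "application/json");

        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line);
            }
            System.out.println("HTTP " + conn.getResponseCode());
            System.out.println(body);
        }
    }
}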
三、生产环境部署
前提:在集群环境中已安装:jdk、hadoop、mysql、zookeeper、kafka、hbase
1、solr安装
使用新用户启动solr
为了安全,创建新用户solr,使用solr用户来安装启动solr
创建用户solr:sudo useradd solr
给用户solr设置密码solr:echo solr | sudo passwd --stdin solr
上传solr安装包-解压
修改solr组件的用户为solr
sudo chown -R solr:solr /opt/apps/solr-7.7.3/
集群安装:修改配置-最后分发
bin/solr.in.sh
ZK_HOST="linux01:2181,linux02:2181,linux03:2181"
分发之后修改用户:sudo chown -R solr:solr /opt/apps/solr-7.7.3/
启动:
以solr用户启动solr 。每个节点都要启动
sudo -i -u solr /opt/apps/solr-7.7.3/bin/solr start
启动成功验证:
出现: Started Solr server on port 8983 (pid=20045). Happy searching!
网页端口 8983
2、atlas安装
atlas不提供安装包,只提供源码包,所以要自己进行编译,使用编译后的安装包进行安装
atlas编译后,会有很多的安装包,其中带hook的是与其他组件进行整合的连接包
在distro的target目录下找到server包进行解压安装
3、Atlas配置
3.1 atlas—Hbase
atlas中的数据存储在hbase中
修改conf/atlas-application.properties
atlas将图数据等元数据存在hbase中,而hbase的地址信息注册在zk中,所以这里配置zk地址
atlas.graph.storage.hostname=linux01:2181,linux02:2181,linux03:2181
vi conf/atlas-env.sh
export HBASE_CONF_DIR=/opt/apps/hbase-2.0.5/conf
3.2 atlas—solr
solr存储atlas图数据的索引信息,比如血缘图的检索索引
修改conf/atlas-application.properties,让atlas能通过zk找到solr
atlas.graph.index.search.solr.zookeeper-url=linux01:2181,linux02:2181,linux03:2181
atlas会将图数据的索引信息存到solr
在solr中创建三个collection
创建vertex_index 三个分片 两个副本 点索引
sudo -i -u solr /opt/apps/solr-7.7.3/bin/solr create -c vertex_index -d /opt/apps/atlas-2.1.0/conf/solr -shards 3 -replicationFactor 2
创建edge_index 线索引
sudo -i -u solr /opt/apps/solr-7.7.3/bin/solr create -c edge_index -d /opt/apps/atlas-2.1.0/conf/solr -shards 3 -replicationFactor 2
创建fulltext_index 全局索引
sudo -i -u solr /opt/apps/solr-7.7.3/bin/solr create -c fulltext_index -d /opt/apps/atlas-2.1.0/conf/solr -shards 3 -replicationFactor 2
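collection创建完成后,可以调用Solr的Collections API做一次简单校验,确认vertex_index、edge_index、fulltext_index三个collection都已创建。下面是一个最简校验示意(Solr地址linux01:8983沿用前文部署,默认未开启认证):
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class SolrCollectionCheck {
    public static void main(String[] args) throws Exception {
        //列出所有collection,返回的JSON中应包含vertex_index、edge_index、fulltext_index
        URL url = new URL("http://linux01:8983/solr/admin/collections?action=LIST&wt=json");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}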
3.3 atlas—kafka
增量式数据同步需要kafka作为消息中间件
修改conf/atlas-application.properties
atlas.notification.embedded=false
#kafka的数据存储路径
atlas.kafka.data=/opt/apps/kafka_2.11-2.4.1/data
#kafka的zk连接地址
atlas.kafka.zookeeper.connect=linux01:2181,linux02:2181,linux03:2181/kafka
#kafka的连接地址
atlas.kafka.bootstrap.servers=linux01:9092,linux02:9092,linux03:9092
3.4 atlas—server
修改conf/atlas-application.properties
改atlas的ip
atlas.rest.address=http://linux01:21000
取消每次开启初始化
atlas.server.run.setup.on.start=false
改zk地址
atlas.audit.hbase.zookeeper.quorum=linux01:2181,linux02:2181,linux03:2181
4、Atlas启动
启动 hadoop
启动zk
启动kafka
启动hbase
启动solr
在所有节点以solr用户执行:
sudo -i -u solr /opt/apps/solr-7.7.3/bin/solr start
上述都启动之后可以查看进程:
=============== linux01 ===============
27842 NameNode
30035 HRegionServer
28646 NodeManager
29115 QuorumPeerMain
29531 Kafka
29851 HMaster
28510 ResourceManager
30591 jar
=============== linux02 ===============
12321 HRegionServer
12165 Kafka
11766 QuorumPeerMain
12679 jar
11080 SecondaryNameNode
11003 DataNode
11435 NodeManager
=============== linux03 ===============
5008 NodeManager
4833 DataNode
5601 Kafka
5766 HRegionServer
5209 QuorumPeerMain
6124 jar
最后才可以启动atlas
/opt/apps/atlas-2.1.0/bin/atlas_start.py
启动会比较慢 会一直打印..........
直到看到Apache Atlas Server started!!! 则atlas成功启动
显示成功后还要再等待一会,web才可以访问
atlas启动的报错信息可在atlas的logs目录下查看,如application.log
停止为 atlas_stop.py
访问atlas的webUI:
linux01:21000
初始默认账户密码都是 admin
四、Atlas同步Hive源元数据演示
4.1 配置Hive Hook
前提:Atlas正常启动运行;已安装Hive
在源码编译之后会在distro/target中生成各种数据源的hook压缩包,选择hive-hook进行解压配置
修改/opt/apps/atlas-2.1.0/conf/atlas-application.properties
添加如下信息:
######### Hive Hook Configs ########
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.queueSize=10000
atlas.cluster.name=primary
修改/opt/apps/hive-3.1.2/conf/hive-site.xml 配置 Hive Hook
添加如下信息:
<property>
<name>hive.exec.post.hooks</name>
<value>org.apache.atlas.hive.hook.HiveHook</value>
</property>
安装 hive hook
解压:在/opt/apps/software下
tar -zxvf apache-atlas-2.1.0-hive-hook.tar.gz
进到解压后的apache-atlas-hive-hook目录里面,将 hook 和 hook-bin 两个目录复制到atlas的安装目录下
scp -r ./* /opt/apps/atlas-2.1.0/
vi /opt/apps/hive-3.1.2/conf/hive-env.sh,修改使hive可以找到atlas的hook程序
export HIVE_AUX_JARS_PATH=/opt/apps/atlas-2.1.0/hook/hive
将/opt/apps/atlas-2.1.0/conf/atlas-application.properties 复制到 /opt/apps/hive-3.1.2/conf/
scp /opt/apps/atlas-2.1.0/conf/atlas-application.properties /opt/apps/hive-3.1.2/conf/
完成。。。
4.2 初始化同步hive元数据
执行hook-bin中的元数据导入脚本:
/opt/apps/atlas-2.1.0/hook-bin/import-hive.sh
输入用户名密码
看到:Hive Meta Data imported successfully!!! 则导入成功
之后便会自动将hive的元数据增量同步到Atlas中,不需要再执行初始化操作
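如果想验证增量同步链路是否打通,可以写一个简单的Kafka消费者订阅前文配置的ATLAS_HOOK主题:在hive中执行一条建表语句后,应能消费到hook发出的元数据变更消息。下面是一个最简示意(需要kafka-clients依赖,消费组名atlas-hook-peek为本文假设,Kafka地址沿用前文配置):
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AtlasHookTopicPeek {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "linux01:9092,linux02:9092,linux03:9092");
        props.put("group.id", "atlas-hook-peek");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("auto.offset.reset", "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            //订阅hive-hook写入的通知主题
            consumer.subscribe(Collections.singletonList("ATLAS_HOOK"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                //hook发出的元数据变更通知(JSON格式)
                System.out.println(record.value());
            }
        }
    }
}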
六、源码分析
6.1 架构概述:
(架构图及各模块作用与前文1.3、1.4节相同,此处不再重复。)
6.2 程序启动过程
//Atlas.java:单机部署启动服务驱动类
public static void main(String[] args) throws Exception {
    //解析args参数
    CommandLine cmd = parseArgs(args);
    //加载这个模块的配置文件
    PropertiesConfiguration buildConfiguration = new PropertiesConfiguration("atlas-buildinfo.properties");
    String appPath = "webapp/target/atlas-webapp-" + getProjectVersion(buildConfiguration);

    if (cmd.hasOption(APP_PATH)) {
        appPath = cmd.getOptionValue(APP_PATH);
    }

    setApplicationHome();
    //加载atlas主要的配置文件atlas-application.properties
    Configuration configuration = ApplicationProperties.get();
    final String enableTLSFlag = configuration.getString(SecurityProperties.TLS_ENABLED);
    final String appHost = configuration.getString(SecurityProperties.BIND_ADDRESS, EmbeddedServer.ATLAS_DEFAULT_BIND_ADDRESS);

    if (!isLocalAddress(InetAddress.getByName(appHost))) {
        String msg = "Failed to start Atlas server. Address " + appHost
                + " does not belong to this host. Correct configuration parameter: "
                + SecurityProperties.BIND_ADDRESS;
        LOG.error(msg);
        throw new IOException(msg);
    }

    //最终确定app端口port,SSL协议默认为false不启用
    final int appPort = getApplicationPort(cmd, enableTLSFlag, configuration);
    System.setProperty(AtlasConstants.SYSTEM_PROPERTY_APP_PORT, String.valueOf(appPort));
    final boolean enableTLS = isTLSEnabled(enableTLSFlag, appPort);
    configuration.setProperty(SecurityProperties.TLS_ENABLED, String.valueOf(enableTLS));

    showStartupInfo(buildConfiguration, enableTLS, appPort);

    //根据参数创建jetty服务server,最后开启服务server.start()
    server = EmbeddedServer.newServer(appHost, appPort, appPath, enableTLS);
    installLogBridge();
    server.start();
}
6.3 源元数据导入到Atlas过程
以Hive元数据为例:
配置Hive Hook.然后启动初始化导入脚本:import-hive.sh
脚本中调用的主类:
${JAVA_BIN}" ${JAVA_PROPERTIES} -cp "${CP}" org.apache.atlas.hive.bridge.HiveMetaStoreBridge
主要调用过程(由外到内依次调用):
importHiveMetadata
└── importDatabases
    └── importTables
//导入指定库下的所有表
private int importTables(AtlasEntity dbEntity, String databaseName, String tblName, final boolean failOnError) throws Exception {
    int tablesImported = 0;
    final List<String> tableNames;

    if (StringUtils.isEmpty(tblName)) {
        tableNames = hiveClient.getAllTables(databaseName);
    } else {
        tableNames = hiveClient.getTablesByPattern(databaseName, tblName);
    }

    if (!CollectionUtils.isEmpty(tableNames)) {
        LOG.info("Found {} tables to import in database {}", tableNames.size(), databaseName);
        try {
            for (String tableName : tableNames) {
                int imported = importTable(dbEntity, databaseName, tableName, failOnError);
                //...(后续的计数与异常处理省略)
            }
        } //...(catch与返回值处理省略)
    }
}
importTables对每张表调用importTable(dbEntity, databaseName, tableName, failOnError);importTable中会调用registerInstances(createTableProcess)对表进行注册实例,也就是对元数据进行分类。registerInstances的核心逻辑如下:
AtlasEntityWithExtInfo ret = null;
EntityMutationResponse response = atlasClientV2.createEntity(entity);
List<AtlasEntityHeader> createdEntities = response.getEntitiesByOperation(EntityMutations.EntityOperation.CREATE);

if (CollectionUtils.isNotEmpty(createdEntities)) {
    for (AtlasEntityHeader createdEntity : createdEntities) {
        if (ret == null) {
            ret = atlasClientV2.getEntityByGuid(createdEntity.getGuid());
            LOG.info("Created {} entity: name={}, guid={}", ret.getEntity().getTypeName(), ret.getEntity().getAttribute(ATTRIBUTE_QUALIFIED_NAME), ret.getEntity().getGuid());
            //...(后续省略)
        }
    }
}
client模块中封装了很多http接口,例如:
new API_V2(TYPEDEF_BY_NAME, HttpMethod.GET, Response.Status.OK);
new API_V2(TYPEDEF_BY_GUID, HttpMethod.GET, Response.Status.OK);
new API_V2(TYPEDEFS_API, HttpMethod.GET, Response.Status.OK);
new API_V2(TYPEDEFS_API + "headers", HttpMethod.GET, Response.Status.OK);
new API_V2(TYPEDEFS_API, HttpMethod.POST, Response.Status.OK);
new API_V2(TYPEDEFS_API, HttpMethod.PUT, Response.Status.OK);
new API_V2(TYPEDEFS_API, HttpMethod.DELETE, Response.Status.NO_CONTENT);
new API_V2(TYPEDEF_BY_NAME, HttpMethod.DELETE, Response.Status.NO_CONTENT);
初始化导入就是通过这些封装的HTTP POST请求,把库表元数据更新到Atlas。
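作为对照,这里给出一段直接用client-v2创建实体的最简示意(库名demo_db、集群名primary等属性均为假设值,hive_db类型的必填属性以实际加载的模型定义为准):
import java.util.List;
import org.apache.atlas.AtlasClientV2;
import org.apache.atlas.model.instance.AtlasEntity;
import org.apache.atlas.model.instance.AtlasEntity.AtlasEntityWithExtInfo;
import org.apache.atlas.model.instance.AtlasEntityHeader;
import org.apache.atlas.model.instance.EntityMutationResponse;
import org.apache.atlas.model.instance.EntityMutations;

public class CreateEntityDemo {
    public static void main(String[] args) throws Exception {
        AtlasClientV2 atlasClientV2 = new AtlasClientV2(
                new String[]{"http://localhost:21000"}, new String[]{"admin", "admin"});

        //构造一个hive_db类型的实体(属性取值为示例)
        AtlasEntity db = new AtlasEntity("hive_db");
        db.setAttribute("name", "demo_db");
        db.setAttribute("clusterName", "primary");
        db.setAttribute("qualifiedName", "demo_db@primary");

        //createEntity内部就是通过HTTP POST把实体写入Atlas
        EntityMutationResponse response = atlasClientV2.createEntity(new AtlasEntityWithExtInfo(db));
        List<AtlasEntityHeader> created = response.getEntitiesByOperation(EntityMutations.EntityOperation.CREATE);
        if (created != null && !created.isEmpty()) {
            System.out.println("created guid: " + created.get(0).getGuid());
        }
    }
}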
增量同步数据:通过kafka。hive端发生元数据变更时,由HiveHook调用run方法进行处理:
public void run(HookContext hookContext) throws Exception {
    if (LOG.isDebugEnabled()) {
        LOG.debug("==> HiveHook.run({})", hookContext.getOperationName());
    }

    try {
        //处理Hook上下文信息
        HiveOperation oper = OPERATION_MAP.get(hookContext.getOperationName());
        AtlasHiveHookContext context = new AtlasHiveHookContext(this, oper, hookContext, getKnownObjects(), isSkipTempTables());
        BaseHiveEvent event = null;

        //不同的操作类型,构造不同的事件进行处理
        switch (oper) {
            case CREATEDATABASE:
                event = new CreateDatabase(context);
                break;

            case DROPDATABASE:
                event = new DropDatabase(context);
                break;

            //...(其他操作类型省略,事件最终被转换成通知消息发送到kafka的ATLAS_HOOK主题)
        }
    } //...(后续省略)
}
七、分类传播 classification propagation
分类具有传播特性
例如:a+b=>q(表q由表a、表b加工得到,a、b是q的上游父表)
1、当父表打上分类标签时,与其有血缘关系的子表也会自动打上该标签。
此时传播过来的标签会带有propagated classifications标识。
如:a-1,b-1 则q-1
如:a-1,b-2 则q-1,2
2、当某个父表的分类标签被删除时,只要还有一条血缘关系在维护该标签(即另一个父表仍带有该标签),子表上的标签依旧在;当所有父表的该分类标签都清除后,子表上传播来的标签会自动清除
如:a-1,b-1 则q-1;清除a-1,则q-1;清除b-1,则q-1;清除a-1、b-1,则q上的标签清除
如:a-1,b-2 则q-1,2;清除a-1,则q-2;清除b-2,则q-1;清除a-1、b-2,则q上的标签清除
3、当子表和父表之间的血缘关系被破坏时,子表上传播来的分类标签也会被清除
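给实体打分类标签并开启传播,可以通过REST接口完成。下面是一个最简示意(假设已在Atlas中定义了名为PII的分类,guid为占位值,地址与账号密码沿用前文假设):调用POST /api/atlas/v2/entity/guid/{guid}/classifications,propagate=true表示该分类沿血缘向下游传播。
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AddClassificationDemo {
    public static void main(String[] args) throws Exception {
        String guid = "<父表实体的guid>"; //占位值,替换为实际实体的guid
        URL url = new URL("http://localhost:21000/api/atlas/v2/entity/guid/" + guid + "/classifications");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);

        String auth = Base64.getEncoder().encodeToString("admin:admin".getBytes(StandardCharsets.UTF_8));
        conn.setRequestProperty("Authorization", "Basic " + auth);
        conn.setRequestProperty("Content-Type", "application/json");

        //propagate=true:分类会沿血缘传播到下游实体
        String body = "[{\"typeName\":\"PII\",\"propagate\":true}]";
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("HTTP " + conn.getResponseCode()); //一般返回204表示添加成功
    }
}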
八、权限管理
8.1 登录认证
1、基于文件形式:
atlas-application.properties中配置
atlas.authentication.method.file=true
atlas.authentication.method.file.filename=${sys:atlas.home}/conf/users-credentials.properties
在users-credentials.properties中按照 用户名=用户组::密码的sha256值 的格式配置用户名密码即可
2、基于LDAP形式
atlas-application.properties中配置
atlas.authentication.method.ldap=true
atlas.authentication.method.ldap.type=ldap
然后设置这些属性
atlas.authentication.method.ldap.url=ldap://<Ldap server ip>:389
atlas.authentication.method.ldap.userDNpattern=uid={0},ou=users,dc=example,dc=com
atlas.authentication.method.ldap.groupSearchBase=dc=example,dc=com
atlas.authentication.method.ldap.groupSearchFilter=(member=cn={0},ou=users,dc=example,dc=com)
atlas.authentication.method.ldap.groupRoleAttribute=cn
atlas.authentication.method.ldap.base.dn=dc=example,dc=com
atlas.authentication.method.ldap.bind.dn=cn=Manager,dc=example,dc=com
atlas.authentication.method.ldap.bind.password=<password>
atlas.authentication.method.ldap.referral=ignore
atlas.authentication.method.ldap.user.searchfilter=(uid={0})
atlas.authentication.method.ldap.default.role=ROLE_USER
8.2 权限管理
1、simple模式
atlas-application.properties中配置
atlas.authorizer.impl=simple
atlas.authorizer.simple.authz.policy.file=/atlas/conf/atlas-simple-authz-policy.json
通过加载权限管理配置文件来实现权限管理:atlas-simple-authz-policy.json
{
"roles": {
"ROLE_ADMIN": {
"adminPermissions": [
{
"privileges": [ ".*" ]
}
],
"typePermissions": [
{
"privileges": [ ".*" ],
"typeCategories": [ ".*" ],
"typeNames": [ ".*" ]
}
],
"entityPermissions": [
{
"privileges": [ ".*" ],
"entityTypes": [ ".*" ],
"entityIds": [ ".*" ],
"entityClassifications": [ ".*" ],
"labels": [ ".*" ],
"businessMetadata": [ ".*" ],
"attributes": [ ".*" ],
"classifications": [ ".*" ]
}
],
"relationshipPermissions": [
{
"privileges": [ ".*" ],
"relationshipTypes": [ ".*" ],
"end1EntityType": [ ".*" ],
"end1EntityId": [ ".*" ],
"end1EntityClassification": [ ".*" ],
"end2EntityType": [ ".*" ],
"end2EntityId": [ ".*" ],
"end2EntityClassification": [ ".*" ]
}
]
},
"DATA_SCIENTIST": {
"entityPermissions": [
{
"privileges": [ "entity-read", "entity-read-classification" ],
"entityTypes": [ ".*" ],
"entityIds": [ ".*" ],
"entityClassifications": [ ".*" ],
"labels": [ ".*" ],
"businessMetadata": [ ".*" ],
"attributes": [ ".*" ]
}
]
},
"DATA_STEWARD": {
"entityPermissions": [
{
"privileges": [ "entity-read", "entity-create", "entity-update", "entity-read-classification", "entity-add-classification", "entity-update-classification", "entity-remove-classification" ],
"entityTypes": [ ".*" ],
"entityIds": [ ".*" ],
"entityClassifications": [ ".*" ],
"labels": [ ".*" ],
"businessMetadata": [ ".*" ],
"attributes": [ ".*" ],
"classifications": [ ".*" ]
}
],
"relationshipPermissions": [
{
"privileges": [ "add-relationship", "update-relationship", "remove-relationship" ],
"relationshipTypes": [ ".*" ],
"end1EntityType": [ ".*" ],
"end1EntityId": [ ".*" ],
"end1EntityClassification": [ ".*" ],
"end2EntityType": [ ".*" ],
"end2EntityId": [ ".*" ],
"end2EntityClassification": [ ".*" ]
}
]
}
},
"userRoles": {
"admin": [ "ROLE_ADMIN" ],
"rangertagsync": [ "DATA_SCIENTIST" ]
},
"groupRoles": {
"ROLE_ADMIN": [ "ROLE_ADMIN" ],
"hadoop": [ "DATA_STEWARD" ],
"DATA_STEWARD": [ "DATA_STEWARD" ],
"RANGER_TAG_SYNC": [ "DATA_SCIENTIST" ]
}
}
2、ranger模式
此模式需要安装ranger,然后将ranger与atlas进行集成。
详细说明见官网:https://atlas.apache.org/index.html#/AtlasRangerAuthorizer
操作步骤参考博客:https://blog.youkuaiyun.com/weixin_41907245/article/details/125163861#:~:text=Atlas%E6%9C%AC%E8%BA%AB%E6%9C%89,.json%E6%96%87%E4%BB%B6%E4%B8%AD
需要配置:
ranger-atlas-audit.xml
ranger-atlas-security.xml
ranger-policymgr-ssl.xml
ranger-security.xml
Apache Ranger的Apache Atlas授权策略模型支持3个资源层次结构,以控制对以下各项的访问:类型、实体和管理操作。
具体的见官网