目录
一、概览
在数据治理过程中我们会使用presto做数据查询引擎,通过presto统一查询Hive、达梦数据库中的数据,但是presto本身并不支持达梦数据库,需要根据相关文档进行二次开发插件,并把插件部署到对应的目录中使得Presto支持查询达梦数据库,并可以用presto执行达梦数据库和Hive数据库之间的关联
Presto版本0.289
github地址:https://github.com/prestodb/presto/tree/0.289
connector官方开发指南:https://prestodb.io/docs/current/develop/connectors.html
达梦数据库版本:V8
二、代码实现
2.1 拉取presto代码
从presto代码仓库中拉取所需要的代码,我用的是0.289版本
新建模块,取名presto-dm
maven中添加依赖,主要是添加达梦的驱动包,其余的依赖是编译和运行过程中需要的包
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-root</artifactId>
<version>0.289</version>
</parent>
<artifactId>presto-dm</artifactId>
<description>Presto - DM Connector</description>
<packaging>presto-plugin</packaging>
<properties>
<maven.compiler.source>8</maven.compiler.source>
<maven.compiler.target>8</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<air.main.basedir>${project.parent.basedir}</air.main.basedir>
</properties>
<dependencies>
<dependency>
<groupId>com.dameng</groupId>
<artifactId>DmJdbcDriver18</artifactId>
<version>8.1.3.62</version>
</dependency>
<dependency>
<groupId>javax.validation</groupId>
<artifactId>validation-api</artifactId>
</dependency>
<dependency>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-base-jdbc</artifactId>
</dependency>
<dependency>
<groupId>com.facebook.airlift</groupId>
<artifactId>configuration</artifactId>
</dependency>
<dependency>
<groupId>com.google.inject</groupId>
<artifactId>guice</artifactId>
</dependency>
<dependency>
<groupId>com.facebook.airlift</groupId>
<artifactId>log</artifactId>
<version>0.215</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>javax.inject</groupId>
<artifactId>javax.inject</artifactId>
</dependency>
<!-- Presto SPI -->
<dependency>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-spi</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-common</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.facebook.drift</groupId>
<artifactId>drift-api</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>io.airlift</groupId>
<artifactId>slice</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>io.airlift</groupId>
<artifactId>units</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<scope>provided</scope>
</dependency>
<!-- for testing -->
<dependency>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-testng-services</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.assertj</groupId>
<artifactId>assertj-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.facebook.airlift</groupId>
<artifactId>testing</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.facebook.airlift</groupId>
<artifactId>json</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-main</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-tpch</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>io.airlift.tpch</groupId>
<artifactId>tpch</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.facebook.presto</groupId>
<artifactId>presto-tests</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.jetbrains</groupId>
<artifactId>annotations</artifactId>
<version>19.0.0</version>
<scope>test</scope>
</dependency>
</dependencies>
<profiles>
<profile>
<id>ci</id>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<excludes combine.self="override"/>
</configuration>
</plugin>
</plugins>
</build>
</profile>
<profile>
<id>default</id>
<activation>
<activeByDefault>true</activeByDefault>
</activation>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<!-- these tests take a very long time so only run them in the CI server -->
<excludes>
</excludes>
</configuration>
</plugin>
</plugins>
</build>
</profile>
</profiles>
</project>
增加配置文件:com.facebook.presto.spi.Plugin
## 内容如下
com.facebook.presto.plugin.dm.DMPlugin
2.2 代码编写
新建包路径:com.facebook.presto.plugin.dm
主要类有DMClient、DMPlugin、DMClientModule、DMConfig
这里我参考了网上的一些写法和oracle的一些实现写法
2.2.1 DMConfig
这个类主要是配置达梦的一些配置项,这里的很多配置我都是参考的oracle的一些配置,并且删减了一些配置,主要是因为达梦的底层和语法很多都是和oracle是一致的,这个文件里的配置可以自行增删,比如达梦特有的一些配置项
public class DMConfig {
private int varcharMaxSize = 4000;
private int timestampDefaultPrecision = 6;
private int numberDefaultScale = 10;
private RoundingMode numberRoundingMode = RoundingMode.HALF_UP;
@Min(0)
@Max(38)
public int getNumberDefaultScale()
{
return numberDefaultScale;
}
@Config("dm.number.default-scale")
public DMConfig setNumberDefaultScale(int numberDefaultScale)
{
this.numberDefaultScale = numberDefaultScale;
return this;
}
@NotNull
public RoundingMode getNumberRoundingMode()
{
return numberRoundingMode;
}
@Config("dm.number.rounding-mode")
public DMConfig setNumberRoundingMode(RoundingMode numberRoundingMode)
{
this.numberRoundingMode = numberRoundingMode;
return this;
}
@Min(4000)
public int getVarcharMaxSize()
{
return varcharMaxSize;
}
@Config("dm.varchar.max-size")
public DMConfig setVarcharMaxSize(int varcharMaxSize)
{
this.varcharMaxSize = varcharMaxSize;
return this;
}
@Min(0)
@Max(9)
public int getTimestampDefaultPrecision()
{
return timestampDefaultPrecision;
}
@Config("dm.timestamp.precision")
public DMConfig setTimestampDefaultPrecision(int timestampDefaultPrecision)
{
this.timestampDefaultPrecision = timestampDefaultPrecision;
return this;
}
}
2.2.2 DMClientModule
这个类主要是绑定配置项和认证信息
configure中绑定DMClient类,并且绑定DMConfig类提供参数
connectionFactory这个方法主要是为了创建达梦连接器的一些参数,比如用户名密码,驱动,和连接的一些属性,这些属性都可以放到config中,这个config是不在代码中的,需要在服务器上配置
public class DMClientModule implements Module {
private static final Logger log = Logger.get(DMClientModule.class);
@Override
public void configure(Binder binder) {
binder.bind(JdbcClient.class).to(DMClient.class)
.in(Scopes.SINGLETON);
configBinder(binder).bindConfig(BaseJdbcConfig.class);
configBinder(binder).bindConfig(DMConfig.class);
}
@Provides
@Singleton
public static ConnectionFactory connectionFactory(BaseJdbcConfig config,DMConfig dmConfig)
throws SQLException
{
Properties connectionProperties = new Properties();
connectionProperties.setProperty("user",config.getConnectionUser());
connectionProperties.setProperty("password",config.getConnectionPassword());
return new DriverConnectionFactory(
new DmDriver(),
config.getConnectionUrl(),
Optional.empty(),
Optional.empty(),
connectionProperties);
}
}
2.2.3 DMPlugin
这个类主要是用来注册达梦插件用的,写法比较固定,并且将DMClientModule注册进去
public class DMPlugin extends JdbcPlugin
{
/**
* Oracle Plugin Constructor
*/
public DMPlugin() {
//name of the connector and the module implementation
super("dm", new DMClientModule());
}
}
2.2.4 DMClient
这个类属于创建达梦客户端的实现类,这里面主要是对一些BaseJdbcClient实现的一种重写,把达梦不一致的语法可以重写对应的类进行兼容,toPrestoType这个方法中主要是做达梦的字段类型和presto类型的转换映射,我这里并没有做太多的修改,可根据实际情况做调整
public class DMClient extends BaseJdbcClient {
private static final Logger log = Logger.get(DMClient.class);
private final int numberDefaultScale;
@Inject
public DMClient(
JdbcConnectorId connectorId,
BaseJdbcConfig config,
DMConfig dmConfig,
ConnectionFactory connectionFactory) throws SQLException {
// super(connectorId, config, "", DMClientModule.connectionFactory(config,dmConfig));
super(connectorId, config, "", connectionFactory);
log.info("dmConfig",dmConfig.toString());
log.info("config",config.toString());
this.numberDefaultScale = dmConfig.getNumberDefaultScale();
}
@Override
protected String getTableSchemaName(ResultSet resultSet) throws SQLException {
return super.getTableSchemaName(resultSet);
}
private String[] getTableTypes() {
return new String[]{"TABLE", "VIEW"};
}
@Override
protected ResultSet getTables(Connection connection, Optional<String> schemaName, Optional<String> tableName)
throws SQLException {
DatabaseMetaData metadata = connection.getMetaData();
String escape = metadata.getSearchStringEscape();
return metadata.getTables(
connection.getCatalog(),
escapeNamePattern(schemaName, Optional.of(escape)).orElse(null),
escapeNamePattern(tableName, Optional.of(escape)).orElse(null),
getTableTypes());
}
@Override
public PreparedStatement getPreparedStatement(ConnectorSession session, Connection connection, String sql)
throws SQLException {
PreparedStatement statement = connection.prepareStatement(sql);
statement.setFetchSize(1000);
return statement;
}
@Override
protected String generateTemporaryTableName() {
return "presto_tmp_" + System.nanoTime();
}
@Override
protected void renameTable(JdbcIdentity identity, String catalogName, SchemaTableName oldTable, SchemaTableName newTable) {
if (!oldTable.getSchemaName().equalsIgnoreCase(newTable.getSchemaName())) {
throw new PrestoException(NOT_SUPPORTED, "Table rename across schemas is not supported in dm");
}
String newTableName = newTable.getTableName().toUpperCase(ENGLISH);
String oldTableName = oldTable.getTableName().toUpperCase(ENGLISH);
String sql = format(
"ALTER TABLE %s RENAME TO %s",
quoted(catalogName, oldTable.getSchemaName(), oldTableName),
quoted(newTableName));
try (Connection connection = connectionFactory.openConnection(identity)) {
execute(connection, sql);
} catch (SQLException e) {
throw new PrestoException(JDBC_ERROR, e);
}
}
@Override
public Optional<ReadMapping> toPrestoType(ConnectorSession session, JdbcTypeHandle typeHandle) {
int columnSize = typeHandle.getColumnSize();
log.info("typeHandle==============================");
log.info(String.valueOf(typeHandle));
log.info(String.valueOf(typeHandle.getJdbcType()));
switch (typeHandle.getJdbcType()) {
case Types.CLOB:
return Optional.of(varcharReadMapping(createUnboundedVarcharType()));
case Types.SMALLINT:
return Optional.of(smallintReadMapping());
case Types.FLOAT:
case Types.DOUBLE:
return Optional.of(doubleReadMapping());
case Types.REAL:
return Optional.of(realReadMapping());
case Types.DECIMAL:
case Types.NUMERIC:
int precision = columnSize == 0 ? Decimals.MAX_PRECISION : columnSize;
int scale = typeHandle.getDecimalDigits();
if (scale == 0) {
return Optional.of(bigintReadMapping());
}
if (scale < 0 || scale > precision) {
return Optional.of(decimalReadMapping(createDecimalType(precision, numberDefaultScale)));
}
return Optional.of(decimalReadMapping(createDecimalType(precision, scale)));
case Types.LONGVARCHAR:
if (columnSize > VarcharType.MAX_LENGTH || columnSize == 0) {
return Optional.of(varcharReadMapping(createUnboundedVarcharType()));
}
return Optional.of(varcharReadMapping(createVarcharType(columnSize)));
case Types.VARCHAR:
return Optional.of(varcharReadMapping(createVarcharType(columnSize)));
}
return super.toPrestoType(session, typeHandle);
}
}
三、部署
3.1 打包
在maven中找到presto-dm并执行下面的package,从控制台中获取打包的路径。就会有presto-dm-0.289.zip这样一个文件;
3.2 上传到服务器
进入到presto-server的安装目录下,在plugin路径下新建dm文件夹,将上面打包成的压缩包上传到dm文件夹中并且解压到这里即可;
3.3 配置DM环境
在presto-server-0.289/etc/catalog下新建达梦配置文件dm.properties
注意:keywords=(desc,comment)一定要写,否则会报错
connector.name=dm
connection-url=jdbc:dm://IP:PORT/database?useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&useSSL=true&serverTimezone=GMT%2B8&&allowMultiQueries=true&keywords=(desc,comment)
connection-user=用户名
connection-password=密码
四、启动服务并验证
在bin目录下执行:./launcher start即可启动,访问:http://IP:8081/ui/ 可以打开presto的监控页面
使用dbeaver连接到presto服务,可以看到有个库叫dm,下面就是达梦库里面的库和表,可以执行查询等操作