- Blog posts (34)
Original: Writing to Iceberg with Spark
MERGE INTO prod.db.target t   -- a target table
USING (SELECT ...) s          -- the source updates
ON t.id = s.id                -- condition to find updates for target rows
WHEN ...                      -- updates
WHEN MATCHED AND s.op = 'delete' THEN ...
2022-05-31 14:04:01
349
Original: Spark and ES
<dependency>
  <groupId>org.elasticsearch</groupId>
  <artifactId>elasticsearch-spark-20_2.11</artifactId>
  <version>6.7.2</version>
</dependency>
scala> val df = ...
2022-04-24 18:11:58
169
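Original: Redis in brief
Redis
Redis is an in-memory key-value database. It is thread-safe and handles high concurrency. Because it keeps its data in memory, storage capacity is limited and overly long keys should be avoided. It suits highly concurrent access and the storage of shared data.
Data types
Redis supports five data types: string, hash, list, set, and zset (sorted set).
1) String
redis 127.0.0.1:6379> SET runoobkey redis
OK
redis 127.0.0.1:6379> GET runoobk..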
2021-03-05 16:57:15
99
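Original: ES query statements
ES query statements
match
As mentioned earlier, a match search first tokenizes the search text. For the most basic match search, a document matches as long as one or more of the resulting tokens appear in it. For example, searching for 中国杭州 first splits the query into 中国 and 杭州, and any document containing either 中国 or 杭州 is returned.
term
A more exact match:
GET my_index/_search
{
  "query": {
    "term": {
      "exact_value": "Quick Foxes!"
    }
  }
}
Multi-value queries
terms is similar to SQL IN:
GET /_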
2021-01-28 11:36:34
1167
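The difference between match and term described in the preview above can be sketched without a cluster. This is a toy illustration, not Elasticsearch code: `analyze`, `matchLike`, and `termLike` are hypothetical helpers, and the "analyzer" is assumed to simply lowercase the text and split on non-alphanumeric characters (real analyzers, especially for Chinese, behave differently).

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class MatchVsTermDemo {

    // Hypothetical stand-in for an analyzer: lowercase, then split on
    // anything that is not a letter or digit.
    static Set<String> analyze(String text) {
        Set<String> tokens = new HashSet<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // match-like: the query is analyzed, and the document matches if ANY
    // query token appears among the document's tokens.
    public static boolean matchLike(String docText, String query) {
        Set<String> docTokens = analyze(docText);
        for (String q : analyze(query)) {
            if (docTokens.contains(q)) {
                return true;
            }
        }
        return false;
    }

    // term-like: the query is NOT analyzed; it is compared verbatim with
    // the exact value stored for the field.
    public static boolean termLike(String storedValue, String query) {
        return storedValue.equals(query);
    }

    public static void main(String[] args) {
        System.out.println(matchLike("The quick brown fox", "Quick Foxes!"));
        System.out.println(termLike("Quick Foxes!", "Quick Foxes!"));
        System.out.println(termLike("Quick Foxes!", "quick foxes"));
    }
}
```

Under these assumptions the match-like check succeeds because the token `quick` is shared, while the term-like check only succeeds when the query equals the stored value character for character.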
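Original: Hive window functions
range between UNBOUNDED PRECEDING and CURRENT ROW | UNBOUNDED FOLLOWING
rows between 1 preceding and 2 following
range defines the frame by the actual values of the ordering column: two rows with the same value are treated alike and get the same result.
rows defines the frame by position in the sort order: two rows with the same value still come one after the other...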
2021-01-28 10:50:15
120
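The rows versus range distinction from the preview above is easiest to see with tied values. The sketch below is plain Java, not Hive: it imitates `SUM(v) OVER (ORDER BY v ... BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)` for both frame types on an already sorted array. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class WindowFrameDemo {

    // ROWS frame: the frame ends at the current row's position, so tied
    // values still get different running sums. Input must be sorted.
    public static List<Long> rowsRunningSum(long[] sorted) {
        List<Long> out = new ArrayList<>();
        long sum = 0;
        for (long v : sorted) {
            sum += v;
            out.add(sum);
        }
        return out;
    }

    // RANGE frame: the frame ends at the last row whose value equals the
    // current row's value, so tied rows share one result. Input must be sorted.
    public static List<Long> rangeRunningSum(long[] sorted) {
        List<Long> out = new ArrayList<>();
        long sum = 0;
        int i = 0;
        while (i < sorted.length) {
            int j = i;
            // Add every peer (row with the same ordering value) first.
            while (j < sorted.length && sorted[j] == sorted[i]) {
                sum += sorted[j];
                j++;
            }
            // All peers receive the same running sum.
            for (int k = i; k < j; k++) {
                out.add(sum);
            }
            i = j;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] values = {10, 20, 20, 30};
        System.out.println("rows : " + rowsRunningSum(values));  // [10, 30, 50, 80]
        System.out.println("range: " + rangeRunningSum(values)); // [10, 50, 50, 80]
    }
}
```

The two rows with value 20 get 30 and 50 under rows, but both get 50 under range.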
Original: Configuration properties utility class
public class PropertiesUtil {
    private static final Logger logger = LoggerFactory.getLogger(PropertiesUtil.class);
    private static Properties prop = null;
    private static String properPath = ConstantsDefine.CONFIG_PATH + "*.properties";
    //静.
2020-06-23 10:44:34
402
Original: Basic Flink configuration
Basic configuration
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(5000); // set the checkpoint interval (ms)
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(2000); // ensure at least 2000 ms between checkpoints (minimum pause between checkpoints)
en
2020-06-23 10:42:27
697
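Original: Local cache
LoadingCache<Map<String, String>, String> xxxx = CacheBuilder.newBuilder()
    // Set the concurrency level to 10; the concurrency level is the number of threads that can write to the cache at the same time
    .concurrencyLevel(10)
    // Reload an entry once 10 minutes have passed since it was last written (asynchronous, non-blocking)
    ...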
2019-09-12 14:16:44
145
Original: Resolving package conflicts by referencing and renaming Maven packages (using a Maven plugin)
<plugins>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    ...
2019-09-12 14:07:13
2404
Original: Repository management in IDEA
Add a repository other than those in the Maven settings file:
<repositories>
  <!-- add the elasticsearch repo -->
  <repository>
    <id>alimaven</id>
    <url>http://maven.aliyun.com/nexus/con...
2019-06-28 17:22:57
786
Original: Reading ES with Spark
val options = Map("pushdown" -> "true", "es.nodes" -> "10.116.106.*,10.116.106.*,10.116.106.*", "es.port" -> "9200")
val esDf: Dataset[Row] = sql.read.format("org.elasticsearch.spark.sql")...
2019-06-25 10:17:12
1832
Original: spark_submit
#!/usr/bin/env bash
spark-submit \
  --master yarn \
  --deploy-mode client \
  --driver-memory 5G \
  --num-executors $5 \
  --executor-cores $6 \
  --executor-memory $7 \
  --queue $8 \
  --class $3 \
  $4 \
  $1 \
  $2
2019-06-20 14:42:22
99
Original: Regular expressions
Testing a regex in Java:
public static void main(String[] args) {
    GanXian ganXian = new GanXian();
    String rex = "T1\\D";
    Pattern.compile(rex);
}
Regex matching Chinese characters: "[\u4e00-\u9fa5]"
Regex extraction example: SELECT route_code...
2019-06-20 10:29:19
95
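The two patterns quoted in the preview above can be exercised with `java.util.regex` alone. This is a minimal sketch: the class `RegexDemo` and its helper methods are illustrative names, not part of the original post, and the `GanXian` class from the snippet is left out because its definition is not shown.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDemo {

    // "T1\\D": the literal text "T1" followed by exactly one non-digit character.
    private static final Pattern T1_NON_DIGIT = Pattern.compile("T1\\D");

    // "[\u4e00-\u9fa5]": one character in the common CJK ideograph range.
    private static final Pattern CHINESE = Pattern.compile("[\u4e00-\u9fa5]");

    // True if the input contains "T1" followed by a non-digit anywhere.
    public static boolean containsT1NonDigit(String s) {
        return T1_NON_DIGIT.matcher(s).find();
    }

    // Concatenates every Chinese character found in the input, in order.
    public static String extractChinese(String s) {
        StringBuilder sb = new StringBuilder();
        Matcher m = CHINESE.matcher(s);
        while (m.find()) {
            sb.append(m.group());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(containsT1NonDigit("T1A")); // true
        System.out.println(containsT1NonDigit("T12")); // false
        // "\u4e2d\u56fd" is written with escapes to avoid source-encoding issues.
        System.out.println(extractChinese("abc\u4e2d\u56fd123"));
    }
}
```

Compiling each pattern once into a static field avoids recompiling it on every call, which matters when the check runs per record.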
Original: Local batch cache
public abstract class AtomicBatchService<INPUT, RESULT, OUT> implements Serializable {
    private static final long serialVersionUID = 2931723128262800986L;
    private static final Logger l...
2019-06-20 10:24:25
319
Original: log4j logging template
log4j.rootLogger = INFO,root,stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.conversionPattern=...
2019-06-20 09:55:58
1221
Original: Redis utility classes
public class RedisSentinelCluster {
    private static final Logger logger = LoggerFactory.getLogger(RedisSentinelCluster.class);
    private static JedisSentinelPool pool;
    private RedisSentin...
2019-06-20 09:50:59
91
Original: Scala date utility
object DateUtil {
  def strDateFormat(strDate: String, inputFormat: String, outputFormat: String): String = {
    val input = new SimpleDateFormat(inputFormat)
    val output = new SimpleDateFormat...
2019-06-20 09:44:11
617
Repost: Spark read/write examples
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional informati...
2019-04-09 13:58:45
225
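Original: spark-sql and Hive
Installing Hive
Download Hive and extract it
Add environment variables (add the following variables)
Edit hive-env.sh and add HADOOP_HOME=/opt/apps/software/hadoop-2.7.3
Edit hive-site.xml and add
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>...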
2019-04-08 16:20:16
148
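Original: Installing Spark in high-availability mode
Download Scala and Spark, extract them, and add them to the environment variables
Edit spark-env.sh and add the following variables
export JAVA_HOME=/opt/apps/software/jdk1.8.0_201
export SCALA_HOME=/opt/apps/software/scala-2.11.8
export HADOOP_HOME=/opt/apps/software/hadoop-2.7.3
e...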
2019-04-08 16:11:06
264
Original: Testing in IDEA
Import the test dependencies
<!-- test start -->
<dependency>
  <groupId>org.test4j</groupId>
  <artifactId>test4j.testng</artifactId>
  <version>2.0.5</version>
<...
2019-04-08 10:52:29
361
Original: Spark operations
Create a SparkSession with Hive support
val spark = SparkSession.builder().appName(" PlaceCapacity").config("spark.some.config.option", "some-value")
  .config("spark.sql.hive.filesourcePartitionFileCacheSize", 500 *...
2019-04-04 11:49:39
272
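Original: MySQL privileges
Add a MySQL user (% stands for the IPs allowed to log in)
CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
Grant privileges:
grant select,delete,update,create,drop on test_hive.* to 'hive'@'%' identified by '123456';
Grant all privileges (hive refers to the database, * to the tables)
g...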
2019-03-27 19:23:23
184
Original: Installing Hive
Download and extract Hive
In env, add HADOOP_HOME=/opt/apps/software/hadoop-2.7.3
Modify the hive-site template and add the following
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdb...
2019-03-27 15:51:12
94
Original: Installing MySQL on CentOS 7 online
Download the repository package: wget http://repo.mysql.com/mysql80-community-release-el7.rpm
Install it: rpm -ivh mysql80-community-release-el7.rpm
Install mysql: yum install mysql
Install mysql-server: yum -y install mysql-server
Install: yum -y ins...
2019-03-27 14:29:54
55
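Original: Installing MySQL on CentOS 7
Download MySQL: https://dev.mysql.com/downloads/mysql/
Extract MySQL to the target directory: tar xvf <file> -C <directory>
Remove the mariadb-libs package that ships with the system:
rpm -qa|grep -i mariadb
## mariadb-libs-5.5.50-1.el7_2.x86_64
rpm -e mariadb-libs-5.5.50-1.el7_2...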
2019-03-27 14:16:52
78
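Original: Setting up a Hadoop high-availability cluster
1. Configure passwordless login
   1. Run ssh-keygen to generate the keys (the files are under /root/.ssh)
   2. Append the public key to authorized_keys: cat id_rsa.pub >> authorized_keys
   3. Send the public keys of the machines that need to log in to each other to every node, and append them: cat authorized_keys >> authorized_keys
2. Cluster planning ...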
2019-03-26 16:02:54
157