Flink Custom UDTF

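This document shows how to implement Java and Scala versions of a user-defined table function (UDTF) in Flink that converts JSON data into the Row type. Through example code, it explains how to process a JSON array and extract its fields, and how to register and use these functions in Flink SQL. It also provides complete test cases and the dependency management configuration.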


A Flink custom function that expands one column into multiple rows. The input column holds a JSON array of objects: [{"key1":"value1","key2":"value2"...}]
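The column-to-row step can be sketched without Flink or Gson on data that is already parsed. This is only an illustration under stated assumptions: the class `ColumnToRowsSketch` and its method `explode` are hypothetical names that do not appear in the original post. Each array element becomes one row with a fixed field order, and a missing key is filled with an empty string.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original post: it mirrors the
// column-to-row step on already-parsed data, without Flink or Gson.
class ColumnToRowsSketch {

    // Emits one row per array element. Every row uses the same field order,
    // and a missing or null value is replaced by "".
    static List<List<String>> explode(List<Map<String, String>> elements, List<String> fields) {
        List<List<String>> rows = new ArrayList<>();
        if (elements == null) {
            return rows; // no input, no rows
        }
        for (Map<String, String> element : elements) {
            List<String> row = new ArrayList<>(fields.size());
            for (String field : fields) {
                String value = element.get(field);
                row.add(value == null ? "" : value);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("key1", "value1");
        first.put("key2", "value2");
        Map<String, String> second = new LinkedHashMap<>();
        second.put("key1", "value3"); // key2 is absent on purpose

        // One input value with two elements yields two output rows.
        for (List<String> row : explode(List.of(first, second), List.of("key1", "key2"))) {
            System.out.println(row);
        }
    }
}
```

In this sketch the second row carries an empty string for `key2`, so every emitted row has the same number of fields regardless of which keys the source object contains.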

Code

Java

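import com.google.gson.Gson;
import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.annotation.FunctionHint;
import org.apache.flink.table.functions.TableFunction;
import org.apache.flink.types.Row;
import org.apache.flink.util.StringUtils;
// Assumption: the Log4j 1.x API; the original does not show which Logger is imported.
import org.apache.log4j.Logger;

// Declares the output row type: one string column per extracted JSON field.
@FunctionHint(output = @DataTypeHint("ROW<drugUniversalName string, specifications string, goodsUnit string, " +
        "location string, instruction string, consumption string, consumptionUnit string, frequency string, " +
        "prescriptionsAmount string, prescriptionsUnit string, goodsNo string>"))
public class JsonArrayParseUDF extends TableFunction<Row> {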
    private static final Logger logger = Logger.getLogger(JsonArrayParseUDF.class);

    public void eval(String json) {
        if (StringUtils.isNullOrWhitespaceOnly(json)) {
            return;
        }
        String drugUniversalName = null;
        String specifications = null;
        String goodsUnit = null;
        String location = null;
        String instruction = null;
        String consumption = null;
        String consumptionUnit = null;
        String frequency = null;
        String prescriptionsAmount = null;
        String prescriptionsUnit = null;
        String goodsNo = null;

        try {
            Gson gson = new Gson();
            JsonArray jsonArray = gson.fromJson(json, JsonArray.class);

            for (JsonElement jsonElement : jsonArray) {
                JsonObject jsonObject = jsonElement.getAsJsonObject();

                JsonElement drugUniversalNameTmp = jsonObject.get("drugUniversalName");
                drugUniversalName = invalidate(drugUniversalNameTmp);

                JsonElement specificationsTmp = jsonObject.get("specifications");
                specifications = invalidate(specificationsTmp);

                JsonElement goodsUnitTmp = jsonObject.get("goodsUnit");
                goodsUnit = invalidate(goodsUnitTmp);

                JsonElement locationTmp = jsonObject.get("location");
                location = invalidate(locationTmp);

                JsonElement instructionTmp = jsonObject.get("instruction");
                instruction = invalidate(instructionTmp);

                JsonElement consumptionTmp = jsonObject.get("consumption");
                consumption = invalidate(consumptionTmp);

                JsonElement consumptionUnitTmp = jsonObject.get("consumptionUnit");
                consumptionUnit = invalidate(consumptionUnitTmp);

                JsonElement frequencyTmp = jsonObject.get("frequency");
                frequency = invalidate(frequencyTmp);

                JsonElement prescriptionsAmountTmp = jsonObject.get("prescriptionsAmount");
                prescriptionsAmount = invalidate(prescriptionsAmountTmp);

                JsonElement prescriptionsUnitTmp = jsonObject.get("prescriptionsUnit");
                prescriptionsUnit = invalidate(prescriptionsUnitTmp);

                JsonElement goodsNoTmp = jsonObject.get("goodsNo");
                goodsNo = invalidate(goodsNoTmp);

                Row row = Row.of(drugUniversalName, specifications, goodsUnit, location, instruction, consumption,
                        consumptionUnit, frequency, prescriptionsAmount, prescriptionsUnit, goodsNo);
                // System.out.println(row);
                collect(row);
            }

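        } catch (Exception e) {
            // Malformed input: log the message with the stack trace; no further rows are emitted for this record.
            logger.error("JSON parsing failed: " + e.getMessage(), e);
        }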
    }

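    // Returns the field as a string; a missing key or a JSON null becomes "".
    // Without the isJsonNull() check, getAsString() throws on a JSON null and
    // the remaining elements of the array would be skipped by the catch block.
    public String invalidate(JsonElement jsonElement) {
        if (jsonElement != null && !jsonElement.isJsonNull()) {
            return jsonElement.getAsString();
        } else {
            return "";
        }
    }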

//    public static void main(String[] args) {
//        JsonArrayParseUDF parseUDF = new JsonArrayParseUDF();
//        String str = "[{\"drugUniversalName\":\"达格列净片\",\"specifications\":\"10mg*10片*3板\",\"goodsUnit\":\"盒\",\"location\":\"\",\"instruction\":\"口服\",\"consumption\":\"2.0\",\"consumptionUnit\":\"片\",\"frequency\":\"1日1次\",\"prescriptionsAmount\":\"5\",\"prescriptionsUnit\":\"盒\",\"countDosage\":\"\",\"description\":\"(安达唐)达格列净片10mg*10片*3板阿斯利康\",\"goodsNo\":\"1029990\"}]\n";