Python uuid and hex study

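This article examines the role of Python's uuid library and the five UUID versions used to generate unique IDs: the timestamp-based UUID1, UUID2 for DCE distributed environments, UUID3 based on the MD5 hash of a name, the random-number-based UUID4, and UUID5 based on the SHA-1 hash of a name. Through examples, it shows how UUIDs achieve global uniqueness and where each version applies.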

  This look at uuid was prompted by the following code:

  import uuid

  product["SourceInfo"]["ProductID"] = uuid.uuid4().hex
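To make the snippet above concrete, here is a minimal sketch of what `uuid.uuid4().hex` produces. The `product` dictionary is a hypothetical stand-in, since the original does not show how it is built.

```python
import uuid

# Hypothetical stand-in for the `product` dictionary used in the snippet above.
product = {"SourceInfo": {}}

new_id = uuid.uuid4()
product["SourceInfo"]["ProductID"] = new_id.hex

product_id = product["SourceInfo"]["ProductID"]
print(product_id)   # 32 lowercase hexadecimal digits, no hyphens
print(str(new_id))  # canonical form: the same digits split by 4 hyphens

# The hex string can be parsed back into an equal UUID object.
restored = uuid.UUID(hex=product_id)
print(restored == new_id)
```

Storing `.hex` rather than `str(...)` saves four characters per ID and avoids hyphens, which is convenient where the ID becomes part of a file name or key.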

 

I. Overview

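  A UUID is a 128-bit universally unique identifier, usually written as a string of 32 hexadecimal characters (36 characters in the canonical form, which adds four hyphens).

  It is designed to be unique across both time and space, and is also known as a GUID.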
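The sizes mentioned above can be checked directly with the standard `uuid` module. The following is a small sketch; the variable names are illustrative only.

```python
import uuid

u = uuid.uuid4()

bit_width = len(u.bytes) * 8  # 16 raw bytes -> 128 bits
hex_form = u.hex              # 32 hexadecimal digits
canonical_form = str(u)       # 8-4-4-4-12 groups joined by hyphens
as_int = u.int                # the same value as a single integer

print(bit_width, len(hex_form), len(canonical_form))  # 128 32 36
```

All four representations describe the same 128-bit value, so `uuid.UUID(int=as_int)` or `uuid.UUID(canonical_form)` rebuilds an equal object.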

  1. uuid1: based on the timestamp (combined with the host's node ID, typically its MAC address)

 

  2. uuid2: based on the DCE distributed computing environment (Python has no function for this version)

 

  3. uuid3: based on the MD5 hash of a namespace and a name

 

  4. uuid4: based on random numbers

    A collision is theoretically possible, but the probability is extremely small.

 

  5. uuid5: based on the SHA-1 hash of a namespace and a name
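The versions listed above can be compared side by side. This sketch calls the four generator functions and then estimates how likely a uuid4 collision is using the standard birthday approximation. The name "python.org", the helper `uuid4_collision_probability`, and the sample size of one billion IDs are illustrative assumptions, not part of the original article.

```python
import math
import uuid

# Version 1: timestamp plus node ID, so each call yields a different value.
u1 = uuid.uuid1()

# Versions 3 and 5: hash of (namespace, name), so the result is deterministic.
name = "python.org"  # illustrative name
u3 = uuid.uuid3(uuid.NAMESPACE_DNS, name)  # MD5
u5 = uuid.uuid5(uuid.NAMESPACE_DNS, name)  # SHA-1

# Version 4: random bits.
u4 = uuid.uuid4()

for u in (u1, u3, u4, u5):
    print(u.version, u)

# The same namespace and name always produce the same UUID.
repeatable = uuid.uuid5(uuid.NAMESPACE_DNS, name) == u5


def uuid4_collision_probability(count: int) -> float:
    """Birthday approximation for `count` random version 4 UUIDs.

    A uuid4 has 122 random bits; the other 6 bits encode version and variant.
    """
    space = 2 ** 122
    return -math.expm1(-count * (count - 1) / (2 * space))


# Assumed workload: one billion IDs. The result is roughly 1e-19.
p = uuid4_collision_probability(10 ** 9)
print(p)
```

Name-based versions suit cases where the same input must always map to the same ID, while uuid4 suits cases where no input is available and an astronomically small collision risk is acceptable.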


Reference: Generating unique IDs with Python's UUID library (Python使用UUID库生成唯一ID)

    http://blog.youkuaiyun.com/crazyhacking/article/details/38898721
