XSD String Data Types

String data types are used for values that contain character strings.


String Data Type

The string data type can contain any sequence of characters, including line feeds, carriage returns, and tab characters.

The following is an example of a string declaration in a schema:

<xs:element name="customer" type="xs:string"/>

An element in your document might look like this:

<customer>John Smith</customer>

Or it might look like this:

<customer>	John Smith	</customer>

Note: The XML processor will not modify the value if you use the string data type.
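
Under the hood, this corresponds to the whiteSpace facet, whose value is "preserve" for the string data type. A string element can still be constrained with other facets without changing its whitespace behavior. A minimal sketch (the length limit here is hypothetical, not part of the original example):

<!-- Hypothetical restriction: limit the customer name to 50 characters -->
<xs:element name="customer">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:maxLength value="50"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>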


NormalizedString Data Type

The normalizedString data type is derived from the String data type.

The normalizedString data type also contains characters, but the XML processor will replace line feeds, carriage returns, and tab characters with spaces.

The following is an example of a normalizedString declaration in a schema:

<xs:element name="customer" type="xs:normalizedString"/>

An element in your document might look like this:

<customer>John Smith</customer>

Or it might look like this:

<customer>	John Smith	</customer>

Note: In the example above, the XML processor will replace the tabs with spaces.
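
This replacement behavior is what the XML Schema specification calls the whiteSpace facet value "replace". As a sketch, a user-defined type equivalent to normalizedString (the type name is illustrative) could be declared like this:

<!-- Illustrative type name: behaves like xs:normalizedString -->
<xs:simpleType name="myNormalizedString">
  <xs:restriction base="xs:string">
    <xs:whiteSpace value="replace"/>
  </xs:restriction>
</xs:simpleType>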


Token Data Type

The token data type is also derived from the String data type.

The token data type also contains characters, but the XML processor will replace line feeds, carriage returns, and tabs with spaces, remove leading and trailing spaces, and collapse multiple spaces into a single space.

The following is an example of a token declaration in a schema:

<xs:element name="customer" type="xs:token"/>

An element in your document might look like this:

<customer>John Smith</customer>

Or it might look like this:

<customer>	John Smith	</customer>

Note: In the example above, the XML processor will remove the tabs.
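
Likewise, the token data type corresponds to the whiteSpace facet value "collapse", applied on top of normalizedString. A sketch of an equivalent user-defined type (again with an illustrative name):

<!-- Illustrative type name: behaves like xs:token -->
<xs:simpleType name="myToken">
  <xs:restriction base="xs:normalizedString">
    <xs:whiteSpace value="collapse"/>
  </xs:restriction>
</xs:simpleType>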


String Data Types

Note that all of the data types below derive from the string data type (except string itself)!

Name - Description
ENTITIES - A string that represents the ENTITIES attribute in XML (only used with schema attributes)
ENTITY - A string that represents the ENTITY attribute in XML (only used with schema attributes)
ID - A string that represents the ID attribute in XML (only used with schema attributes)
IDREF - A string that represents the IDREF attribute in XML (only used with schema attributes)
IDREFS - A string that represents the IDREFS attribute in XML (only used with schema attributes)
language - A string that contains a valid language id
Name - A string that contains a valid XML name
NCName - A string that represents a non-colonized XML name (a Name without a colon)
NMTOKEN - A string that represents the NMTOKEN attribute in XML (only used with schema attributes)
NMTOKENS - A string that represents the NMTOKENS attribute in XML (only used with schema attributes)
normalizedString - A string that does not contain line feeds, carriage returns, or tabs
QName - A string that represents a qualified XML name
string - A string
token - A string that does not contain line feeds, carriage returns, tabs, leading or trailing spaces, or multiple spaces
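
Each type in the table can be used directly as an element or attribute type, or as the base of a further restriction. A short sketch (the element name and the allowed values are hypothetical) that restricts xs:token to a fixed set of values:

<!-- Hypothetical example: only three country codes are allowed -->
<xs:element name="country">
  <xs:simpleType>
    <xs:restriction base="xs:token">
      <xs:enumeration value="US"/>
      <xs:enumeration value="UK"/>
      <xs:enumeration value="NO"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>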

