java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.hive.serde2.io.DoubleWritable
1. The target table being written to. Note that the duration column is declared as string:
CREATE EXTERNAL TABLE IF NOT EXISTS dwi_m.dwi_staypoint_msk_d (
mdn string
,grid_longi string
,grid_lati string
,grid_id string
,county_id string
,duration string
,grid_first_time string
,grid_last_time string
)
PARTITIONED BY (
day_id string
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS PARQUET
location '/daas/motl/dwi/dwi_staypoint_msk_d';
2. The SQL executed in Spark:
SELECT mdn,longi AS grid_longi,lati AS grid_lati,grid_id,county_id,
(unix_timestamp(substring(start_time,0,14),'yyyyMMddHHmmss')-unix_timestamp(substring(start_time,16,29),'yyyyMMddHHmmss'))/60 AS duration,
SUBSTRING(start_time,16,29) AS grid_first_time,
SUBSTRING(start_time,0,14) AS grid_last_time
FROM tablename;
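The /60 is what silently changes the column type: in Spark SQL, as in HiveQL, the division operator always returns a double, even when both operands are integers. A quick way to see this (reusing the sqlContext from step 3):

sqlContext.sql("SELECT 120 / 60 AS duration").printSchema()
// the schema shows duration: double, even though both operands are integers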
3. The Spark write code:
import org.apache.spark.sql.SaveMode

sqlContext
  .sql(sql)
  .write.mode(SaveMode.Overwrite)
  .parquet(path) // writes straight to the HDFS path, bypassing the Hive metastore
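The mismatch can be caught before the write by printing the query's schema; a minimal diagnostic sketch, reusing the sql variable above:

sqlContext.sql(sql).printSchema()
// root
//  |-- mdn: string (nullable = true)
//  ...
//  |-- duration: double (nullable = true)   <- double, while the DDL declares string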
After the write, reading the new table's data in Hive fails with:
Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException:
java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.hive.serde2.io.DoubleWritable
The cause is clear: in the SQL result the duration column is of type double (the division returns a double), while the target table declares it as string. Because the write goes directly to the Parquet files, Spark's schema is stored in the file footers and Hive never gets a chance to validate it against the DDL; the mismatch only surfaces at read time, when Hive's string inspector hits the DoubleWritable values in the files.
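One way to confirm this after the fact is to read the written files back with Spark, since the schema in the Parquet footers is the one Spark used for the write, not the one in the Hive DDL (path is the same variable as in step 3):

sqlContext.read.parquet(path).printSchema()
// duration is reported as double here, while DESCRIBE dwi_m.dwi_staypoint_msk_d in Hive shows string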
4. Solutions
1. Change the data type of the duration column in the target table to double, or
2. Cast the value to string in the SQL. Note that the cast must wrap the whole expression (CAST(... AS string) AS duration), since duration is only the output alias. Both fixes are sketched below.
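Both variants, sketched against the names used above (the ALTER TABLE form only rewrites the metastore definition, which is enough here because the files already contain doubles):

-- Fix 1: align the table definition with what Spark actually wrote
ALTER TABLE dwi_m.dwi_staypoint_msk_d CHANGE duration duration double;

-- Fix 2: align the data with the table definition by casting in the query
SELECT mdn, longi AS grid_longi, lati AS grid_lati, grid_id, county_id,
CAST((unix_timestamp(substring(start_time,0,14),'yyyyMMddHHmmss')
     - unix_timestamp(substring(start_time,16,29),'yyyyMMddHHmmss'))/60 AS string) AS duration,
SUBSTRING(start_time,16,29) AS grid_first_time,
SUBSTRING(start_time,0,14) AS grid_last_time
FROM tablename;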
Similar "Cannot inspect ..." type errors are most likely caused by the same kind of mismatch between the query result's schema and the table definition.