sqoop实现自定义函数
需求
- 因为手机号是比较的隐私的字段,实现手机号的非对称加密,以便日后发短信广告推广
技术实现
原理
- 修改sqoop生成的java代码修改,对其需要的字段进行加密,重新打包生成jar,在sqoop使用时导入该jar
具体流程
使用sqoop生成对应的java代码
sqoop codegen --connect {jdbc:url} --username {username} --password {password} --query {sql} -bindir {outputFile Path}
我以Mysql 的box_ad_click表为例,表结构如下,假定对要加密的字段是channel
mysql> desc box_ad_click; +---------------+------------------+------+-----+---------+----------------+ | Field | Type | Null | Key | Default | Extra | +---------------+------------------+------+-----+---------+----------------+ | id | int(11) unsigned | NO | PRI | NULL | auto_increment | | channel | varchar(255) | YES | | NULL | | | media | varchar(255) | YES | | NULL | | | ad_place | varchar(255) | YES | | NULL | | | day | date | YES | | NULL | | | click | int(11) | YES | | NULL | | | click_uv | int(11) | YES | | NULL | | | date_time | timestamp | YES | | NULL | | | source | varchar(255) | YES | | NULL | | | activity_name | varchar(255) | YES | | NULL | | +---------------+------------------+------+-----+---------+----------------+
sqoop生成的命令是
sqoop codegen --connect jdbc:mysql://xxxx --username xxx --password xxx --query 'select channel from box_ad_click WHERE id>0 and $CONDITIONS' -bindir /home/appuser/wang/
生成的java文件在/home/appuser/wang/ 下
修改java文件
增加加密的方法
private static final String ALGORITHM = "RSA"; private static final String SIGNATURE_ALGORITHM = "xxx"; private static final String PUBLIC_KEY = "xxx"; private static final String PUBLIC_VALUE="xxx"; private static final String PUBLIC_KEY_PATH = "xxx"; private static String encryptByPublicKey(String data)throws Exception { byte[] keyBytes = Base64.decodeBase64(PUBLIC_VALUE); X509EncodedKeySpec x509KeySpec = new X509EncodedKeySpec(keyBytes); KeyFactory keyFactory = KeyFactory.getInstance(ALGORITHM); Key publicKey = keyFactory.generatePublic(x509KeySpec); Cipher cipher = Cipher.getInstance(keyFactory.getAlgorithm()); cipher.init(Cipher.ENCRYPT_MODE, publicKey); byte[] buff = cipher.doFinal(data.getBytes()); return Hex.encodeHexString(buff); }
对需要的字段进行加密
public void readFields(ResultSet __dbResults) throws SQLException { this.__cur_result_set = __dbResults; try { this.channel = encryptByPublicKey(JdbcWritableBridge.readString(1, __dbResults)); } catch (Exception e) { e.printStackTrace(); } } public void readFields0(ResultSet __dbResults) throws SQLException { try { this.channel =encryptByPublicKey(JdbcWritableBridge.readString(1, __dbResults)); } catch (Exception e) { e.printStackTrace(); } }
使用maven重新打包
所有的jar均使用provide方式,使用package命令打包,生成jar包
将jar上传到服务器,在使用sqoop命令时加入
sqoop import --connect jdbc:mysql://xxx --username xxx --password xxx --query 'select channel from box_ad_click where id>0 and $CONDITIONS' --target-dir /rsa_test --class-name QueryResult --jar-file wangdemo-1.0-SNAPSHOT.jar
结果如下
3fc743e6ee1850c5fbd0d0d627721f6290f82ebbdd0c46c25d9efaea4796b34cb99585664f1579f96bf8fbbaac73049d72038b0acc393430a8e3c866944e5a56e2f5f4dd026ed1642ba225f71a8a315037119da007ea809d188425a64285437a65d84b7d6d5fef6abba87ff0cbc922a1b7f6f8dc68cd200d305371eaf0ad0663 51665bc284cc0bf3e871a1c016d2e32a6e39548938b8f87cb36e62cf42a0d1ba3bba4a9a8bbda6585d228e84781f938f813a421fc88ec06ecc88c6e1e6fdcc40430a8870417b9961f953689bb6a847c07b092d17ecc216128edc2bcf20a692d83890f0416ec8f039999a3c13fdf6f8101d380144f4fa857bcb8dc08236dc82ce 1d9b4d7c973fa01f895e58321a69462038e2746ac1a168d9428c3cb7b7ebe8baedcaea47cd6c35386fe79d3292b157c9ea3eb65d2a92f4803d2eaade854eb293bab2ed2246ab27fccd4ef1bcf671cfb09bba7477072b5095d40f529987390b76551bd49754b4f694f158100fa21fae40539491e4470a68309159a8800716b0d7 2fe1b42c65cc73a6ab750df7ef71c65bf14509cf29223237f8567f900b6c62c30a14d37fd90b19d8f27d17eab34aa926ce2ec7fb69b5d480761b9d37e74b278e5611d9d849d83d2085de525a2ba7ec944b59dc29c9cf3889b3b054b76de707e5e227b2017d26226fb9a99725f123572db3277ab969b0d3afc1620480bede5c73 78d74aff61d5fa96556ca44691865c416b75910778eb30bb303463697b839365be05948860d072b6e9482fb35bd4228b80ebaad49dc5add71c2c97029916e35183f6f209167adc650a62001d1028e2efa418f4efc514bb1c15710097c0ba52f234a3f501d7ce93303e11590b6efaacd7c0d2a196fd0f1672b8e2c056a111c05c