A Complete Fix for Garbled Chinese Text When Moving Database Data with Kettle

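This article describes how to use Kettle to deal with garbled Chinese text caused by differing character sets between databases. Enabling lazy conversion keeps the data in its original character set and so avoids the garbling. The article also mentions another fix: setting the character set on the database connection.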

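If a company does not plan its databases carefully at the outset, it may later end up with database instances that use several different character sets. When building a data warehouse or moving data back and forth, garbled Chinese text caused by character-set mismatches troubles quite a few people. Many people have already written up fixes for garbled Chinese text online, and one or two posts about Kettle suggest adding characterEncoding when creating the database connection. Last night I found another approach, which I share here.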
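For reference, the characterEncoding approach mentioned above amounts to passing encoding options to the JDBC driver. The following is a minimal sketch, assuming a MySQL source read through MySQL Connector/J, whose `useUnicode` and `characterEncoding` connection properties are used here. The helper `withEncoding`, the host, the port, and the database name are all hypothetical placeholders and do not come from the original posts.

```java
final class JdbcUrlEncodingSketch {

    // Hypothetical helper: appends MySQL Connector/J encoding options to a JDBC URL.
    // It uses "?" when the URL has no query string yet and "&" otherwise.
    static String withEncoding(String baseUrl, String encoding) {
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "useUnicode=true&characterEncoding=" + encoding;
    }

    public static void main(String[] args) {
        // Placeholder host, port, and database name.
        String url = withEncoding("jdbc:mysql://localhost:3306/source_db", "UTF-8");
        System.out.println(url);
    }
}
```

In Kettle the same two options would normally be entered as parameter name/value pairs in the database connection dialog rather than typed into a URL by hand.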

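The alternative came from searching the Kettle source code for "encoding": one comment there shows that the fix is actually very simple.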

Java code

/**
 * Build the row using ResultSetMetaData rsmd
 * @param rm The resultset metadata to inquire
 * @param ignoreLength true if you want to ignore the length (workaround for MySQL bug/problem)
 * @param lazyConversion true if lazy conversion needs to be enabled where possible
 */
private RowMetaInterface getRowInfo(ResultSetMetaData rm, boolean ignoreLength, boolean lazyConversion) throws KettleDatabaseException
{
    if (rm == null) return null;

    rowMeta = new RowMeta();

    try
    {
        // TODO If we do lazy conversion, we need to find out about the encoding
        //
        int fieldNr = 1;
        int nrcols = rm.getColumnCount();
        for (int i = 1; i <= nrcols; i++)
        {
            String name = new String(rm.getColumnName(i));

            // Check the name, sometimes it's empty.
            //
            if (Const.isEmpty(name) || Const.onlySpaces(name))
            {
                name = "Field" + fieldNr;
                fieldNr++;
            }

            ValueMetaInterface v = getValueFromSQLType(name, rm, i, ignoreLength, lazyConversion);
            rowMeta.addValueMeta(v);
        }
        return rowMeta;
    }
    catch (SQLException ex)
    {
        throw new KettleDatabaseException("Error getting row information from database: ", ex);
    }
}

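That is the line: "If we do lazy conversion, we need to find out about the encoding". In other words, simply tick the "Enable lazy conversion" checkbox (in the Table input step) and the problem goes away.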
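Why passing data through unconverted avoids garbling can be illustrated outside Kettle. The sketch below is not Kettle's implementation; it only contrasts an eager path, which decodes the source bytes with a wrongly assumed character set and re-encodes them, with a lazy path, which hands the raw bytes on untouched. The class and method names are hypothetical, GBK and UTF-8 are chosen as example character sets, and the code assumes a JDK that ships the GBK charset (standard JDK builds do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class LazyConversionSketch {

    static final Charset GBK = Charset.forName("GBK");

    // Eager path: decode the bytes with an assumed charset, then encode again.
    // If the assumption is wrong, undecodable bytes become U+FFFD and are lost.
    static byte[] eagerRoundTrip(byte[] raw, Charset assumed) {
        String decoded = new String(raw, assumed);
        return decoded.getBytes(assumed);
    }

    // Lazy path: no decoding at all; the bytes travel unchanged.
    static byte[] lazyPassThrough(byte[] raw) {
        return raw;
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // the two characters of the Chinese word for "Chinese"
        byte[] raw = text.getBytes(GBK); // bytes as a GBK database would store them

        byte[] eager = eagerRoundTrip(raw, StandardCharsets.UTF_8);
        byte[] lazy = lazyPassThrough(raw);

        System.out.println(Arrays.equals(raw, eager));          // false: data was damaged
        System.out.println(new String(lazy, GBK).equals(text)); // true: still readable as GBK
    }
}
```

The pass-through only helps when whatever finally consumes the bytes interprets them in the original character set, which is the situation the article describes.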