[Java] Performance Optimization: Merging Data from Multiple Lists with a Map

Article link: https://blog.youkuaiyun.com/weixin_50591390/article/details/145133181

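Preface

In real-world development, a common requirement is to look up rows that share a field across several tables and then merge the data from those tables as needed. Many developers do this with nested for loops. That approach is inefficient and not very elegant; this article presents an optimized approach that merges the data with a Map.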

1. Example

Suppose we have three response classes:

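// The other two classes below need the same imports.
import java.math.BigDecimal;
import lombok.Data;

@Data
public class LaborCostDetailRes {
    /**
     * Year and month
     */
    private String date;
    /**
     * Labor cost
     */
    private BigDecimal laborCost;
}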

@Data
public class OtherCostDetailRes {
    /**
     * Year and month
     */
    private String date;
    /**
     * Other costs
     */
    private BigDecimal otherCost;
}

@Data
public class CashFlowRes {
    /**
     * Year and month
     */
    private String date;
    /**
     * Sales revenue
     */
    private BigDecimal saleAmount;
    /**
     * Labor cost
     */
    private BigDecimal laborCost;
    /**
     * Other costs
     */
    private BigDecimal otherCost;
}

We need to set laborCost from LaborCostDetailRes and otherCost from OtherCostDetailRes on the corresponding fields of CashFlowRes. A typical first attempt looks like this:

// Query all CashFlowRes records
List<CashFlowRes> cashFlowRes = cashFlowMapper.list();
// Query all LaborCostDetailRes records
List<LaborCostDetailRes> laborCostDetailRes = staffSalaryMapper.listLaborCost();
// Query all OtherCostDetailRes records
List<OtherCostDetailRes> otherCostDetailRes = costManagementMapper.listOtherCost();

for (CashFlowRes cashFlow : cashFlowRes) {
	String cashFlowDate = cashFlow.getDate();

	// Scan laborCostDetailRes for the matching date
	for (LaborCostDetailRes laborCost : laborCostDetailRes) {
		String laborCostDate = laborCost.getDate();
		if (cashFlowDate.equals(laborCostDate)) {
			cashFlow.setLaborCost(laborCost.getLaborCost());
			break;
		}
	}

	// Scan otherCostDetailRes for the matching date
	for (OtherCostDetailRes otherCost : otherCostDetailRes) {
		String otherCostDate = otherCost.getDate();
		if (cashFlowDate.equals(otherCostDate)) {
			cashFlow.setOtherCost(otherCost.getOtherCost());
			break;
		}
	}
}

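Time complexity analysis

1. Outer loop:

The outer loop iterates over the cashFlowRes list; let n be its length.

2. Inner loop over laborCostDetailRes:

For each element of cashFlowRes, the inner loop scans the laborCostDetailRes list; let m be its length.
In the worst case, the inner loop has to scan the whole laborCostDetailRes list before it finds the matching date (the break only helps when the match comes early).
So this inner loop costs O(m).

3. Inner loop over otherCostDetailRes:

Likewise, for each element of cashFlowRes, the inner loop scans the otherCostDetailRes list; let p be its length.
In the worst case, the inner loop has to scan the whole otherCostDetailRes list before it finds the matching date.
So this inner loop costs O(p).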

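Total time complexity

Putting these together, the cost of the whole method is:

The outer loop runs O(n) times.
Each outer iteration runs two inner loops, costing O(m) and O(p) respectively.

So the total time complexity is: O(n * (m + p))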
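
To make the bound concrete, here is a small sketch that counts the date comparisons the nested-loop version performs. The class NestedLoopCount and its methods are hypothetical helpers, not part of the original code, and plain strings stand in for the response objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: counts how many equals() calls the nested-loop
// merge performs for the given lists of dates.
class NestedLoopCount {

    // For every date in `outer`, scans `labor` and then `other`,
    // stopping each inner scan at the first match (like the break above).
    static long countComparisons(List<String> outer, List<String> labor, List<String> other) {
        long comparisons = 0;
        for (String date : outer) {
            for (String l : labor) {
                comparisons++;
                if (date.equals(l)) break;
            }
            for (String o : other) {
                comparisons++;
                if (date.equals(o)) break;
            }
        }
        return comparisons;
    }

    // Builds a list of `size` distinct keys: "k0", "k1", ...
    static List<String> keys(int size) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < size; i++) result.add("k" + i);
        return result;
    }

    public static void main(String[] args) {
        // Worst case: no outer date exists in the inner lists,
        // so every inner scan runs to the end.
        List<String> outer = new ArrayList<>();
        for (int i = 0; i < 1000; i++) outer.add("missing" + i);
        long count = countComparisons(outer, keys(200), keys(300));
        System.out.println(count); // prints 500000, i.e. n * (m + p) = 1000 * (200 + 300)
    }
}
```

With n = 1000, m = 200 and p = 300, the worst case already needs half a million comparisons, and doubling all three sizes quadruples that number.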

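With large data sets this approach can become a performance problem, because the cost grows with the product of the list sizes. In that case, it is better to use a Map: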

// Requires: java.math.BigDecimal, java.util.HashMap, java.util.List,
// java.util.Map, java.util.function.Function

// Query all CashFlowRes records
List<CashFlowRes> cashFlowRes = cashFlowMapper.list();
// Query all LaborCostDetailRes records
List<LaborCostDetailRes> laborCostDetailRes = staffSalaryMapper.listLaborCost();
// Query all OtherCostDetailRes records
List<OtherCostDetailRes> otherCostDetailRes = costManagementMapper.listOtherCost();

mergeCashFlow(laborCostDetailRes, otherCostDetailRes, cashFlowRes);

public void mergeCashFlow(List<LaborCostDetailRes> laborCostDetailRes,
						  List<OtherCostDetailRes> otherCostDetailRes,
						  List<CashFlowRes> cashFlowResList) {

	// Index both cost lists by date so each lookup below is a single Map access
	Map<String, BigDecimal> laborMap = createMap(laborCostDetailRes, LaborCostDetailRes::getDate, LaborCostDetailRes::getLaborCost);
	Map<String, BigDecimal> otherMap = createMap(otherCostDetailRes, OtherCostDetailRes::getDate, OtherCostDetailRes::getOtherCost);

	if (cashFlowResList != null) {
		for (CashFlowRes cashFlowRes : cashFlowResList) {
			if (cashFlowRes != null && cashFlowRes.getDate() != null) {
				String yearMonth = cashFlowRes.getDate();
				// Dates with no entry in a Map default to zero
				cashFlowRes.setLaborCost(laborMap.getOrDefault(yearMonth, BigDecimal.ZERO));
				cashFlowRes.setOtherCost(otherMap.getOrDefault(yearMonth, BigDecimal.ZERO));
			}
		}
	}
}

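/**
 * Builds a date-to-amount Map from a list.
 * If a date occurs more than once, the last amount wins.
 *
 * @param list            source list, may be null
 * @param dateExtractor   extracts the date key from an element
 * @param amountExtractor extracts the amount from an element
 * @param <T>             element type of the list
 * @return a Map from date to amount; empty if the list is null
 */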
private <T> Map<String, BigDecimal> createMap(List<T> list, Function<T, String> dateExtractor, Function<T, BigDecimal> amountExtractor) {
	Map<String, BigDecimal> map = new HashMap<>();
	if (list != null) {
		for (T item : list) {
			if (item != null) {
				String date = dateExtractor.apply(item);
				BigDecimal amount = amountExtractor.apply(item);
				if (date != null) {
					map.put(date, amount);
				}
			}
		}
	}
	return map;
}
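
Because createMap uses put, only the last amount survives when the same date appears more than once, and a null amount stored under a date comes back as null from getOrDefault. If the queries can return several rows per month, one possible variation is to sum them instead. The sketch below assumes that is the desired behavior; the name createSummingMap is hypothetical, and a String[] of {date, amount} stands in for the real response classes.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class SummingMapSketch {

    // Variation of createMap: amounts that share a date are added together,
    // and elements with a null date or null amount are skipped.
    static <T> Map<String, BigDecimal> createSummingMap(List<T> list,
                                                        Function<T, String> dateExtractor,
                                                        Function<T, BigDecimal> amountExtractor) {
        Map<String, BigDecimal> map = new HashMap<>();
        if (list == null) {
            return map;
        }
        for (T item : list) {
            if (item == null) {
                continue;
            }
            String date = dateExtractor.apply(item);
            BigDecimal amount = amountExtractor.apply(item);
            if (date == null || amount == null) {
                continue;
            }
            // Map.merge inserts the amount, or combines it with the existing one
            map.merge(date, amount, BigDecimal::add);
        }
        return map;
    }

    public static void main(String[] args) {
        // Each row is {date, amount}; a String[] stands in for a real DTO
        List<String[]> rows = Arrays.asList(
                new String[]{"2024-01", "100.50"},
                new String[]{"2024-01", "20.25"},
                new String[]{"2024-02", "7"});
        Map<String, BigDecimal> totals =
                createSummingMap(rows, r -> r[0], r -> new BigDecimal(r[1]));
        System.out.println(totals.get("2024-01")); // prints 120.75
    }
}
```

Whether duplicates should be summed, rejected, or resolved by "last one wins" depends on what the underlying queries guarantee, so treat this as one option rather than a drop-in replacement.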

Time complexity analysis

1. The createMap method:

createMap runs in O(k), where k is the length of the list.
For laborCostDetailRes this is O(m), where m is the length of laborCostDetailRes.
For otherCostDetailRes this is O(p), where p is the length of otherCostDetailRes.

2. The mergeCashFlow method:

Iterating over cashFlowResList costs O(n), where n is the length of cashFlowResList.
For each CashFlowRes object, reading a value from laborMap and otherMap costs O(1), because a HashMap lookup takes O(1) time on average.

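Total time complexity

Putting these together, the cost of the whole method is:

The cost of the queries is ignored; we assume the data is already loaded in memory.
The two createMap calls together cost O(m + p).
The merge loop in mergeCashFlow costs O(n).

So the total time complexity is: O(m + p + n)

Merging through a Map is therefore much faster on large inputs, because each lookup takes O(1) time on average instead of a linear scan of the list.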
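
A rough way to see the difference is to run both lookup strategies on the same generated data. The sketch below is only an illustration under simplified assumptions: the class LookupTimingSketch is hypothetical, strings and integers stand in for the response objects, and System.nanoTime timings are not a rigorous benchmark (a harness such as JMH would be more reliable), so no specific numbers are claimed here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LookupTimingSketch {

    // Nested-loop lookup: for each key, scan `dates` and take the value
    // at the index of the first match (0 when nothing matches).
    static List<Integer> nestedLookup(List<String> keys, List<String> dates, List<Integer> values) {
        List<Integer> result = new ArrayList<>();
        for (String key : keys) {
            Integer found = 0;
            for (int i = 0; i < dates.size(); i++) {
                if (key.equals(dates.get(i))) {
                    found = values.get(i);
                    break;
                }
            }
            result.add(found);
        }
        return result;
    }

    // Map lookup: index the dates once, then resolve each key with one Map access.
    // putIfAbsent keeps the first value per date, matching the loop version above.
    static List<Integer> mapLookup(List<String> keys, List<String> dates, List<Integer> values) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < dates.size(); i++) {
            index.putIfAbsent(dates.get(i), values.get(i));
        }
        List<Integer> result = new ArrayList<>();
        for (String key : keys) {
            result.add(index.getOrDefault(key, 0));
        }
        return result;
    }

    public static void main(String[] args) {
        int size = 10_000;
        List<String> dates = new ArrayList<>();
        List<Integer> values = new ArrayList<>();
        for (int i = 0; i < size; i++) {
            dates.add("d" + i);
            values.add(i);
        }

        long t0 = System.nanoTime();
        List<Integer> a = nestedLookup(dates, dates, values);
        long t1 = System.nanoTime();
        List<Integer> b = mapLookup(dates, dates, values);
        long t2 = System.nanoTime();

        System.out.println("same result: " + a.equals(b));
        System.out.println("nested loop ms: " + (t1 - t0) / 1_000_000);
        System.out.println("map ms: " + (t2 - t1) / 1_000_000);
    }
}
```

For very small lists the difference is negligible and the nested loop may even be competitive, so the Map pays off mainly as the lists grow.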

2. Summary

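Aspect	Nested for loops	Map
Time complexity	O(n * (m + p))	O(m + p + n)
Performance	Poor; degrades sharply as data grows	Good; the advantage grows with data size
Code simplicity	Straightforward but verbose	Concise; easy to maintain and extend
Memory overhead	Low	Higher; requires the extra Map structures
Extensibility	Hard to extend; code grows verbose as data sources are added	Easy to extend; just add another Map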
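
As an illustration of the extensibility row, a further data source can be handled by building one more Map and applying it through a setter. This is a sketch under assumptions, not the article's code: the class ExtensibleMergeSketch, the simplified Row class, the apply method, and the taxCost field are all hypothetical.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

class ExtensibleMergeSketch {

    // Simplified stand-in for CashFlowRes, with one extra hypothetical field.
    static class Row {
        String date;
        BigDecimal laborCost;
        BigDecimal otherCost;
        BigDecimal taxCost; // hypothetical third data source

        Row(String date) {
            this.date = date;
        }
    }

    // Applies one date-to-amount Map to every row through the given setter.
    // Rows that are null or have no date are skipped; missing dates get zero.
    static void apply(List<Row> rows, Map<String, BigDecimal> source,
                      BiConsumer<Row, BigDecimal> setter) {
        for (Row row : rows) {
            if (row != null && row.date != null) {
                setter.accept(row, source.getOrDefault(row.date, BigDecimal.ZERO));
            }
        }
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("2024-01"));
        rows.add(new Row("2024-02"));

        Map<String, BigDecimal> labor = new HashMap<>();
        labor.put("2024-01", new BigDecimal("100"));
        Map<String, BigDecimal> tax = new HashMap<>();
        tax.put("2024-02", new BigDecimal("8"));

        // Adding a data source is one more Map plus one more apply call
        apply(rows, labor, (r, v) -> r.laborCost = v);
        apply(rows, tax, (r, v) -> r.taxCost = v);

        System.out.println(rows.get(0).laborCost + " " + rows.get(1).taxCost); // prints 100 8
    }
}
```

Each additional source adds O(size of that source + n) work, so the total stays linear in the combined input size.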