Paginating UNION ALL Views over Large Data Sets

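This article discusses the problems encountered when paginating a UNION ALL view over a large data set in Oracle, and two solutions: querying the base tables directly instead of the view, and using analytic functions.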


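A recent project required pagination over a large data set, and the problem turned out to be fairly tricky. Below are the solutions I arrived at, shared so that we can learn together.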

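When a view is defined with UNION ALL, the traditional ROWNUM pagination pattern, such as
WHERE rownum < M)
WHERE linenum >=N
runs into trouble: when Oracle reaches linenum >=N it is at a loss, and the execution plan goes wrong. For example, assume bmw_users is a UNION ALL view
defined as follows: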
select *
from mv_bmw_users_db1
union all
select  *
from mv_bmw_users_db2
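
The M and N in the ROWNUM pattern above are derived from the page number and page size. A minimal sketch of that arithmetic follows; the helper names `rownum_bounds` and `rownum_page_sql` and the bind names `:m` and `:n` are illustrative assumptions, not part of the original project.

```python
from typing import Tuple


def rownum_bounds(page: int, page_size: int) -> Tuple[int, int]:
    """Return (n, m) so that rows with n <= linenum < m form the 1-based page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    n = (page - 1) * page_size + 1  # first row of the page: linenum >= N
    m = page * page_size + 1        # one past the last row: rownum < M
    return n, m


def rownum_page_sql(inner_sql: str) -> str:
    """Wrap an already ordered query in the two-level ROWNUM pagination pattern."""
    return (
        "SELECT * FROM (\n"
        "  SELECT t.*, rownum AS linenum FROM (\n"
        f"    {inner_sql}\n"
        "  ) t\n"
        "  WHERE rownum < :m\n"
        ")\n"
        "WHERE linenum >= :n"
    )


if __name__ == "__main__":
    print(rownum_bounds(page=3, page_size=20))
    print(rownum_page_sql(
        "SELECT id, nick FROM bmw_users WHERE nick = :nick ORDER BY id"
    ))
```

With a page size of 49, the first page gives N = 1 and M = 50, which are the literal values used in the transcripts in this article.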

If we run the following query against this view, we see:
SQL> select * from
  2  (select rownum linenum,id,nick from
  3  (select id,nick from bmw_users  where nick ='test' order by id)
  4  where rownum < 50)
  5  where linenum >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=20385 Card=49 Bytes=2401)
   1    0   VIEW (Cost=20385 Card=49 Bytes=2401)
   2    1     COUNT (STOPKEY)
   3    2       VIEW (Cost=20385 Card=1728633 Bytes=62230788)
   4    3         SORT (ORDER BY STOPKEY) (Cost=20385 Card=1728633 Bytes=62230788)
   5    4           VIEW OF 'BMW_USERS' (Cost=9278 Card=1728633 Bytes=62230788)
   6    5             UNION-ALL
   7    6               TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB1' (Cost=4639 Card=864090 Bytes=38884050)
   8    6               TABLE ACCESS (FULL) OF 'MV_BMW_USERS_DB2' (Cost=4639 Card=864543 Bytes=38904435)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      97298  consistent gets
      20770  physical reads

          0  redo size
        518  bytes sent via SQL*Net to client
        504  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed
         
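This is a very simple query. There is an index on nick, and both the tables and the indexes have been analyzed, yet Oracle performs full table scans of both base tables and consumes a great deal of resources. At this point Oracle can no longer correctly decide to use the index, so it wrongly falls back to full scans; the statistics show a large number of consistent reads (97298) and physical reads (20770). We ran into this in a real project. Even forcing a hint does not change Oracle's execution plan. That is clearly unacceptable, so we need an approach that actually works.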

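How can this be solved? There are two options. The first is to keep UNION ALL but write it in the query itself, querying the base tables directly instead of the view. The statement above becomes: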
SQL> select * from
  2  (select rownum linenum,id,nick from
  3  (select * from
  4  (select id,nick from MV_BMW_USERS_DB1 where nick ='test'
  5  union all
  6  select id,nick from MV_BMW_USERS_DB1 where nick ='test')
  7  order by id)
  8  where rownum < 50)
  9  where linenum >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=17 Card=2 Bytes=98)
   1    0   VIEW (Cost=17 Card=2 Bytes=98)
   2    1     COUNT (STOPKEY)
   3    2       VIEW (Cost=17 Card=2 Bytes=72)
   4    3         SORT (ORDER BY STOPKEY) (Cost=17 Card=2 Bytes=72)
   5    4           VIEW (Cost=8 Card=2 Bytes=72)
   6    5             UNION-ALL
   7    6               TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
   8    7                 INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
   9    6               TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
  10    9                 INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        553  bytes sent via SQL*Net to client
        504  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          2  rows processed

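The statement is essentially the same, except that it queries the base tables rather than the view. The execution plan changes immediately: the index is now used and the cost drops sharply (from 20385 to 17). Consistent reads fall to just 8 blocks and physical reads to 0. Note that the transcript above names MV_BMW_USERS_DB1 in both branches, and the plan reflects that; to match the view definition, the second branch should read MV_BMW_USERS_DB2.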
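
The base-table rewrite above can be generated rather than hand-written when there are several base tables. The following is only a sketch under stated assumptions: `union_all_page_sql` is a hypothetical helper, the bind names `:m` and `:n` are invented for illustration, and the table and column names are interpolated as identifiers, so they must come from trusted configuration and never from user input.

```python
from typing import Sequence


def union_all_page_sql(base_tables: Sequence[str], columns: str,
                       predicate: str, order_by: str) -> str:
    """Build a ROWNUM-paginated query with the filter repeated in every branch."""
    if not base_tables:
        raise ValueError("at least one base table is required")
    # The predicate is applied inside each branch, against the base table,
    # so each branch can use that table's own index.
    branches = "\n      UNION ALL\n".join(
        f"      SELECT {columns} FROM {table} WHERE {predicate}"
        for table in base_tables
    )
    return (
        "SELECT * FROM (\n"
        "  SELECT u.*, rownum AS linenum FROM (\n"
        "    SELECT * FROM (\n"
        f"{branches}\n"
        f"    ) ORDER BY {order_by}\n"
        "  ) u\n"
        "  WHERE rownum < :m\n"
        ")\n"
        "WHERE linenum >= :n"
    )


if __name__ == "__main__":
    print(union_all_page_sql(
        ["MV_BMW_USERS_DB1", "MV_BMW_USERS_DB2"],
        columns="id, nick",
        predicate="nick = :nick",
        order_by="id",
    ))
```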

The second option is to use an analytic function, rewriting the statement as:
SQL> select * from
  2  (select row_number() over(order by id) rn,id,nick from bmw_users where nick ='test')
  3  where rn <50 and rn >=1;

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=13 Card=1 Bytes=49)
   1    0   VIEW (Cost=13 Card=1 Bytes=49)
   2    1     WINDOW (SORT PUSHED RANK) (Cost=13 Card=1 Bytes=45)
   3    2       VIEW OF 'BMW_USERS' (Cost=4 Card=1 Bytes=45)
   4    3         UNION-ALL (PARTITION)
   5    4           TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB1' (Cost=4 Card=1 Bytes=45)
   6    5             INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK1' (NON-UNIQUE) (Cost=3 Card=1)
   7    4           TABLE ACCESS (BY INDEX ROWID) OF 'MV_BMW_USERS_DB2' (Cost=4 Card=1 Bytes=45)
   8    7             INDEX (RANGE SCAN) OF 'IND_MV_BMW_USERS_NICK2' (NON-UNIQUE) (Cost=3 Card=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads

          0  redo size
        513  bytes sent via SQL*Net to client
        504  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed
As you can see, for the same functionality the analytic-function approach is the simplest, and it also uses the indexes correctly.
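
To see what the row_number() form returns, here is a small runnable sketch that uses Python's built-in sqlite3 module as a stand-in for Oracle. It assumes a SQLite build with window-function support (3.25 or later). The sample rows are invented, and the sketch only illustrates the query's semantics; it says nothing about Oracle's optimizer or index usage.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create two base tables and a UNION ALL view named like the article's."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mv_bmw_users_db1 (id INTEGER, nick TEXT);
        CREATE TABLE mv_bmw_users_db2 (id INTEGER, nick TEXT);
        CREATE VIEW bmw_users AS
            SELECT * FROM mv_bmw_users_db1
            UNION ALL
            SELECT * FROM mv_bmw_users_db2;
    """)
    # Invented sample data, spread across both base tables.
    conn.executemany("INSERT INTO mv_bmw_users_db1 VALUES (?, ?)",
                     [(1, "test"), (3, "other"), (5, "test")])
    conn.executemany("INSERT INTO mv_bmw_users_db2 VALUES (?, ?)",
                     [(2, "test"), (4, "test"), (6, "other")])
    return conn


def page_with_row_number(conn, nick, first, last_exclusive):
    """Return rows with first <= rn < last_exclusive, numbered by id."""
    sql = """
        SELECT rn, id, nick FROM (
            SELECT row_number() OVER (ORDER BY id) AS rn, id, nick
            FROM bmw_users
            WHERE nick = ?
        )
        WHERE rn >= ? AND rn < ?
        ORDER BY rn
    """
    return conn.execute(sql, (nick, first, last_exclusive)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in page_with_row_number(conn, "test", 2, 4):
        print(row)
```

The row numbers are assigned across the whole view after filtering, so a page can mix rows from both base tables while the query itself still names only the view.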

The above is a simple example. Let us now look at a more complex, real-world statement.
The original statement is:
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM
(SELECT * FROM
(SELECT T1.*, rownum as linenum
FROM
(SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,t.topic_distillate,
t.topic_vote,t.topic_status, t.topic_moved_id,TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id, t.topic_views,t.topic_title, t.topic_replies, t.topic_poster
FROM forum_topics t
WHERE t.forum_id = ?
AND t.topic_type < 2
AND t.topic_status <> 3
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC ) T1
WHERE rownum < ?)
WHERE linenum >=?) T2,
forum_posts p2,
bmw_users u1,bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id 
AND u2.user_id = p2.poster_id

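Because bmw_users is a UNION ALL view, this query also ends up with full table scans of the base tables. Rewriting it with explicit UNION ALL is possible but extremely unwieldy: the view is referenced twice (u1 and u2), so every combination of base tables needs its own branch. The UNION ALL rewrite looks like this: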
select * from (
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS')
post_time FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM(
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,t.topic_type,
t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t WHERE t.forum_id = :bind0 
AND t.topic_type < 2 AND t.topic_status <> 3  ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind1)
WHERE linenum >=:bind2
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id 
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick,
u1.user_id, u1.id as userid, u2.nick as user2, u2.user_id as id2,
u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title,
t.topic_replies, t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind3 
AND t.topic_type < 2 AND t.topic_status <> 3 ORDER BY t.topic_type DESC,
t.topic_last_post_id DESC) T1
WHERE rownum < :bind4)
WHERE linenum >=:bind5
) T2,
forum_posts p2,
MV_BMW_USERS_DB1 u1,
MV_BMW_USERS_DB2 u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id 
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id,
u1.id as userid, u2.nick as user2, u2.user_id as id2, u2.id as userid2,
p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/ t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = :bind6  AND t.topic_type < 2 AND t.topic_status <> 3 
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind7)
WHERE linenum >=:bind8
) T2,
forum_posts p2,
MV_BMW_USERS_DB2 u1,
MV_BMW_USERS_DB1 u2
WHERE T2.topic_poster = u1.user_id
AND   T2.topic_last_post_id = p2.post_id 
AND u2.user_id = p2.poster_id
union all
SELECT /*+ ordered use_nl(u1,p2,u2)*/T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT T1.*, rownum as linenum FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/t.topic_id,
t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t WHERE t.forum_id = :bind9 
AND t.topic_type < 2 AND t.topic_status <> 3 
ORDER BY t.topic_type DESC, t.topic_last_post_id DESC) T1
WHERE rownum < :bind10)
WHERE linenum >=:bind11
) T2, forum_posts p2,
MV_BMW_USERS_DB2 u1,MV_BMW_USERS_DB2 u2 WHERE T2.topic_poster =
u1.user_id AND p2.post_id = T2.topic_last_post_id  AND u2.user_id = p2.poster_id
)
order by topic_type DESC,topic_last_post_id desc
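
The rewrite above needs four branches because the view is referenced under two aliases and has two base tables. The sketch below enumerates those combinations; `expand_view_aliases` is a hypothetical helper written for illustration. In general, k base tables and n references to the view give k to the power n branches.

```python
from itertools import product
from typing import Dict, List, Sequence


def expand_view_aliases(aliases: Sequence[str],
                        base_tables: Sequence[str]) -> List[Dict[str, str]]:
    """List every alias-to-base-table assignment a manual rewrite must cover."""
    return [
        dict(zip(aliases, combo))
        for combo in product(base_tables, repeat=len(aliases))
    ]


if __name__ == "__main__":
    branches = expand_view_aliases(
        ["u1", "u2"], ["MV_BMW_USERS_DB1", "MV_BMW_USERS_DB2"]
    )
    for branch in branches:
        print(branch)
    print(len(branches), "branches")
```

The enumeration order matches the order of the four branches in the statement above: (DB1, DB1), (DB1, DB2), (DB2, DB1), (DB2, DB2).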

 

With an analytic function, however, the statement stays very simple and still uses the indexes correctly:
SELECT /*+ ordered use_nl(u1,p2,u2)*/ T2.*,u1.nick, u1.user_id, u1.id as userid,
u2.nick as user2, u2.user_id as id2, u2.id as userid2, p2.post_username as post_username2,
TO_CHAR(p2.post_time,'YYYY-MM-DD HH24:MI:SS') post_time
FROM (
SELECT * FROM (
SELECT /*+ index (t IND_FORUM_TOPICS_FOR_ID)*/
row_number() over(order by t.topic_type DESC, t.topic_last_post_id DESC) rn,
t.topic_id,t.topic_type,t.topic_distillate, t.topic_vote,t.topic_status,t.topic_moved_id,
TO_CHAR(t.topic_time,'YYYY-MM-DD HH24:MI:SS')  topic_time,
t.topic_last_post_id,t.topic_views,t.topic_title, t.topic_replies,
t.topic_poster FROM forum_topics t
WHERE t.forum_id = ?  AND t.topic_type < 2 AND t.topic_status <> 3 
) T1
WHERE rn < ? and rn >= ?
) T2,
forum_posts p2,
bmw_users u1,
bmw_users u2
WHERE T2.topic_poster = u1.user_id
AND p2.post_id = T2.topic_last_post_id 
AND u2.user_id = p2.poster_id


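### Paginating a UNION query with PageHelper

In application code, the result set of a `UNION` query can be paginated with the MyBatis PageHelper plugin. The example below shows how to combine PageHelper with `UNION`.

#### Declare the union query method in a Mapper interface

First, declare a method in the Mapper interface that returns `List<Map<String, Object>>` or a list of a specific entity:

```java
import java.util.List;
import org.apache.ibatis.annotations.Param;

public interface CombinedQueryMapper {
    List<Record> selectUnionResults(@Param("param") String param);
}
```

The `selectUnionResults` method runs the SQL statement that contains the `UNION` keyword[^2].

#### Write the UNION query in the XML mapping file

Next, write the SQL in the corresponding `.xml` file; here we assume a mapping file named `combined_query.xml`:

```xml
<select id="selectUnionResults" parameterType="string" resultType="com.example.Record">
    SELECT * FROM (
        (SELECT col1, col2 FROM tableA WHERE condition = #{param})
        UNION ALL
        (SELECT colX AS col1, colY AS col2 FROM tableB WHERE anotherCondition = #{param})
    ) t
    ORDER BY someColumn
</select>
```

Note that the statement carries no `LIMIT` clause of its own and no trailing semicolon: PageHelper intercepts the query and appends the pagination clause for the configured database dialect, so a hand-written `LIMIT` would conflict with it. If you paginate by hand instead, bind the offset and limit with `#{}` rather than `${}`, because `${}` substitutes text directly into the SQL and is open to SQL injection[^3].

#### Start pagination in the service layer and call the DAO method

Finally, in the service layer (the ServiceImpl class), add the following code to complete the flow:

```java
import java.util.List;
import com.github.pagehelper.PageHelper;
import com.github.pagehelper.PageInfo;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class ExampleServiceImpl implements ExampleService {

    @Autowired
    private CombinedQueryMapper combinedQueryMapper;

    @Override
    public PageInfo<Record> getCombinedRecords(String param, int currentPage, int pageSize) {
        // Start pagination; this applies to the next MyBatis query on this thread
        PageHelper.startPage(currentPage, pageSize);

        // Run the query to fetch the current page of rows
        List<Record> recordList = combinedQueryMapper.selectUnionResults(param);

        // Wrap the result in a PageInfo object carrying the paging metadata
        return new PageInfo<>(recordList);
    }
}
```

The code above shows how PageHelper controls pagination while a custom Mapper runs the `UNION` query. This keeps retrieval of large result sets manageable and the code easier to maintain[^4].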