On ROWNUM and limiting Results

本文探讨了使用ROWNUM进行TOP-N查询处理的方法,通过避免大规模排序提高查询效率。具体介绍了ROWNUM如何仅对所需N行数据进行排序,显著减少资源消耗。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

转摘自: http://www.oracle.com/technology/oramag/oracle/06-sep/o56asktom.html

Top-N Query Processing with ROWNUM

In a top-N query, you are generally interested in taking some complex query, sorting it, and then retrieving just the first N rows (the top N rows). ROWNUM has a top-N optimization that facilitates this type of query. You can use ROWNUM to avoid a massive sort of large sets. I'll discuss how it does this conceptually and then look at an example.

Suppose you have a query in this form:


select ...
from ...
where ...
order by columns;


Assume that this query returns a lot of data: thousands, hundreds of thousands, or more rows. However, you are interested only in the top N—say the top 10 or top 100. There are two ways to approach this:
1).Have the client application run that query and fetch just the first N rows.
2).Use that query as an inline view, and use ROWNUM to limit the results, as in SELECT * FROM (your_query_here) WHERE ROWNUM <= N.

The second approach is by far superior to the first, for two reasons. The lesser of the two reasons is that it requires less work by the client, because the database takes care of limiting the result set. The more important reason is the special processing the database can do to give you just the top N rows. Using the top-N query means that you have given the database extra information. You have told it, "I'm interested only in getting N rows; I'll never consider the rest." Now, that doesn't sound too earth-shattering until you think about sorting—how sorts work and what the server would need to do. Let's walk through the two approaches with a sample query:

select *
from t
order by unindexed_column;

Now, assume that T is a big table, with more than one million records, and each record is "fat"—say, 100 or more bytes. Also assume that UNINDEXED_COLUMN is, as its name implies, a column that is not indexed. And assume that you are interested in getting just the first 10 rows. Oracle Database would do the following:

1. Run a full-table scan on T.
2. Sort T by UNINDEXED_COLUMN. This is a full sort.
3. Presumably run out of sort area memory and need to swap temporary extents to disk.
4. Merge the temporary extents back to get the first 10 records when they are requested.
5. Clean up (release) the temporary extents as you are finished with them.

Now, that is a lot of I/O. Oracle Database has most likely copied the entire table into TEMP and written it out, just to get the first 10 rows.

Next, let's look at what Oracle Database can do conceptually with a top-N query:


select *
from
(select *
from t
order by unindexed_column)
where ROWNUM < :N;

In this case, Oracle Database will take these steps:

1. Run a full-table scan on T, as before (you cannot avoid this step).
2. In an array of :N elements (presumably in memory this time), sort only :N rows.


The first N rows will populate this array of rows in sorted order. When the N +1 row is fetched, it will be compared to the last row in the array. If it would go into slot N +1 in the array, it gets thrown out. Otherwise, it is added to this array and sorted and one of the existing rows is discarded. Your sort area holds N rows maximum, so instead of sorting one million rows, you sort N rows.

This seemingly small detail of using an array concept and sorting just N rows can lead to huge gains in performance and resource usage. It takes a lot less RAM to sort 10 rows than it does to sort one million rows (not to mention TEMP space usage).
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值