Heap-Organized table 和 Index-Organized table 说明

官网的两个连接如下:

Tables and Table Clusters

http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/tablecls.htm#i20438

Indexes and Index-Organized Tables

http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/indexiot.htm#CBBFIFAB

2个文章讲的比较详细,在这里我将一些内容粘贴出来。

You can create a relational table with the following organizational characteristics:

1A heap-organized table does not store rows in any particular order. The CREATE TABLE statement creates a heap-organized table by default.

2An index-organized table orders rows according to the primary key values. For some applications, index-organized tables enhance performance and use disk space more efficiently. See "Overview of Index-Organized Tables".

Rowid Data Types

Every row stored in the database has an address. Oracle Database uses a ROWID data type to store the address (rowid) of every row in the database. Rowids fall into the following categories:

1Physical rowids store the addresses of rows in heap-organized tables, table clusters, and table and index partitions.

2Logical rowids store the addresses of rows in index-organized tables.

3Foreign rowids are identifiers in foreign tables, such as DB2 tables accessed through a gateway. They are not standard Oracle Database rowids.

A data type called the universal rowid, or UROWID, supports all kinds of rowids.

Use of Rowids

Oracle Database uses rowids internally for the construction of indexes. A B-tree index, which is the most common type, contains an ordered list of keys divided into ranges. Each key is associated with a rowid that points to the associated row's address for fast access. End users and application developers can also use rowids for several important functions:

1Rowids are the fastest means of accessing particular rows.

2Rowids provide the ability to see how a table is organized.

3Rowids are unique identifiers for rows in a given table.

You can also create tables with columns defined using the ROWID data type. For example, you can define an exception table with a column of data type ROWID to store the rowids of rows that violate integrity constraints. Columns defined using the ROWID data type behave like other table columns: values can be updated, and so on.

Overview of Table Clusters

A table cluster is a group of tables that share common columns and store related data in the same blocks. When tables are clustered, a single data block can contain rows from multiple tables. For example, a block can store rows from both the employees and departments tables rather than from only a single table.

The cluster key is the column or columns that the clustered tables have in common. For example, the employees and departments tables share the department_id column. You specify the cluster key when creating the table cluster and when creating every table added to the table cluster.

The cluster key value is the value of the cluster key columns for a particular set of rows. All data that contains the same cluster key value, such as department_id=20, is physically stored together. Each cluster key value is stored only once in the cluster and the cluster index, no matter how many rows of different tables contain the value.

For an analogy, suppose an HR manager has two book cases: one with boxes of employees folders and the other with boxes of departments folders. Users often ask for the folders for all employees in a particular department. To make retrieval easier, the manager rearranges all the boxes in a single book case. She divides the boxes by department ID. Thus, all folders for employees in department 20 and the folder for department 20 itself are in one box; the folders for employees in department 100 and the folder for department 100 are in a different box, and so on.

You can consider clustering tables when they are primarily queried (but not modified) and records from the tables are frequently queried together or joined. Because table clusters store related rows of different tables in the same data blocks, properly used table clusters offer the following benefits over nonclustered tables:

1Disk I/O is reduced for joins of clustered tables.

2Access time improves for joins of clustered tables.

3Less storage is required to store related table and index data because the cluster key value is not stored repeatedly for each row.

Typically, clustering tables is not appropriate in the following situations:

1The tables are frequently updated.

2The tables frequently require a full table scan.

3The tables require truncating.

Overview of Index-Organized Tables

An index-organized table is a table stored in a variation of a B-tree index structure.

In a heap-organized table, rows are inserted where they fit.

In an index-organized table, rows are stored in an index defined on the primary key for the table.

Each index entry in the B-tree also stores the non-key column values. Thus, the index is the data, and the data is the index. Applications manipulate index-organized tables just like heap-organized tables, using SQL statements.

For an analogy of an index-organized table, suppose a human resources manager has a book case of cardboard boxes. Each box is labeled with a number—1, 2, 3, 4, and so on—but the boxes do not sit on the shelves in sequential order. Instead, each box contains a pointer to the shelf location of the next box in the sequence.

Folders containing employee records are stored in each box. The folders are sorted by employee ID. Employee King has ID 100, which is the lowest ID, so his folder is at the bottom of box 1. The folder for employee 101 is on top of 100, 102 is on top of 101, and so on until box 1 is full. The next folder in the sequence is at the bottom of box 2.

In this analogy, ordering folders by employee ID makes it possible to search efficiently for folders without having to maintain a separate index. Suppose a user requests the records for employees 107, 120, and 122. Instead of searching an index in one step and retrieving the folders in a separate step, the manager can search the folders in sequential order and retrieve each folder as found.

Index-organized tables provide faster access to table rows by primary key or a valid prefix of the key. The presence of non-key columns of a row in the leaf block avoids an additional data block I/O.

For example, the salary of employee 100 is stored in the index row itself. Also, because rows are stored in primary key order, range access by the primary key or prefix involves minimal block I/Os. Another benefit is the avoidance of the space overhead of a separate primary key index.

Index-organized tables are useful when related pieces of data must be stored together or data must be physically stored in a specific order. This type of table is often used for information retrieval, spatial (see "Overview of Oracle Spatial"), and OLAP applications (see "OLAP").

Index-Organized Table Characteristics

The database system performs all operations on index-organized tables by manipulating the B-tree index structure. Table 3-4 summarizes the differences between index-organized tables and heap-organized tables.

Table 3-4 Comparison of Heap-Organized Tables with Index-Organized Tables

Heap-Organized Table

Index-Organized Table

The rowid uniquely identifies a row. Primary key constraint may optionally be defined.

Primary key uniquely identifies a row. Primary key constraint must be defined.

Physical rowid in ROWID pseudocolumn allows building secondary indexes.

Logical rowid in ROWID pseudocolumn allows building secondary indexes.

Individual rows may be accessed directly by rowid.

Access to individual rows may be achieved indirectly by primary key.

Sequential full table scan returns all rows in some order.

A full index scan or fast full index scan returns all rows in some order.

Can be stored in a table cluster with other tables.

Cannot be stored in a table cluster.

Can contain a column of the LONG data type and columns of LOB data types.

Can contain LOB columns but not LONG columns.

Can contain virtual columns (only relational heap tables are supported).

Cannot contain virtual columns.

Figure 3-3 illustrates the structure of an index-organized departments table. The leaf blocks contain the rows of the table, ordered sequentially by primary key. For example, the first value in the first leaf block shows a department ID of 20, department name of Marketing, manager ID of 201, and location ID of 1800.

Figure 3-3 Index-Organized Table


Description of "Figure 3-3 Index-Organized Table"

An index-organized table stores all data in the same structure and does not need to store the rowid. As shown in Figure 3-3, leaf block 1 in an index-organized table might contain entries as follows, ordered by primary key:

20,Marketing,201,1800

30,Purchasing,114,1700

Leaf block 2 in an index-organized table might contain entries as follows:

50,Shipping,121,1500

60,IT,103,1400

A scan of the index-organized table rows in primary key order reads the blocks in the following sequence:

1.Block 1

2.Block 2

To contrast data access in a heap-organized table to an index-organized table, suppose block 1 of a heap-organized departments table segment contains rows as follows:

50,Shipping,121,1500

20,Marketing,201,1800

Block 2 contains rows for the same table as follows:

30,Purchasing,114,1700

60,IT,103,1400

A B-tree index leaf block for this heap-organized table contains the following entries, where the first value is the primary key and the second is the rowid:

20,AAAPeXAAFAAAAAyAAD

30,AAAPeXAAFAAAAAyAAA

50,AAAPeXAAFAAAAAyAAC

60,AAAPeXAAFAAAAAyAAB

A scan of the table rows in primary key order reads the table segment blocks in the following sequence:

1. Block 1

2. Block 2

3. Block 1

4. Block 2

Thus, the number of block I/Os in this example is double the number in the index-organized example.

See Also:

· "Table Organization"

· "Introduction to Logical Storage Structures"

Index-Organized Tables with Row Overflow Area

When creating an index-organized table, you can specify a separate segment as a row overflow area. In index-organized tables, B-tree index entries can be large because they contain an entire row, so a separate segment to contain the entries is useful. In contrast, B-tree entries are usually small because they consist of the key and rowid.

If a row overflow area is specified, then the database can divide a row in an index-organized table into the following parts:

1The index entry

This part contains column values for all the primary key columns, a physical rowid that points to the overflow part of the row, and optionally a few of the non-key columns. This part is stored in the index segment.

2The overflow part

This part contains column values for the remaining non-key columns. This part is stored in the overflow storage area segment.

Secondary Indexes on Index-Organized Tables

A secondary index is an index on an index-organized table. In a sense, it is an index on an index. The secondary index is an independent schema object and is stored separately from the index-organized table.

As explained in "Rowid Data Types", Oracle Database uses row identifiers called logical rowids for index-organized tables. A logical rowid is a base64-encoded representation of the table primary key. The logical rowid length depends on the primary key length.

Rows in index leaf blocks can move within or between blocks because of insertions. Rows in index-organized tables do not migrate as heap-organized rows do (see "Chained and Migrated Rows"). Because rows in index-organized tables do not have permanent physical addresses, the database uses logical rowids based on primary key.

For example, assume that the departments table is index-organized. The location_id column stores the ID of each department. The table stores rows as follows, with the last value as the location ID:

10,Administration,200,1700

20,Marketing,201,1800

30,Purchasing,114,1700

40,Human Resources,203,2400

A secondary index on the location_id column might have index entries as follows, where the value following the comma is the logical rowid:

1700,*BAFAJqoCwR/+

1700,*BAFAJqoCwQv+

1800,*BAFAJqoCwRX+

2400,*BAFAJqoCwSn+

Secondary indexes provide fast and efficient access to index-organized tables using columns that are neither the primary key nor a prefix of the primary key. For example, a query of the names of departments whose ID is greater than 1700 could use the secondary index to speed data access.

Logical Rowids and Physical Guesses

Secondary indexes use the logical rowids to locate table rows. A logical rowid includes a physical guess, which is the physical rowid of the index entry when it was first made. Oracle Database can use physical guesses to probe directly into the leaf block of the index-organized table, bypassing the primary key search. When the physical location of a row changes, the logical rowid remains valid even if it contains a physical guess that is stale.

For a heap-organized table, access by a secondary index involves a scan of the secondary index and an additional I/O to fetch the data block containing the row. For index-organized tables, access by a secondary index varies, depending on the use and accuracy of physical guesses:

1. Without physical guesses, access involves two index scans: a scan of the secondary index followed by a scan of the primary key index.

2. With physical guesses, access depends on their accuracy:

1With accurate physical guesses, access involves a secondary index scan and an additional I/O to fetch the data block containing the row.

2With inaccurate physical guesses, access involves a secondary index scan and an I/O to fetch the wrong data block (as indicated by the guess), followed by an index unique scan of the index organized table by primary key value.

Bitmap Indexes on Index-Organized Tables

A secondary index on an index-organized table can be a bitmap index. As explained in "Bitmap Indexes", a bitmap index stores a bitmap for each index key.

When bitmap indexes exist on an index-organized table, all the bitmap indexes use a heap-organized mapping table. The mapping table stores the logical rowids of the index-organized table. Each mapping table row stores one logical rowid for the corresponding index-organized table row.

The database accesses a bitmap index using a search key. If the database finds the key, then the bitmap entry is converted to a physical rowid.

With heap-organized tables, the database uses the physical rowid to access the base table.

With index-organized tables, the database uses the physical rowid to access the mapping table, which in turn yields a logical rowid that the database uses to access the index-organized table. Figure 3-4 illustrates index access for a query of the departments_iot table.

Figure 3-4 Bitmap Index on Index-Organized Table



-------------------------------------------------------------------------------------------------------

Blog http://blog.youkuaiyun.com/tianlesoftware

Email: dvd.dba@gmail.com

DBA1 群:62697716(); DBA2 群:62697977() DBA3 群:62697850()

DBA 超级群:63306533(); DBA4 群: 83829929 DBA5群: 142216823

聊天 群:40132017 聊天2群:69087192

--加群需要在备注说明Oracle表空间和数据文件的关系,否则拒绝申请

<think> 我们正在讨论Oracle数据库中的聚集索引非聚集索引。根据引用内容,聚集索引非聚集索引在定义、存储方式使用场景上有显著区别。尽管引用内容主要基于通用数据库概念,但在Oracle中,这些概念有特定的实现术语。 在Oracle中,聚集索引(Clustered Index)实际上被称为“索引组织表(Index-Organized Table, IOT)”,而非聚集索引则对应普通索引(通常称为堆组织表的索引)。 让我们详细区分: 1. **聚集索引(索引组织表 - IOT)**: - 在IOT中,表的数据存储在索引结构中,即数据行本身按索引键的顺序存储。 - 类似于字典的正文,索引键(如拼音)直接对应数据位置。 - 一个表只能有一个聚集索引(因为数据只能按一种物理顺序存储)。 - 适用于基于主键的查询,特别是范围扫描,因为连续键值的数据物理上相邻。 - 插入更新操作可能较慢,因为需要维护数据物理顺序。 2. **非聚集索引(普通索引)**: - 非聚集索引是独立于表数据存储的结构,包含索引键值指向数据行的指针(ROWID)。 - 类似于字典的偏旁部首查字法,先找到地址,再根据地址找到数据。 - 一个表可以有多个非聚集索引。 - 查询时,通过索引找到ROWID,再通过ROWID访问表中的数据(如果查询的列不在索引中,则需要回表)。 - 插入更新操作相对较快,因为不影响数据物理顺序。 引用[3]中提到的区别在Oracle中同样适用,但术语稍有不同: - **存储方式**: - 聚集索引(IOT):索引与数据存储在一起,按索引键排序。 - 非聚集索引:索引单独存储,数据存储在堆中(默认)或IOT中(此时非聚集索引的指针指向IOT的主键)。 - **数量限制**: - 聚集索引:一个表只能有一个(因为数据只能按一种物理顺序存储)。 - 非聚集索引:可以有多个。 - **查询效率**: - 聚集索引对于主键查询范围查询效率高,因为数据物理连续,减少I/O。 - 非聚集索引对于等值查询也高效,但如果需要回表,则会有额外I/O。 - **写性能**: - 聚集索引的插入更新可能引起数据移动(分裂等),影响性能。 - 非聚集索引的插入更新只需要更新索引结构,相对较快,但多个索引会降低写性能。 在Oracle中创建聚集索引(即索引组织表)的示例: ```sql CREATE TABLE employees ( emp_id NUMBER PRIMARY KEY, name VARCHAR2(100), dept_id NUMBER ) ORGANIZATION INDEX; -- 指定为索引组织表 ``` 创建非聚集索引(普通索引)的示例: ```sql CREATE INDEX idx_employees_dept ON employees(dept_id); ``` **何时使用**: - 聚集索引(IOT)适用于: - 表主要通过主键访问。 - 数据按照主键顺序访问(范围查询)。 - 数据量不大且更新不频繁(避免分裂)。 - 非聚集索引适用于: - 需要提高非主键列的查询速度。 - 需要在一个表上建立多个索引。 - 表更新频繁,且不能承受IOT的维护开销。 注意:在Oracle中,默认的表是堆组织表(Heap-Organized Table),其上的索引都是非聚集索引(除了IOT上的二级索引)。IOT是一种特殊的表,其数据存储在索引结构中。 总结:Oracle中的聚集索引通过索引组织表实现,数据按索引键物理排序;非聚集索引是独立的索引结构,存储键值指向数据的指针。正确选择索引类型对性能优化至关重要[^3]。 §§相关问题§§: 1. Oracle中索引组织表(IOT)的优缺点是什么? 2. 在Oracle中如何为堆组织表创建非聚集索引? 3. Oracle的索引组织表普通表在存储上有何不同? </think>在Oracle数据库中,聚集索引非聚集索引(通常称为聚簇索引普通索引)的核心区别如下: ### 一、根本区别 1. **聚集索引(聚簇索引)** - **物理存储顺序**:表数据按索引键值**物理排序存储** 例如:`CREATE CLUSTER emp_dept_cluster (deptno NUMBER)` - **唯一性**:一个表**只能有一个**聚簇索引 - **结构**:索引叶节点直接存储**数据行**(类似字典正文)[^3] 2. **非聚集索引(普通索引)** - **逻辑指针**:索引与数据分离存储,叶节点存储**ROWID指针**指向数据位置 例如:`CREATE INDEX idx_emp_name ON emp(ename)` - **数量**:一个表可创建**多个非聚集索引** - **结构**:索引结构独立于数据存储(类似字典偏旁部首查字法)[^3] ```sql -- 聚簇索引创建示例 CREATE CLUSTER emp_cluster (deptno NUMBER); CREATE INDEX idx_cluster ON CLUSTER emp_cluster; -- 非聚集索引创建示例 CREATE INDEX idx_emp_salary ON employees(salary); ``` ### 二、性能对比 | **特性** | 聚集索引 | 非聚集索引 | |------------------|-----------------------------|---------------------------| | **查询速度** | 范围查询快(物理连续) | 等值查询快 | | **插入/更新成本** | 高(需移动数据物理位置) | 低(仅更新指针) | | **存储空间** | 索引与数据一体 | 额外存储索引结构 | | **适用场景** | 主键、频繁范围查询的列 | WHERE条件列、外键列 | ### 三、选择原则 1. **优先聚集索引** - 主键列(如用户ID) - 经常按范围查询的列(如订单日期) - 数据分组/排序频繁的列 2. **优先非聚集索引** - 外键列(如部门编号) - WHERE条件中的非主键列 - 值重复率低的列(高选择性) > **关键区别示例**: > 当执行`SELECT * FROM emp WHERE deptno BETWEEN 10 AND 30`时: > - 聚簇索引:直接读取连续物理块(3次I/O) > - 普通索引:先查索引再回表(可能10+次I/O)[^1][^3] ### 四、Oracle特殊机制 1. **索引组织表(IOT)** Oracle通过`ORGANIZATION INDEX`实现真聚集索引: ```sql CREATE TABLE iot_emp ( empno NUMBER PRIMARY KEY, ename VARCHAR2(50) ) ORGANIZATION INDEX; -- 数据按主键物理排序 ``` 2. **聚簇表** 多个表共享存储结构(减少连接开销): ```sql CREATE CLUSTER emp_dept_cluster (deptno NUMBER); CREATE TABLE dept (...) CLUSTER emp_dept_cluster(deptno); CREATE TABLE emp (...) CLUSTER emp_dept_cluster(deptno); ``` ### 五、使用注意事项 1. 聚集索引的**更新代价高**,频繁写入的列避免使用 2. 非聚集索引可能引发**回表查询**(Covering Index可优化) 3. 超过**30%数据量**的范围查询,聚集索引效率可能下降 > 经验法则:OLTP系统多用非聚集索引,数据仓库可合理使用聚集索引[^2][^3]
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值