Heap-Organized table 和 Index-Organized table 说明

最新推荐文章于 2024-03-24 23:59:33 发布

最新推荐文章于 2024-03-24 23:59:33 发布 · 238 阅读

文章标签：

#Oracle #Access #SQL Server #OS #DB2

官网的两个连接如下：

Tables and Table Clusters

http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/tablecls.htm#i20438

Indexes and Index-Organized Tables

http://download.oracle.com/docs/cd/E11882_01/server.112/e16508/indexiot.htm#CBBFIFAB

这2个文章讲的比较详细，在这里我将一些内容粘贴出来。

You can create a relational table with the following organizational characteristics:

（1）A heap-organized table does not store rows in any particular order. The CREATE TABLE statement creates a heap-organized table by default.

（2）An index-organized table orders rows according to the primary key values. For some applications, index-organized tables enhance performance and use disk space more efficiently. See "Overview of Index-Organized Tables".

Rowid Data Types

Every row stored in the database has an address. Oracle Database uses a ROWID data type to store the address (rowid) of every row in the database. Rowids fall into the following categories:

（1）Physical rowids store the addresses of rows in heap-organized tables, table clusters, and table and index partitions.

（2）Logical rowids store the addresses of rows in index-organized tables.

（3）Foreign rowids are identifiers in foreign tables, such as DB2 tables accessed through a gateway. They are not standard Oracle Database rowids.

A data type called the universal rowid, or UROWID, supports all kinds of rowids.

Use of Rowids

Oracle Database uses rowids internally for the construction of indexes. A B-tree index, which is the most common type, contains an ordered list of keys divided into ranges. Each key is associated with a rowid that points to the associated row's address for fast access. End users and application developers can also use rowids for several important functions:

（1）Rowids are the fastest means of accessing particular rows.

（2）Rowids provide the ability to see how a table is organized.

（3）Rowids are unique identifiers for rows in a given table.

You can also create tables with columns defined using the ROWID data type. For example, you can define an exception table with a column of data type ROWID to store the rowids of rows that violate integrity constraints. Columns defined using the ROWID data type behave like other table columns: values can be updated, and so on.

Overview of Table Clusters

A table cluster is a group of tables that share common columns and store related data in the same blocks. When tables are clustered, a single data block can contain rows from multiple tables. For example, a block can store rows from both the employees and departments tables rather than from only a single table.

The cluster key is the column or columns that the clustered tables have in common. For example, the employees and departments tables share the department_id column. You specify the cluster key when creating the table cluster and when creating every table added to the table cluster.

The cluster key value is the value of the cluster key columns for a particular set of rows. All data that contains the same cluster key value, such as department_id=20, is physically stored together. Each cluster key value is stored only once in the cluster and the cluster index, no matter how many rows of different tables contain the value.

For an analogy, suppose an HR manager has two book cases: one with boxes of employees folders and the other with boxes of departments folders. Users often ask for the folders for all employees in a particular department. To make retrieval easier, the manager rearranges all the boxes in a single book case. She divides the boxes by department ID. Thus, all folders for employees in department 20 and the folder for department 20 itself are in one box; the folders for employees in department 100 and the folder for department 100 are in a different box, and so on.

You can consider clustering tables when they are primarily queried (but not modified) and records from the tables are frequently queried together or joined. Because table clusters store related rows of different tables in the same data blocks, properly used table clusters offer the following benefits over nonclustered tables:

（1）Disk I/O is reduced for joins of clustered tables.

（2）Access time improves for joins of clustered tables.

（3）Less storage is required to store related table and index data because the cluster key value is not stored repeatedly for each row.

Typically, clustering tables is not appropriate in the following situations:

（1）The tables are frequently updated.

（2）The tables frequently require a full table scan.

（3）The tables require truncating.

Overview of Index-Organized Tables

An index-organized table is a table stored in a variation of a B-tree index structure.

In a heap-organized table, rows are inserted where they fit.

In an index-organized table, rows are stored in an index defined on the primary key for the table.

Each index entry in the B-tree also stores the non-key column values. Thus, the index is the data, and the data is the index. Applications manipulate index-organized tables just like heap-organized tables, using SQL statements.

For an analogy of an index-organized table, suppose a human resources manager has a book case of cardboard boxes. Each box is labeled with a number—1, 2, 3, 4, and so on—but the boxes do not sit on the shelves in sequential order. Instead, each box contains a pointer to the shelf location of the next box in the sequence.

Folders containing employee records are stored in each box. The folders are sorted by employee ID. Employee King has ID 100, which is the lowest ID, so his folder is at the bottom of box 1. The folder for employee 101 is on top of 100, 102 is on top of 101, and so on until box 1 is full. The next folder in the sequence is at the bottom of box 2.

In this analogy, ordering folders by employee ID makes it possible to search efficiently for folders without having to maintain a separate index. Suppose a user requests the records for employees 107, 120, and 122. Instead of searching an index in one step and retrieving the folders in a separate step, the manager can search the folders in sequential order and retrieve each folder as found.

Index-organized tables provide faster access to table rows by primary key or a valid prefix of the key. The presence of non-key columns of a row in the leaf block avoids an additional data block I/O.

For example, the salary of employee 100 is stored in the index row itself. Also, because rows are stored in primary key order, range access by the primary key or prefix involves minimal block I/Os. Another benefit is the avoidance of the space overhead of a separate primary key index.

Index-organized tables are useful when related pieces of data must be stored together or data must be physically stored in a specific order. This type of table is often used for information retrieval, spatial (see "Overview of Oracle Spatial"), and OLAP applications (see "OLAP").

Index-Organized Table Characteristics

The database system performs all operations on index-organized tables by manipulating the B-tree index structure. Table 3-4 summarizes the differences between index-organized tables and heap-organized tables.

Table 3-4 Comparison of Heap-Organized Tables with Index-Organized Tables

Heap-Organized Table	Index-Organized Table
The rowid uniquely identifies a row. Primary key constraint may optionally be defined.	Primary key uniquely identifies a row. Primary key constraint must be defined.
Physical rowid in ROWID pseudocolumn allows building secondary indexes.	Logical rowid in ROWID pseudocolumn allows building secondary indexes.
Individual rows may be accessed directly by rowid.	Access to individual rows may be achieved indirectly by primary key.
Sequential full table scan returns all rows in some order.	A full index scan or fast full index scan returns all rows in some order.
Can be stored in a table cluster with other tables.	Cannot be stored in a table cluster.
Can contain a column of the LONG data type and columns of LOB data types.	Can contain LOB columns but not LONG columns.
Can contain virtual columns (only relational heap tables are supported).	Cannot contain virtual columns.

Figure 3-3 illustrates the structure of an index-organized departments table. The leaf blocks contain the rows of the table, ordered sequentially by primary key. For example, the first value in the first leaf block shows a department ID of 20, department name of Marketing, manager ID of 201, and location ID of 1800.

Figure 3-3 Index-Organized Table

Description of "Figure 3-3 Index-Organized Table"

An index-organized table stores all data in the same structure and does not need to store the rowid. As shown in Figure 3-3, leaf block 1 in an index-organized table might contain entries as follows, ordered by primary key:

20,Marketing,201,1800

30,Purchasing,114,1700

Leaf block 2 in an index-organized table might contain entries as follows:

50,Shipping,121,1500

60,IT,103,1400

A scan of the index-organized table rows in primary key order reads the blocks in the following sequence:

1.Block 1

2.Block 2

To contrast data access in a heap-organized table to an index-organized table, suppose block 1 of a heap-organized departments table segment contains rows as follows:

50,Shipping,121,1500

20,Marketing,201,1800

Block 2 contains rows for the same table as follows:

30,Purchasing,114,1700

60,IT,103,1400

A B-tree index leaf block for this heap-organized table contains the following entries, where the first value is the primary key and the second is the rowid:

20,AAAPeXAAFAAAAAyAAD

30,AAAPeXAAFAAAAAyAAA

50,AAAPeXAAFAAAAAyAAC

60,AAAPeXAAFAAAAAyAAB

A scan of the table rows in primary key order reads the table segment blocks in the following sequence:

1. Block 1

2. Block 2

3. Block 1

4. Block 2

Thus, the number of block I/Os in this example is double the number in the index-organized example.

See Also:

· "Table Organization"

· "Introduction to Logical Storage Structures"

Index-Organized Tables with Row Overflow Area

When creating an index-organized table, you can specify a separate segment as a row overflow area. In index-organized tables, B-tree index entries can be large because they contain an entire row, so a separate segment to contain the entries is useful. In contrast, B-tree entries are usually small because they consist of the key and rowid.

If a row overflow area is specified, then the database can divide a row in an index-organized table into the following parts:

（1）The index entry

This part contains column values for all the primary key columns, a physical rowid that points to the overflow part of the row, and optionally a few of the non-key columns. This part is stored in the index segment.

（2）The overflow part

This part contains column values for the remaining non-key columns. This part is stored in the overflow storage area segment.

Secondary Indexes on Index-Organized Tables

A secondary index is an index on an index-organized table. In a sense, it is an index on an index. The secondary index is an independent schema object and is stored separately from the index-organized table.

As explained in "Rowid Data Types", Oracle Database uses row identifiers called logical rowids for index-organized tables. A logical rowid is a base64-encoded representation of the table primary key. The logical rowid length depends on the primary key length.

Rows in index leaf blocks can move within or between blocks because of insertions. Rows in index-organized tables do not migrate as heap-organized rows do (see "Chained and Migrated Rows"). Because rows in index-organized tables do not have permanent physical addresses, the database uses logical rowids based on primary key.

For example, assume that the departments table is index-organized. The location_id column stores the ID of each department. The table stores rows as follows, with the last value as the location ID:

10,Administration,200,1700

20,Marketing,201,1800

30,Purchasing,114,1700

40,Human Resources,203,2400

A secondary index on the location_id column might have index entries as follows, where the value following the comma is the logical rowid:

1700,*BAFAJqoCwR/+

1700,*BAFAJqoCwQv+

1800,*BAFAJqoCwRX+

2400,*BAFAJqoCwSn+

Secondary indexes provide fast and efficient access to index-organized tables using columns that are neither the primary key nor a prefix of the primary key. For example, a query of the names of departments whose ID is greater than 1700 could use the secondary index to speed data access.

Logical Rowids and Physical Guesses

Secondary indexes use the logical rowids to locate table rows. A logical rowid includes a physical guess, which is the physical rowid of the index entry when it was first made. Oracle Database can use physical guesses to probe directly into the leaf block of the index-organized table, bypassing the primary key search. When the physical location of a row changes, the logical rowid remains valid even if it contains a physical guess that is stale.

For a heap-organized table, access by a secondary index involves a scan of the secondary index and an additional I/O to fetch the data block containing the row. For index-organized tables, access by a secondary index varies, depending on the use and accuracy of physical guesses:

1. Without physical guesses, access involves two index scans: a scan of the secondary index followed by a scan of the primary key index.

2. With physical guesses, access depends on their accuracy:

（1）With accurate physical guesses, access involves a secondary index scan and an additional I/O to fetch the data block containing the row.

（2）With inaccurate physical guesses, access involves a secondary index scan and an I/O to fetch the wrong data block (as indicated by the guess), followed by an index unique scan of the index organized table by primary key value.

Bitmap Indexes on Index-Organized Tables

A secondary index on an index-organized table can be a bitmap index. As explained in "Bitmap Indexes", a bitmap index stores a bitmap for each index key.

When bitmap indexes exist on an index-organized table, all the bitmap indexes use a heap-organized mapping table. The mapping table stores the logical rowids of the index-organized table. Each mapping table row stores one logical rowid for the corresponding index-organized table row.

The database accesses a bitmap index using a search key. If the database finds the key, then the bitmap entry is converted to a physical rowid.

With heap-organized tables, the database uses the physical rowid to access the base table.

With index-organized tables, the database uses the physical rowid to access the mapping table, which in turn yields a logical rowid that the database uses to access the index-organized table. Figure 3-4 illustrates index access for a query of the departments_iot table.

Figure 3-4 Bitmap Index on Index-Organized Table