About Hash Clusters
Oracle Database uses a hash function to generate a distribution of numeric values, called hash values, that are based on specific cluster key values. The key of a hash cluster, like the key of an index cluster, can be a single column or composite key (multiple column key). To find or store a row in a hash cluster, the database applies the hash function to the cluster key value of the row. The resulting hash value corresponds to a data block in the cluster, which the database then reads or writes on behalf of the issued statement.
oracle用hash函数生成hash值,hash值是基于特定的聚簇值的影响。数据库用hash value对应相应的数据块。
With an indexed table or indexed cluster, Oracle Database locates table rows using key values stored in a separate index. To find or store a row in an indexed table or table cluster, the database must perform at least two I/Os:
One or more I/Os to find or store the key value in the index
Another I/O to read or write the row in the table or table cluster
Hashing is an optional way of storing table data to improve the performance of data retrieval.
Hash clusters may be beneficial when the following conditions are met:
A table is queried much more often than modified.
The hash key column is queried frequently with equality conditions, for example, WHERE department_id=20. For such queries, the cluster key value is hashed. The hash key value points directly to the disk area that stores the rows.
You can reasonably guess the number of hash keys and the size of the data stored with each key value.
CREATE CLUSTER employees_departments_cluster
(department_id NUMBER(4))
SIZE 8192 HASHKEYS 100;
Hash Cluster Storage
Oracle Database allocates space for a hash cluster differently from an indexed cluster. In Example 2-9, HASHKEYS specifies the number of departments likely to exist, whereas SIZE specifies the size of the data associated with each department. The database computes a storage space value based on the following formula:
HASHKEYS * SIZE / database_block_size
Thus, if the block size is 4096 bytes in Example 2-9, then the database allocates at least 200 blocks to the hash cluster.
Oracle Database does not limit the number of hash key values that you can insert into the cluster. For example, even though HASHKEYS is 100, nothing prevents you from inserting 200 unique departments in the departments table. However, the efficiency of the hash cluster retrieval diminishes when the number of hash values exceeds the number of hash keys.
To illustrate the retrieval issues, assume that block 100 is completely full with rows for department 20. A user inserts a new department with department_id 43 into the departments table. The number of departments exceeds the HASHKEYS value, so the database hashes department_id 43 to hash value 77, which is the same hash value used for department_id 20. Hashing multiple input values to the same output value is called a hash collision.
When users insert rows into the cluster for department 43, the database cannot store these rows in block 100, which is full. The database links block 100 to a new overflow block, say block 200, and stores the inserted rows in the new block. Both block 100 and 200 are now eligible to store data for either department. As shown in Figure 2-8, a query of either department 20 or 43 now requires two I/Os to retrieve the data: block 100 and its associated block 200. You can solve this problem by re-creating the cluster with a different HASHKEYS value.
Oracle Database uses a hash function to generate a distribution of numeric values, called hash values, that are based on specific cluster key values. The key of a hash cluster, like the key of an index cluster, can be a single column or composite key (multiple column key). To find or store a row in a hash cluster, the database applies the hash function to the cluster key value of the row. The resulting hash value corresponds to a data block in the cluster, which the database then reads or writes on behalf of the issued statement.
oracle用hash函数生成hash值,hash值是基于特定的聚簇值的影响。数据库用hash value对应相应的数据块。
With an indexed table or indexed cluster, Oracle Database locates table rows using key values stored in a separate index. To find or store a row in an indexed table or table cluster, the database must perform at least two I/Os:
One or more I/Os to find or store the key value in the index
Another I/O to read or write the row in the table or table cluster
Hashing is an optional way of storing table data to improve the performance of data retrieval.
Hash clusters may be beneficial when the following conditions are met:
A table is queried much more often than modified.
The hash key column is queried frequently with equality conditions, for example, WHERE department_id=20. For such queries, the cluster key value is hashed. The hash key value points directly to the disk area that stores the rows.
You can reasonably guess the number of hash keys and the size of the data stored with each key value.
Hash Cluster Creation
Example 2-9 Hash ClusterCREATE CLUSTER employees_departments_cluster
(department_id NUMBER(4))
SIZE 8192 HASHKEYS 100;

Hash Cluster Storage
Oracle Database allocates space for a hash cluster differently from an indexed cluster. In Example 2-9, HASHKEYS specifies the number of departments likely to exist, whereas SIZE specifies the size of the data associated with each department. The database computes a storage space value based on the following formula:
HASHKEYS * SIZE / database_block_size
Thus, if the block size is 4096 bytes in Example 2-9, then the database allocates at least 200 blocks to the hash cluster.
Oracle Database does not limit the number of hash key values that you can insert into the cluster. For example, even though HASHKEYS is 100, nothing prevents you from inserting 200 unique departments in the departments table. However, the efficiency of the hash cluster retrieval diminishes when the number of hash values exceeds the number of hash keys.
To illustrate the retrieval issues, assume that block 100 is completely full with rows for department 20. A user inserts a new department with department_id 43 into the departments table. The number of departments exceeds the HASHKEYS value, so the database hashes department_id 43 to hash value 77, which is the same hash value used for department_id 20. Hashing multiple input values to the same output value is called a hash collision.
When users insert rows into the cluster for department 43, the database cannot store these rows in block 100, which is full. The database links block 100 to a new overflow block, say block 200, and stores the inserted rows in the new block. Both block 100 and 200 are now eligible to store data for either department. As shown in Figure 2-8, a query of either department 20 or 43 now requires two I/Os to retrieve the data: block 100 and its associated block 200. You can solve this problem by re-creating the cluster with a different HASHKEYS value.

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/26844646/viewspace-777263/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/26844646/viewspace-777263/