Index Scans

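This article describes in detail the types of index scan operations in Oracle Database, including the basic principle of index scans, full index scans, fast full index scans, index range scans, index unique scans, and index skip scans. It also discusses the index clustering factor and its effect on query performance.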

In an index scan, the database retrieves a row by traversing the index, using the indexed column values specified by the statement. If the database scans the index for a value, then it will find this value in n I/Os where n is the height of the B-tree index. This is the basic principle behind Oracle Database indexes.

If a SQL statement accesses only indexed columns, then the database reads values directly from the index rather than from the table. If the statement accesses columns in addition to the indexed columns, then the database uses rowids to find the rows in the table. Typically, the database retrieves table data by alternately reading an index block and then a table block.
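The two access paths described above can be sketched in a few lines of Python. This is only a toy model, not how Oracle Database is implemented: the index is a sorted list of (key, rowid) pairs, the table is a dictionary keyed by rowid, and the names `table`, `name_index`, and `lookup` as well as the rowid strings are made up for illustration.

```python
from bisect import bisect_left

# Toy table: rowid -> row. Names and salaries are taken from the sample
# entries used later in this article; the rowid strings are hypothetical.
table = {
    "r1": {"last_name": "Abel", "department_id": 80, "salary": 11000},
    "r2": {"last_name": "Baer", "department_id": 70, "salary": 10000},
    "r3": {"last_name": "Atkinson", "department_id": 50, "salary": 2800},
}

# Toy index on last_name: (key, rowid) pairs kept in key order.
name_index = sorted((row["last_name"], rowid) for rowid, row in table.items())


def lookup(last_name, columns):
    """Return the requested columns for every row matching last_name."""
    results = []
    # Binary search stands in for the root-to-leaf traversal of the B-tree.
    pos = bisect_left(name_index, (last_name, ""))
    while pos < len(name_index) and name_index[pos][0] == last_name:
        key, rowid = name_index[pos]
        if set(columns) <= {"last_name"}:
            # Only indexed columns requested: answer from the index alone.
            results.append({"last_name": key})
        else:
            # Other columns requested: follow the rowid into the table.
            row = table[rowid]
            results.append({c: row[c] for c in columns})
        pos += 1
    return results


index_only = lookup("Abel", ["last_name"])
with_table = lookup("Abel", ["last_name", "salary"])
```

In the first call the table dictionary is never touched; in the second call each matching index entry triggers one table access by rowid.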

Full Index Scan
In a full index scan, the database reads the entire index in order. A full index scan is available if a predicate (WHERE clause) in the SQL statement references a column in the index, and in some circumstances when no predicate is specified. A full scan can eliminate sorting because the data is ordered by index key.
Suppose that an application runs the following query:

SELECT department_id, last_name, salary
FROM employees
WHERE salary > 5000
ORDER BY department_id, last_name;



Also assume that department_id, last_name, and salary are a composite key in an index. Oracle Database performs a full scan of the index, reading it in sorted order (ordered by department ID and last name) and filtering on the salary attribute. In this way, the database scans a set of data smaller than the employees table, which contains more columns than are included in the query, and avoids sorting the data.
For example, the full scan could read the index entries as follows:

50,Atkinson,2800,rowid
60,Austin,4800,rowid
70,Baer,10000,rowid

80,Abel,11000,rowid
80,Ande,6400,rowid
110,Austin,7200,rowid
.
.
.
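As a rough illustration of why no sort step is needed, the following Python sketch walks the sample entries above in key order and filters on salary. The list `index_entries` and the function `full_index_scan` are hypothetical names, and the model ignores blocks and I/O entirely.

```python
# Composite index on (department_id, last_name, salary), stored in key order.
# The values are the sample entries shown above; "rowid" is a placeholder.
index_entries = sorted([
    (50, "Atkinson", 2800, "rowid"),
    (60, "Austin", 4800, "rowid"),
    (70, "Baer", 10000, "rowid"),
    (80, "Abel", 11000, "rowid"),
    (80, "Ande", 6400, "rowid"),
    (110, "Austin", 7200, "rowid"),
])


def full_index_scan(entries, min_salary):
    """Read every entry in key order and keep those with salary > min_salary."""
    return [(dept, name, sal) for dept, name, sal, _rowid in entries
            if sal > min_salary]


result = full_index_scan(index_entries, 5000)
```

Because the entries are read in index order, `result` already satisfies ORDER BY department_id, last_name without a separate sort.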


Fast Full Index Scan
A fast full index scan is a full index scan in which the database accesses the data in the index itself without accessing the table, and the database reads the index blocks in no particular order.
Fast full index scans are an alternative to a full table scan when both of the following conditions are met:
■The index must contain all columns needed for the query.
■A row containing all nulls must not appear in the query result set. For this result to be guaranteed, at least one column in the index must have either:
–A NOT NULL constraint
–A predicate applied to it that prevents nulls from being considered in the query result set
For example, an application issues the following query, which does not include an ORDER BY clause:

SELECT last_name, salary
FROM employees;



The last_name column has a NOT NULL constraint. If last_name and salary are a composite key in an index, then a fast full index scan can read the index entries to obtain the requested information:
Baida,2900,rowid
Zlotkey,10500,rowid
Austin,7200,rowid
Baer,10000,rowid
Atkinson,2800,rowid
Austin,4800,rowid
.
.
.
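One way to picture the unordered result is to model the index as leaf blocks that are read in their physical order on disk rather than in key order. The Python sketch below makes that assumption; the block layout in `physical_blocks` is invented so that the output matches the listing above, and it is not a description of Oracle's storage format.

```python
# Leaf blocks of a (last_name, salary) index in an assumed physical order.
# Each block is sorted internally, but the blocks are not stored in key order.
physical_blocks = [
    [("Baida", 2900), ("Zlotkey", 10500)],
    [("Austin", 7200), ("Baer", 10000)],
    [("Atkinson", 2800), ("Austin", 4800)],
]


def fast_full_index_scan(blocks):
    """Read the blocks as they are laid out, ignoring key order."""
    rows = []
    for block in blocks:
        rows.extend(block)
    return rows


unordered = fast_full_index_scan(physical_blocks)
# If the query had an ORDER BY clause, a separate sort would be required.
ordered = sorted(unordered)
```

This is the trade-off against a full index scan: the blocks can be read in whatever order is cheapest, but the result comes back unsorted.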


Index Range Scan
An index range scan is an ordered scan of an index that has the following characteristics:
■One or more leading columns of an index are specified in conditions. A condition specifies a combination of one or more expressions and logical (Boolean) operators and returns a value of TRUE, FALSE, or UNKNOWN.
■0, 1, or more values are possible for an index key.
The database commonly uses an index range scan to access selective data. The selectivity is the percentage of rows in the table that the query selects, with 0 meaning no rows and 1 meaning all rows. Selectivity is tied to a query predicate, such as WHERE last_name LIKE 'A%', or a combination of predicates. A predicate becomes more selective as the value approaches 0 and less selective (or more unselective) as the value approaches 1.
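The definition of selectivity amounts to a simple ratio, which the short Python sketch below computes for two predicates. The list `last_names` is a hypothetical ten-row sample built from names that appear in this article, and `selectivity` is an illustrative helper, not an Oracle function.

```python
def selectivity(rows, predicate):
    """Fraction of rows selected by predicate: 0 means none, 1 means all."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if predicate(row)) / len(rows)


# Hypothetical sample of last_name values.
last_names = ["Abel", "Ande", "Atkinson", "Austin", "Austin",
              "Baer", "King", "Kochar", "Kumar", "Zlotkey"]

# WHERE last_name LIKE 'A%' selects 5 of 10 rows.
like_a = selectivity(last_names, lambda name: name.startswith("A"))
# WHERE last_name = 'King' selects 1 of 10 rows, so it is more selective.
eq_king = selectivity(last_names, lambda name: name == "King")
```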
For example, a user queries employees whose last names begin with A. Assume that the last_name column is indexed, with entries as follows:


Abel,rowid
Ande,rowid
Atkinson,rowid
Austin,rowid
Austin,rowid
Baer,rowid
.
.
.
The database could use a range scan because the last_name column is specified in the predicate and multiple rowids are possible for each index key. For example, two employees are named Austin, so two rowids are associated with the key Austin.
An index range scan can be bounded on both sides, as in a query for departments with IDs between 10 and 40, or bounded on only one side, as in a query for IDs over 40. To scan the index, the database moves backward or forward through the leaf blocks. For example, a scan for IDs between 10 and 40 locates the first index leaf block that contains the lowest key value that is 10 or greater. The scan then proceeds horizontally through the linked list of leaf nodes until it locates a value greater than 40.
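The bounded and one-sided scans described above can be sketched as follows. This is a simplified model under stated assumptions: the index is a flat sorted Python list instead of linked leaf blocks, the department IDs 10 through 110 and the rowid strings are invented, and `index_range_scan` is a hypothetical helper.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index on department_id: (key, rowid) pairs in key order.
dept_index = [(d, f"rowid{d}") for d in range(10, 120, 10)]


def index_range_scan(index, low=None, high=None, low_inclusive=True):
    """Return entries with low <= key <= high (or key > low if not inclusive)."""
    keys = [key for key, _rowid in index]
    # Locate the starting entry, as the database locates the first leaf block.
    if low is None:
        pos = 0
    elif low_inclusive:
        pos = bisect_left(keys, low)
    else:
        pos = bisect_right(keys, low)
    matches = []
    # Move forward through the entries until a key exceeds the upper bound.
    while pos < len(index):
        key, rowid = index[pos]
        if high is not None and key > high:
            break
        matches.append((key, rowid))
        pos += 1
    return matches


between_10_and_40 = index_range_scan(dept_index, low=10, high=40)
over_40 = index_range_scan(dept_index, low=40, low_inclusive=False)
```

The scan stops at the first key greater than 40 rather than reading the rest of the index, which is what makes a range scan cheap for selective predicates.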

Index Unique Scan
In contrast to an index range scan, an index unique scan must have either 0 or 1 rowid associated with an index key. The database performs a unique scan when a predicate references all of the columns in a UNIQUE index key using an equality operator. An index unique scan stops processing as soon as it finds the first record because no second record is possible.
As an illustration, suppose that a user runs the following query:


SELECT *
FROM employees
WHERE employee_id = 5;




Assume that the employee_id column is the primary key and is indexed with entries as follows:
1,rowid
2,rowid
4,rowid
5,rowid
6,rowid
.
.
.

In this case, the database can use an index unique scan to locate the rowid for the employee whose ID is 5.
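A minimal Python sketch of the same lookup, assuming the index is a sorted list of (key, rowid) pairs, is shown below. The keys come from the entries above; the rowid strings and the names `pk_index` and `index_unique_scan` are hypothetical.

```python
from bisect import bisect_left

# Index on employee_id with the keys listed above; rowids are placeholders.
pk_index = [(1, "rowid1"), (2, "rowid2"), (4, "rowid4"),
            (5, "rowid5"), (6, "rowid6")]


def index_unique_scan(index, key):
    """Return the single rowid for key, or None if the key is absent."""
    keys = [k for k, _rowid in index]
    pos = bisect_left(keys, key)
    if pos < len(index) and index[pos][0] == key:
        # Uniqueness guarantees no second match, so stop at the first hit.
        return index[pos][1]
    return None


found = index_unique_scan(pk_index, 5)
missing = index_unique_scan(pk_index, 3)
```

Unlike the range scan, there is no loop over neighboring entries: the result is one rowid or nothing.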

Index Skip Scan
An index skip scan uses logical subindexes of a composite index. The database "skips" through a single index as if it were searching separate indexes. Skip scanning is beneficial if there are few distinct values in the leading column of a composite index and many distinct values in the nonleading key of the index.
The database may choose an index skip scan when the leading column of the composite index is not specified in a query predicate. For example, assume that you run the following query for a customer in the sh.customers table:

SELECT * FROM sh.customers WHERE cust_email = 'Abbey@company.com';



The customers table has a column cust_gender whose values are either M or F. Assume that a composite index exists on the columns (cust_gender, cust_email). Example 3–1 shows a portion of the index entries.

The database can use a skip scan of this index even though cust_gender is not specified in the WHERE clause.
In a skip scan, the number of logical subindexes is determined by the number of distinct values in the leading column. In Example 3–1, the leading column has two possible values. The database logically splits the index into one subindex with the key F and a second subindex with the key M.
When searching for the record for the customer whose email is Abbey@company.com, the database searches the subindex with the value F first and then searches the subindex with the value M. Conceptually, the database processes the query as follows:


SELECT * FROM sh.customers WHERE cust_gender = 'F'
AND cust_email = 'Abbey@company.com'
UNION ALL
SELECT * FROM sh.customers WHERE cust_gender = 'M'
AND cust_email = 'Abbey@company.com';
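The per-subindex probing can be sketched in Python as one search for each distinct leading value. The index entries below are made-up sample data (only Abbey@company.com comes from the text), and `index_skip_scan` is an illustrative helper rather than anything Oracle exposes.

```python
from bisect import bisect_left

# Hypothetical composite index on (cust_gender, cust_email), in key order.
gender_email_index = sorted([
    ("F", "Wolf@company.com", "rowid_f1"),
    ("F", "Wood@company.com", "rowid_f2"),
    ("M", "Abbassi@company.com", "rowid_m1"),
    ("M", "Abbey@company.com", "rowid_m2"),
])


def index_skip_scan(index, email):
    """Probe each logical subindex (one per leading value) for email."""
    rowids = []
    # Sketch only: a real database finds the distinct leading values by
    # navigating the B-tree, not by reading every entry as done here.
    leading_values = sorted({gender for gender, _email, _rowid in index})
    for gender in leading_values:
        # Equivalent to: WHERE cust_gender = gender AND cust_email = email
        pos = bisect_left(index, (gender, email, ""))
        while pos < len(index) and index[pos][:2] == (gender, email):
            rowids.append(index[pos][2])
            pos += 1
    return rowids


abbey_rowids = index_skip_scan(gender_email_index, "Abbey@company.com")
```

With two distinct leading values the scan performs two probes, which is why skip scanning pays off only when the leading column has few distinct values.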



Index Clustering Factor
The index clustering factor measures row order in relation to an indexed value such as employee last name. The more order that exists in row storage for this value, the lower the clustering factor.
The clustering factor is useful as a rough measure of the number of I/Os required to read an entire table by means of an index:
■If the clustering factor is high, then Oracle Database performs a relatively high number of I/Os during a large index range scan. The index entries point to random table blocks, so the database may have to read and reread the same blocks over and over again to retrieve the data pointed to by the index.
■If the clustering factor is low, then Oracle Database performs a relatively low number of I/Os during a large index range scan. The index keys in a range tend to point to the same data block, so the database does not have to read and reread the same blocks over and over.
The clustering factor is relevant for index scans because it can show:
■Whether the database will use an index for large range scans
■The degree of table organization in relation to the index key
■Whether you should consider using an index-organized table, partitioning, or table cluster if rows must be ordered by the index key

For example, assume that the employees table fits into two data blocks. Table 3–1 depicts the rows in the two data blocks (the ellipses indicate data that is not shown).


Rows are stored in the blocks in order of last name (shown in bold). For example, the bottom row in data block 1 describes Abel, the next row up describes Ande, and so on alphabetically until the top row in block 1 for Steven King. The bottom row in block 2 describes Kochar, the next row up describes Kumar, and so on alphabetically until the last row in the block for Zlotkey.
Assume that an index exists on the last name column. Each name entry corresponds to a rowid. Conceptually, the index entries would look as follows:
Abel,block1row1
Ande,block1row2
Atkinson,block1row3
Austin,block1row4
Baer,block1row5
.
.
.
Assume that a separate index exists on the employee ID column. Conceptually, the index entries might look as follows, with employee IDs distributed in almost random locations throughout the two blocks:

100,block1row50
101,block2row1
102,block1row9
103,block2row19
104,block2row39
105,block1row4
.
.
.
Example 3–2 queries the ALL_INDEXES view for the clustering factor for these two indexes. The clustering factor for EMP_NAME_IX is low, which means that adjacent index entries in a single leaf block tend to point to rows in the same data blocks. The clustering factor for EMP_EMP_ID_PK is high, which means that adjacent index entries in the same leaf block are much less likely to point to rows in the same data blocks.


Example 3–2 Clustering Factor
SQL> SELECT INDEX_NAME, CLUSTERING_FACTOR
     FROM ALL_INDEXES
     WHERE INDEX_NAME IN ('EMP_NAME_IX','EMP_EMP_ID_PK');

INDEX_NAME           CLUSTERING_FACTOR
-------------------- -----------------
EMP_EMP_ID_PK                       19
EMP_NAME_IX                          2
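A commonly described way to arrive at this statistic is to walk the index in key order and count how often the data block changes from one entry to the next. The Python sketch below follows that description as an assumption, not as Oracle's exact algorithm. The block assignments are taken from the conceptual entries above where the text gives them and are otherwise hypothetical, so the results will not reproduce the value 19 shown in the example output.

```python
def clustering_factor(index_entries):
    """Count block changes while reading (key, block) entries in key order."""
    factor = 0
    previous_block = None
    for _key, block in index_entries:
        if block != previous_block:
            factor += 1
            previous_block = block
    return factor


# Index on last name: rows are stored in name order, so the entries point
# to block 1 first and then to block 2.
name_index = [
    ("Abel", 1), ("Ande", 1), ("Atkinson", 1), ("Austin", 1), ("Baer", 1),
    ("King", 1), ("Kochar", 2), ("Kumar", 2), ("Zlotkey", 2),
]

# Index on employee ID: consecutive keys point to alternating blocks.
id_index = [(100, 1), (101, 2), (102, 1), (103, 2), (104, 2), (105, 1)]

name_cf = clustering_factor(name_index)
id_cf = clustering_factor(id_index)
```

For the name index the count equals the number of table blocks, which is the best case. For the ID index it approaches the number of rows, which is the worst case.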



