How to measure Index Selectivity

本文详细解释了数据库索引的选择性概念,包括如何计算索引选择性及其对查询性能的影响。通过实例展示了高选择性和低选择性的索引,并介绍了如何手动和自动测量索引选择性。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

 

 

 

 

下面的文章不错,大概总结一下所谓索引的selectivity 实际就是 索引列中非重复值/列中所有的记录 =selectivity

这个值范围是在 0~1,选择性越高那么意味着优化器会优先选择索引方式作为访问路径而生成执行计划

一个good selectivity

  100'000条记录,其中索引列中有88000的非重复值 那么selectivity = 88'000/100'000=0.88 高选择性,返回很少rowid

一个bad selectivity

  100'000条记录,其中索引列中有500个非重复值那么选择性是 500/100'000=0.055 低选择性

  这就意味着 100'000/500 = 200 也就是一条索引列值要检索出200条记录,低选择性会返回很多rowid,不如用全表扫描 了

 

Zurück

How to measure Index Selectivity

Index Selectivity

B*TREE Indexes improve the performance of queries that select a small percentage of rows from a table. As a general guideline, we should create indexes on tables that are often queried for less than 15% of the table's rows. This value may be higher in situations where all data can be retrieved from an index, or where the indexed columns can be used for joining to other tables.

The ratio of the number of distinct values in the indexed column / columns to the number of records in the table represents the selectivity of an index. The ideal selectivity is 1. Such a selectivity can be reached only by unique indexes on NOT NULL columns.

Example with good Selectivity

A table having 100'000 records and one of its indexed column has 88000 distinct values, then the selectivity of this index is 88'000 / 10'0000 = 0.88.

Oracle implicitly creates indexes on the columns of all unique and primary keys that you define with integrity constraints. These indexes are the most selective and the most effective in optimizing performance. The selectivity of an index is the percentage of rows in a table having the same value for the indexed column. An index's selectivity is good if few rows have the same value.

Example with bad Selectivity

lf an index on a table of 100'000 records had only 500 distinct values, then the index's selectivity is 500 / 100'000 = 0.005 and in this case a query which uses the limitation of such an index will retum 100'000 / 500 = 200 records for each distinct value. It is evident that a full table scan is more efficient as using such an index where much more I/O is needed to scan repeatedly the index and the table.

How to Measure Index Selectivity ?

Manually measure index selectivity

The ratio of the number of distinct values to the total number of rows is the selectivity of the columns. This method is useful to estimate the selectivity of an index before creating it.

select count (distinct job) "Distinct Values" from emp;

Distinct Values
---------------
              5

select count(*) "Total Number Rows" from emp;

Total Number Rows
-----------------
               14

Selectivity = Distinct Values / Total Number Rows
            = 5 / 14
            = 0.35

Automatically measure index selectivity

We can determine the selectivity of an index by dividing the number of distinct indexed values by the number of rows in the table.

create index idx_emp_job on emp(job);
analyze table emp compute statistics;

select distinct_keys from user_indexes
where table_name = 'EMP'

   and index_name = 'IDX_EMP_JOB';

DISTINCT_KEYS
-------------
            5

select num_rows from user_tables
where table_name = 'EMP';

NUM_ROWS
---------
       14

Selectivity = DISTINCT_KEYS / NUM_ROWS = 0.35

Selectivity of each individual Column

Assuming that the table has been analyzed it is also possible to query USER_TAB_COLUMNS to investigate the selectivity of each column individually.

select column_name, num_distinct
from user_tab_columns
where table_name = 'EMP';

COLUMN_NAME                     NUM_DISTINCT
------------------------------ ------------
EMPNO                                     14
ENAME                                     14
JOB                                        5
MGR                                        2
HIREDATE                                  13
SAL                                       12
COMM                                       4
DEPTNO                                     3

How to choose Composite Indexes ?

A composite index contains more than one key column. Composite indexes can provide additional advantages over single column indexes.

Better Selectivity Sometimes two or more columns, each with poor selectivity, can be combined to form a composite index with good selectivity.
Adding Data Storage If all the columns selected by the query are in the composite index, Oracle can return these values from the index without accessing the table. However in this case, it's better to use an IOT (Index Only Table).

An SQL statement can use an access path involving a composite index if the statement contains constructs that use a leading portion of the index. A leading portion of an index is a set of one or more columns that were specified first and consecutively in the list of columns in the CREATE INDEX statement that created the index. Consider this CREATE INDEX statement:

CREATE INDEX idx_composite ON mytab (x, y, z);

These combinations of columns are leading portions of the index: X, XY, and XYZ. These combinations of columns are not leading portions of the index: YZ and Z.

Guidelines for choosing columns for composite indexes

Consider creating a composite index on columns that are frequently used together in WHERE clause conditions combined with AND operators, especially if their combined selectivity is better than the selectivity of either column individually. Consider indexing columns that are used frequently to join tables in SQL statements.

To test the measure power consumption feature of the PlugSim class, you can create an instance of the class and call the measurePower() method multiple times with different inputs. You can also check the power field of the instance to see if it has been updated correctly after calling the measurePower() method. For example, you can create a test case that turns on the plug, calls the measurePower() method multiple times, and checks if the power field is within a reasonable range. You can also create a test case that turns off the plug, calls the measurePower() method, and checks if the power field is zero. Here's an example test case using JUnit 5 framework: ``` import org.junit.jupiter.api.Test; import static org.junit.jupiter.api.Assertions.*; public class PlugSimTest { @Test public void testMeasurePower() { PlugSim plug = new PlugSim("Test Plug"); plug.switchOn(); for (int i = 0; i < 10; i++) { plug.measurePower(); double power = plug.getPower(); assertTrue(power >= 0 && power <= 400, "Power reading out of range: " + power); } } @Test public void testMeasurePowerWhenOff() { PlugSim plug = new PlugSim("Test Plug"); plug.switchOff(); plug.measurePower(); double power = plug.getPower(); assertEquals(0, power, "Power reading should be zero when plug is off"); } } ``` In the first test case, we create a new PlugSim instance, turn it on, and call the measurePower() method 10 times. We check if the power reading is within the range of 0 to 400 watts, which is a reasonable range for a typical household appliance. In the second test case, we turn off the plug, call the measurePower() method, and check if the power reading is zero.
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值