MultiColumn Statistics

本文介绍如何使用Oracle DBMS_STATS包创建多列统计组,并通过示例展示了如何收集和获取这些统计信息,从而帮助优化器更准确地评估查询选择性。

摘要生成于 C知道 ,由 DeepSeek-R1 满血版支持, 前往体验 >

   When the WHERE clause of a query specifies multiple columns from a single table (multiple single column predicates), the relationship between the columns can strongly affect the combined selectivity for the column group.For example, consider the customers table in the SH schema. The columns cust_state_province and country_id are related, with cust_state_province determining the country_id for each customer. Suppose you query the customers table where the cust_state_province is California:
SQL> SELECT COUNT(*)
  2  FROM   sh.customers
  3  WHERE  cust_state_province = 'CA';

  COUNT(*)
----------
      3341

SQL>  SELECT COUNT(*)
  2    FROM   sh.customers
  3  WHERE  cust_state_province = 'CA'
  4   AND    country_id=52790;

  COUNT(*)
----------
      3341
SQL>  SELECT COUNT(*)
  2  FROM   sh.customers
  3  WHERE  cust_state_province = 'CA'
  4   AND    country_id=52775;

  COUNT(*)
----------
         0

With individual column statistics, the optimizer has no way of knowing that the cust_state_province and the country_id columns are related. By gathering statistics on these columns as a group (column group), the optimizer has a more accurate selectivity value for the group, instead of having to generate the value based on the individual column statistics.

You can create column groups manually by using the DBMS_STATS package. You can use this package to create a column group, get the name of a column group, or delete a column group from a table.


1
Creating a Column Group

Use the create_extended_statistics function to create a column group. The create_extended_statistics function returns the system-generated name of the newly created column group. Table 1 lists the input parameters for this function.

Table 1 Parameters for the create_extended_statistics Function

Parameter Description

owner

Schema owner. NULL indicates current schema.

tab_name

Name of the table to which the column group is added.

extension

Columns in the column group.


For example, to add a column group consisting of the cust_state_province and country_id columns to the customers table in SH schema, run the following PL/SQL block:

SQL> DECLARE
  2    cg_name varchar2(30);
  3  BEGIN
  4    cg_name := dbms_stats.create_extended_stats('sh','customers',
  5               '(CUST_CITY,cust_state_province,country_id)');
  6  END;
  7  /

PL/SQL procedure successfully completed.

2 Getting a Column Group

Use the show_extended_stats_name function to obtain the name of the column group for a given set of columns. Table 2 lists the input parameters for this function.

Table 2 Parameters for the show_extended_stats_name Function

Parameter Description

owner

Schema owner. NULL indicates current schema.

tab_name

Name of the table to which the column group belongs.

extension

Name of the column group.


For example, use the following query to obtain the column group name for a set of columns on the customers table:

SQL> select sys.dbms_stats.show_extended_stats_name('sh','customers','(CUST_CITY,CUST_STATE_PROVINCE,COUNTRY_ID)') cg_name  from dual;

CG_NAME
--------------------------------------------------------------------------------
SYS_STUMZ$C3AIHLPBROI#SKA58H_N


3 Monitoring Column Groups

Use the dictionary table user_stat_extensions to obtain information about MultiColumn statistics:

SQL>  Select extension_name, extension  from dba_stat_extensions  where table_name='CUSTOMERS';

EXTENSION_NAME                 EXTENSION
------------------------------ --------------------------------------------------------------------------------
SYS_NC00028$                   (UPPER("CUST_LAST_NAME"))
SYS_NC00029$                   (UPPER("CUST_FIRST_NAME"))
SYS_STUMZ$C3AIHLPBROI#SKA58H_N ("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID")

Use the following query to find the number of distinct values and find whether a histogram has been created for a column group:
SQL> select e.extension col_group, t.num_distinct, t.histogram
  2     from dba_stat_extensions e, dba_tab_col_statistics t
  3     where e.extension_name=t.column_name
  4     and e.table_name=t.table_name
  5     and t.table_name='CUSTOMERS';

COL_GROUP                                                                        NUM_DISTINCT HISTOGRAM
-------------------------------------------------------------------------------- ------------ ---------------
("CUST_CITY","CUST_STATE_PROVINCE","COUNTRY_ID")                                          620 HEIGHT BALANCED
(UPPER("CUST_FIRST_NAME"))                                                                170 NONE
(UPPER("CUST_LAST_NAME"))                                                                 176 NONE


4 Gathering Statistics on Column Groups

The METHOD_OPT argument of the DBMS_STATS package enables you to gather statistics on column groups. If you set the value of this argument to FOR ALL COLUMNS SIZE AUTO, then the optimizer gathers statistics on all existing column groups. To collect statistics on a new column group, specify the group using FOR COLUMNS. The column group is automatically created as part of statistic gathering.

For example, the following statement creates a new column group for the customers table on the columns cust_state_province, country_id and gathers statistics (including histograms) on the entire table and the new column group:

SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','CUSTOMERS',METHOD_OPT =>'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS (CUST_CITY,CUST_STATE_PROVINCE,COUNTRY_ID) SIZE SKEWONLY');

PL/SQL procedure successfully completed.






来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/13750068/viewspace-2145015/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/13750068/viewspace-2145015/

内容概要:本文详细探讨了机组组合优化模型的构建,旨在通过合理安排各类发电机组的启停计划和优化出力分配,实现电力系统在经济性和稳定性上的最佳平衡。文章首先介绍了电力系统的四大主要组件——传统火电机组、风电机组、光伏机组和储能系统的参数及运行特性。接着,围绕最小化系统总运行成本这一目标,设计了优化目标函数,并明确了包括功率平衡约束、机组出力上下限约束、风光发电功率约束、弃风弃光约束、爬坡速率约束、储能系统荷电状态约束、充放电功率约束和充放电互斥约束在内的多项约束条件。最后,文章列出了求解机组组合优化模型所需的关键变量,如传统机组的开停状态、机组出力、启停成本、风电光伏实际出力、弃风弃光比例及储能系统的充放电功率和荷电状态,以实现系统的经济调度和可再生能源的最大化利用。 适合人群:从事电力系统研究、规划和调度工作的工程师和技术人员,以及对电力系统优化感兴趣的科研人员。 使用场景及目标:①帮助电力系统工程师理解不同类型发电机组的特点及其对系统稳定性、经济性和环保性的影响;②为制定合理的电力系统调度策略提供理论依据和技术支持;③促进可再生能源的有效整合,提高电力系统的灵活性和可靠性。 其他说明:本文提供的模型和方法不仅适用于当前的电力系统,也可为未来含高比例可再生能源接入的电力系统提供参考。文中涉及的具体数学公式和参数设定为实际应用提供了详细的指导,有助于提升电力系统的运行效率和经济效益。
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值