200410 Introduction to databases (lecture 11)

  • 课件:lecture 11 Query Processing and Optimization
  • Reference material Database System Concepts, Seventh Edition - Chapter 16

在这里插入图片描述

Query Compilation

  • a) Parsing. A parse tree for the query is constructed.
  • b) Query Rewrite. The parse tree is converted to an initial query plan, which is usually an algebraic representation of the query. This initial plan is then transformed into an equivalent plan that is expected to require less time to execute.
  • c) Physical Plan Generation. The abstract query plan from b), often called a logical query plan, is turned into a physical query plan by selecting algorithms to implement each of the operators of the logical plan, and by selecting an order of execution for these operators. The physical plan, like the result of parsing and the logical plan, is represented by an expression tree. The physical plan also includes details such as how the queried relations are accessed, and when and if a relation should be sorted.

在这里插入图片描述

Grammar/Syntax Diagrams

在这里插入图片描述

BNF

描述编程语言的BNF

  • Grammar rules define valid SQL statements.
  • Query parser
    -Validates syntax.
    -Produces a logical parse tree/plan.
  • The optimizer and execution engine execute the plan.

SQL Parse Tree

在这里插入图片描述

Simple Parsing Example

在这里插入图片描述

Query Plan/Logical Plan

Simple Execution/Plan Tree

在这里插入图片描述

Relational Operations

在这里插入图片描述

Logical Plan

在这里插入图片描述

Optimization/Execution Overview

  • Two main issues in query optimization:
    -For a given query, what plans are considered?
    [Algorithm to search plan space for cheapest (estimated) plan.]
    -How is the cost of a plan estimated?
  • Ideally: Want to find best plan. Practically: Avoid worst plans!

Three Core Optimization Techniques

在这里插入图片描述

Query Execution Cost

在这里插入图片描述

Statistics and Catalogs

  • Need information about the relations and indexes involved. Catalogs typically contain at least:
    -# tuples (NTuples) and # pages (NPages) for each relation.
    -#distinct key values (NKeys) and NPages for each index.
    -Index height, low/high key values (Low/High) for each tree index.
  • Catalogs updated periodically.
    Updating whenever data changes is too expensive; lots of approximation anyway, so slight inconsistency ok.
  • More detailed information (e.g., histograms of the values in some field) are sometimes stored.

MySQL Example

在这里插入图片描述

The Catalog Contains

  • For each table
    -Table name, storage information.
    -Attribute name and type for each column.
    -Index name, type and indexed attributes
    -Constraints
  • Statistical information
    -Cardinality of each table
    -Size: No. of blocks.
    -Index cardinality: Number of distinct key values for each index
    -Index size: Number of blocks for each index.
    -Index tree height
    -Index ranges

Access Path

在这里插入图片描述

  • An access path is a method of retrieving tuples:
    File scan, or index that matches a selection (in the query)
  • A tree index matches (a conjunction of) terms that involve only attributes in a prefix of the search key.
    E.g., Tree index on <a, b, c> matches the selection a=5 AND b=3, and a=5 AND b>6, but not b=3.
  • A hash index matches (a conjunction of) terms that has a term attribute = value for every attribute in the search key of the index.
    E.g., Hash index on <a, b, c> matches a=5 AND b=3 AND c=5; but it does not match b=3, or a=5 AND b=3, or a>5 AND b=3 AND c=5.

Query Rewrite

Join : Index Nested Loops

在这里插入图片描述

Join : Sort - Merge

在这里插入图片描述

Example

在这里插入图片描述

Materialization

在这里插入图片描述

Pipelining

在这里插入图片描述


  • 课件:lecture 11 transactions and Recover
  • Reference material Database System Concepts, Seventh Edition - Chapter 17

Transactions and Recover

Core Transaction Concept - ACID Properties

在这里插入图片描述

Atomic

A transaction is a logical unit of work that must be either entirely completed or entirely undone. (All writes happen or none of them happen)

Simplistic Approach

在这里插入图片描述
在这里插入图片描述

DBMS ACID Implementation

在这里插入图片描述

Write Ahead Logging

在这里插入图片描述

在这里插入图片描述

NOTE:

  • force : force the block to the disk
  • steal : If I need memory, I will write an uncommitted update to the disk because I need to use this for some other transaction
  • No force requires you to do REDO processes during recovery; Steal requires you to do UNDO processes during recovery
  • summary : I’m going to REDO all the updates that committed, but we’re not written to disk and I’m going to UNDO all the updates that were written to disk but not committed.
ARIES recovery involves three passes
  • Analysis pass:
    -Determine which transactions to undo
    -Determine which pages were dirty (disk version not up to date) at time of
    -RedoLSN: LSN from which redo should start
  • Redo pass:
    -Repeats history, redoing all actions from RedoLSN(updated committed but not written changes to pages)
    -RecLSN and PageLSNs are used to avoid redoing actions already reflected on page
  • Undo pass:
    -Rolls back all incomplete transactions (with uncommitted pages written to disk).
    -Transactions whose abort was complete earlier are not undone

Durability

在这里插入图片描述

Availability and Replication

在这里插入图片描述

NOTE:

  • active/passive
    No transactions are going to region #2. All transactions run region #1, but when transactions commite on site 1, you ship the changes over the network.

  • active/active
    Both of them are processing transactions at the same time. Ship it’s logs and changes to each other.

Isolation

Locking and Concurrency

在这里插入图片描述

Locking

在这里插入图片描述

Schedule

在这里插入图片描述

Serializable

在这里插入图片描述

Lock-Based Concurrency Control

在这里插入图片描述

Aborting a Transaction

在这里插入图片描述

MySQL (Locking) Isolation

在这里插入图片描述

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值