database design的标准化

数学概念在关系模型中的应用:从数据冗余到规范化
本文介绍了关系模型中的基本数学概念,如集合、交叉乘积、关系、函数依赖等,讨论了如何通过规范化减少数据冗余,以及1NF、2NF和3NF的转换过程,以确保数据的一致性和完整性。

Mathematical Notions Behind Relational Model • Set – a collection of objects characterized by some defining property — e.g., a column in a database table such as last names of all staff • Cross Product of sets – one of the operations (×) on sets — e.g., consider two sets, set of all first names and set of all last names in the staff table — fName = {Mary, David} — lName = {Howe, Ford} — fName × lName = {(Mary,Howe), (Mary,Ford), (David, Howe), (David, Ford)} • Relation – defined between two sets and is a subset of cross product between those two sets — e.g., FirstNameOf = {(Mary, Howe), (David, Ford)}关系模型有关概念 cross product互积从参与的集合中各取一个元素组成集合,将所有集合整合为大集合;关系就是大集合的子集

• The name ‘relational model’ comes from this mathematical notion of relation — Where a relation is a set (collection) of tuples that have related objects such as first name and last name of the same person — e.g., (fName, lName) is a relation • We can have relations over any number of sets — e.g., (staffNo, fName, lName, position) • In general we can denote a relation as (A,B,C,D,...,Z) where A, B, C and Z are all its attribute sets关系时具有相关对象的元组集合 其中的对象称为关系的属性集

• A function is a special kind of relation • In a relation (X, Y), if every value of X is associated with exactly one value of Y, then we say Y is a function of X — e.g., the relation (1, 2), (2, 4), (3, 6), (4, 8) is a function, Y = 2*X for 0 函数就是指一个x值只对应一个y

Functional Dependency • If Y is a function of X — Y is dependent on X, — there is a relationship of functional dependency between Y and X • In databases, we work with relations in general form (A, B, C, D,... , Z) • Functional Dependency — Describes relationship between attributes in a relation — If A and B are attributes of relation R, B is functionally dependent on A, if each value of A in R is associated with exactly one value of B in R • We are interested in finding such functional dependencies among database relations 如果y是x的函数 则可以说y依赖x 二者存在函数依赖关系  在关系R中,A和B是R的属性且A的每个值都对应B的一个值,则可以说B函数依赖于A

• Is a property of the meaning (or semantics) of the attributes in a relation • Diagrammatic representation: • Determinant of a functional dependency refers to attribute or group of attributes on left-hand side of the arrow • If the determinant can maintain the functional dependency with a minimum number of attributes, then we call it full functional dependency 函数依赖是关系中属性意义的一种属性 图标左侧为被依赖属性 当以最少的属性数维持函数依赖关系 称之为完全依赖关系66d27aff033a4b6093c82929b615beba.png

Data Redundancy • Major aim of relational database design is — to group attributes into relations to minimize data redundancy and — to reduce file storage space required by base relations • Data redundancy is undesirable because of the following anomalies — ‘Insert’ anomalies — ‘Delete’ anomalies — ‘Update’ anomalies • We illustrate these anomalies with an example 数据冗余 常存在于以下情况 插入异常 删除异常 更新异常69f52138079245169bb95100ef60a3b9.png

• Insert anomalies — Try to insert details for a new member of staff into StaffBranch — You also need to insert branch details that are consistent with existing details for the same branch — Hard to maintain data consistency with StaffBranch • Delete anomalies — Try to delete details for a member of staff from StaffBranch — You also loose branch details in that tuple (row) • Update anomalies — Try to update the value of one of the attributes of a branch — You also need to update that information in all the tuples about the same branch 插入异常 保持分支数据一致性;删除异常 丢失该图分支详细信息‘;更新异常 更新同一分支所有图信息

Decomposition of Relations • Staff and Branch relations which are obtained by decomposing StaffBranch do not suffer from these anomalies • Two important properties of decomposition — Lossless-join property enables us to find any instance of original relation from corresponding instances in the smaller relations — Dependency preservation property enables us to enforce a constraint on original relation by enforcing some constraint on each of the smaller relations 分解关系没有上述的反常现象 且分解具有两个属性 无损链接属性可以从小关系的相应实例找到原始关系的所有实例;依赖保留属性可以通过小关系中强制执行某种约束进而执行原始关系的约束

The Process of Normalization • Formal technique for analyzing a relation based on its primary key and functional dependencies between its attributes • Often executed as a series of steps. Each step corresponds to a specific normal form, which has known properties • As normalization proceeds, relations become progressively more restricted (stronger) in format and also less vulnerable to update anomalies • Given a relation, use the following cycle — Find out what normal form it is in — Transform the relation to the next higher form by decomposing it to form simpler relations — You may need to refine the relation further if decomposition resulted in undesirable properties 规范化流程是根据关系的主键和属性间函数依赖关系来分析关系的级数 通常以步骤执行,每一步对应特定的具有已知属性正则表达式,随着规范化进行,关系会被进一步受限,也就更不容易受到更新异常的影响 整个流程包括找出关系的正则表达式,分解关系,如果分解产生预想外的属性,进一步完善关系

Unnormalized Form (UNF) • A table that contains one or more repeating groups • To create an unnormalized table: — transform data from information source (e.g. form) into table format with columns and rows 非规范化表格 包括一个或多个重复组的表格 需要将信息源的数据转换为行列的表格e85c6110d66149819e0b33b6eafb4042.png

First Normal Form (1NF) • A relation in which intersection of each row and column contains one and only one value • UNF to 1NF — Nominate an attribute or group of attributes to act as the key for the unnormalized table — Identify repeating group(s) in unnormalized table which repeats for the key attribute(s) 第一正则表达式即每行每列的交叉点只包含一个值的关系  可以通过指定一个或一组属性作为非正则表格的主键(类似概念 确定非正则表格中重复键的主键并删除 将其转化为第一正则表达式

• Remove repeating group by: — entering appropriate data into the empty columns of rows containing repeating data (‘flattening’ the table) Or by — placing repeating data along with copy of the original key attribute(s) into a separate relation 删去重复组可以在包含重复数据的行中空列输入适当的数据(扁平化表格) 或者将重复数据和原始主键放入一个单独的关系

• Table structure has been changed • Data related to client name repeateddb35e885f5d0498db0d9b8e134722937.png

• Remove the repeating group (property rented details) by placing the repeating data along with a copy of the original key attribute (clientNo) in a separate relation02dba161386e453e957db39c0e08002f.png

• Resulting two 1NF relations are as follows: — Client (clientNo, cName) — PropertyRentalOwner (clientNo, propertyNo, pAddress, rentStart, rentFinish, rent, ownerNo, oName)

Second Normal Form (2NF) • Based on concept of full functional dependency: — A and B are attributes of a relation R, — B is fully dependent on A (denoted A→B) if B is functionally dependent on A but not on any proper subset of A • 2NF – A relation that is in 1NF and every non-primary-key attribute is fully functionally dependent on the primary key 第二正则表达式 基于完全函数依赖:A和B是关系R的属性 B在函数依赖于A但不依赖于A的任何子集则B完全依赖于A即A→B。在第一正则表达式的关系中,每个非主键属性函数完全依赖于主键

• Identify primary key for the 1NF relation • Identify functional dependencies in the relation • If partial dependencies exist on the primary key remove them by placing them in a new relation along with copy of their determinant • The ClientRental relation has the following functional dependencies — fd1: {clientNo, propertyNo} → {rentStart, rentFinish} (Primary key) — fd2: clientNo → cName (Partial dependency) — fd3: propertyNo → {pAddress, rent, ownerNo, oName} (Partial dependency) — fd4: ownerNo → oName (Transitive dependency) — fd5: {clientNo, rentStart} → {propertyNo, pAddress, rentFinish, rent, ownerNo, oName} (Candidate key) — fd6: {propertyNo, rentStart} → {clientNo, cName, rentFinish} (Candidate key) 从第一正则表达式到第二正则表达式 1确定第一正则表达式关系的主键 2确定关系中的函数依赖关系 3如果主键上存在部分依赖关系,则将它们和行列式一同放入新的关系来删除

• Original table decomposed into smaller tables • Each of them are in 2NF • Resulting three 2NF relations are as follows: — Client (clientNo, cName) — Rental (clientNo, propertyNo, rentStart, rentFinish) — PropertyOwner (propertyNo, pAddress, rent, ownerNo, oName)583cc38a19ac43d1a0c5e31432c21ab5.png

Third Normal Form (3NF) • Based on concept of transitive dependency: — A, B and C are attributes of a relation such that if A → B and B → C, — then C is transitively dependent on A through B. (Provided that A is not functionally dependent on B or C) • 3NF – A relation that is in 1NF and 2NF and in which no non-primary-key attribute is transitively dependent on the primary key 第三正则表达式 基于传递依赖:A B和C是关系的属性,如果A → B 和 B → C,则C通过B传递依赖于A(前提是A不函数依赖于B或C) 在第一正则表达式和第二正则表达式中的关系,没有非主键属性传递依赖于主键

2NF to 3NF • Identify the primary key in the 2NF relation • Identify functional dependencies in the relation • If transitive dependencies exist on the primary key remove them by placing them in a new relation along with copy of their determinant • The Client, Rental, and PropertyOwner relations have following functional dependencies — Client ◦ fd2: clientNo → cName (Primary key) — Rental ◦ fd1: {clientNo, propertyNo} → {rentStart, rentFinish} (Primary key) ◦ fd5′ : {clientNo, rentStart} → {propertyNo, rentFinish} (Candidate key) ◦ fd6′ : {propertyNo, rentStart} → {clientNo, rentFinish} (Candidate key) — PropertyOwner ◦ fd3: propertyNo → {pAddress, rent, ownerNo, oName} (Primary key) ◦ fd4: ownerNo → oName (Transitive dependency) 从第二正则表达式到第三正则表达式 1确定第二正则表达式关系的主键 2确定关系中的函数依赖关系 3如果主键上存在函数依赖关系,则它们和行列式放入新关系从而删除

Example 3NF • Remove this transitive dependency by creating two new relations called PropertyForRent and Owner • Each of them are in 3NF • No further transitive dependencies on the primary key • Resulting two 3NF relations are as follows: — PropertyForRent (propertyNo, pAddress, rent, ownerNo) — Owner (ownerNo, oName)f185bffb388b4b509b004553c22e165c.png

a4248d835753442d9ff94d4d79db92ab.png

 

 

 

 

 

 

 

 

 

 

 

 

 

 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值