Normalization – NF1, NF2, NF3. How to normalize your database?

本文深入浅出地介绍了数据库规范化的三个阶段:第一范式、第二范式和第三范式,并通过实例展示了如何消除数据冗余,提高数据清晰度,从而提升数据库性能。

What is normalization?

Database normalization is the process of organizing the columns and tables of a relational database to minimize data redundancy.

And another definition:

Normalization splits a large table into smaller tables and define relationships between them to increases the clarity in organizing data.

In general I like to see tables that have one purpose.  Having the table serve many purposes introduces many of the challenges like: data duplication, data update issues, and increased effort to query data.

database table purpose

Single purpose and relations instead of mixed up table

Why normalize?

If you have big database that is not normalized, people working with it most likely will face lots of problems. Performance will be bad, there are high chances of having multiple rows for one object, then, some day one row will change, and then you have a conflict.

Long story short – successfully normalized databases will increase performance and reduce storage.

So this is our table, let’s say we need some kind of database for an agency.

 

EmployeeIdPersonPhone
1John Jones27716602, 26153215,
2Peter Janes47716602, 36153215, 52321262
3Michael Juines17716602

 

We are storing our employee id, the name and their phone numbers.

 

Common misconception: Our database structure heavily depends on the required functionality. For example. If we always use a full name of the person, it is not necessary to split the “Person” column into first name and last name columns.

About phone numbers, if we know, that there is a limit, for example: work phone, personal mobile phone and fax these three columns wouldn’t count as duplicative columns and would still pass first normal form requirements.

 

To clarify, let’s assume that we need this functionality for the finished product:

  • We are going to send personalized letters. (e.g. Hello John, how are you doing?)
  • People can only add their mobile phone number.
  • There is no limit of the mobile phone numbers they can add.

 

1NF

Firstnormal formsets the very basic rules for an organized database:

  • Each column is unique
  • Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).

Requirements for 1NF

 

  • Table must have at least one candidate key (that all attributes depend on)
  • Every column stores atomic values (example, if you use full name all the time it makes sense to store full name, if you will use first name and last name separately – you should store name and last name separately)
  • Not a repeating group (multiple phone numbers in one row) – row that stores multiple similar values instead of one

 

EmployeeIdPersonPhone
1John Jones27716602
1John Jones26153215
2Peter Janes47716602
2Peter Janes36153215
2Peter Janes52321262
3Michael Juines17716602

 

Now first normal form is satisfied, as the columns in this table hold just one value.

 

2 Normal Form

Second normal form (2NF) further addresses the concept of removing duplicative data:

  • Meet all the requirements of the first normal form.
  • Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
  • Create relationships between these new tables and their predecessors through the use of foreign keys

Requirements for 2NF

  • Must meet all 1NF requirements
  • Non-key attributes must not depend on subset of any candidate key.

sql table example

As we can see in this example, we have a new table to store all the phones, and another table to determine who owns which phone number.

 

Employees

EmployeeIdNameLast
1JohnJones
2PeterJanes
3MichaelJuines

 

Phones

PhoneIdPhone
127716602
226153215
347716602
436153215
552321262
617716602

 

This means EmployeeId and PhoneId are both Primary keys of the table? Not exactly, they are composite primary keys referring to entries in other two tables.

SQL code to create that table would look like this.

Transact-SQL

Create table EmployeesPhones (    CustomerID int references Employees(EmployeeId) not null,    ProductID int references Phones(PhoneId) not null    Primary key (CustomerID, ProductID) )

1

2

3

4

5

6

Create table EmployeesPhones

(

   CustomerID int references Employees(EmployeeId) not null,

   ProductID int references Phones(PhoneId) not null

   Primary key (CustomerID, ProductID)

)

 

 

EmployeesPhones

EmployeeIdPhoneId
11
12
23
24
25
36

 

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

 

  • Meet all the requirements of the second normal form.
  • Remove columns that are not dependent upon the primary key.

 

For third normal form, I adjusted the table a little bit. I added Zip code and city columns.

 

We can still save a little bit of space, ZIP code determines a city, meaning that many people can live in the same city and have the same ZIP code.

 

We find that in the above Employees table, EmployeeId is the key and only prime key attribute. We find that City can be identified by EmployeeId as well as Zip itself. Neither Zip is a superkey nor is City a prime attribute. In this case –   transitive dependency exists.

 

Employees

EmployeeIdNameLastZIPCity
1JohnJones8000Aarhus C
2PeterJanes8200Aarhus N
3MichaelJuines8520Lystrup

 

To bring this table into third normal form, we break the table into two tables.

Employees

EmployeeIdNameLastZIP
1JohnJones8000
2PeterJanes8200
3MichaelJuines8520

Zip

Name?City?Zip?
JohnAarhus C8000
RickAarhus N8200
CoralLystrup8520
CarlAarhus C8000
MaggieLystrup8520
DarylAarhus N8200
MichonneAarhus N8200

转载于:https://my.oschina.net/u/2275100/blog/867745

在 PyTorch 中,`model.eval()` 是用于将模型设置为评估模式的方法。在评估模式下,模型的行为通常是不同的,因为它主要是为了在测试数据上进行推理,而不是在训练数据上进行优化。 在评估模式下,模型会禁用一些训练相关的操作,如 dropout 和 batch normalization,同时允许使用其他优化策略,如指数平均滑动,以更好地适应测试数据。此外,评估模式下模型的前向计算速度通常更快。 在使用 `model.eval()` 方法时,需要注意以下几点: 1. 如果您使用的是 Batch Normalization 层,您需要在测试之前手动设置 `model.eval()`,以确保正确的测试结果。 2. 如果您使用的是 Dropout 层,您需要在测试之前手动设置 `model.eval()`,以确保不会在测试中随机丢弃节点。 3. 在评估模式下,模型的梯度不会被计算,因此您不能在评估模式下进行反向传播和模型训练。 下面是一个简单的示例,展示了如何使用 `model.eval()` 方法: ``` import torch.nn as nn import torch.optim as optim # 定义模型和损失函数 model = MyModel() criterion = nn.CrossEntropyLoss() # 定义优化器 optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9) # 将模型设置为训练模式 model.train() # 训练模型 for epoch in range(num_epochs): # ... # 训练过程 # ... # 将模型设置为评估模式 model.eval() # 在测试数据上进行推理 with torch.no_grad(): for data, target in test_loader: output = model(data) loss = criterion(output, target) # 处理测试结果 ```
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值