MDL-based Tree Cut Model

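This article introduces a tree cut method based on the Minimum Description Length (MDL) principle. The method avoids manually tuning a frequency threshold, and a worked example shows how to compute the description length of different tree cut models.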


A straightforward way to determine a cut of a tree is to collapse low-frequency nodes into their parent node. However, this method is too heuristic, because it relies heavily on a manually tuned frequency threshold. In practice, we instead use a theoretically well-motivated method based on the MDL (Minimum Description Length) principle, a principle of data compression and statistical estimation from information theory.

 

Table 3. Calculating the description length for the model of Figure 5.

C        BIRD    bug     bee     insect
f(C)     8       0       2       0
|C|      4       1       1       1
P(C)     0.8     0.0     0.2     0.0
P(n)     0.2     0.0     0.2     0.0

T            [BIRD, bug, bee, insect]
L(α|T)       (4-1)/2 × log 10 = 4.98
L(S|T, α)    -(2+4+2+2) × log 0.2 = 23.22
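The arithmetic in Table 3 can be reproduced with a short script. This is a minimal sketch, not the original implementation: the function name `description_length` and its argument layout are hypothetical, and the formulas are inferred from the table itself. It assumes base-2 logarithms (which is what makes (4-1)/2 × log 10 come out to 4.98), a parameter cost of (k-1)/2 × log |S| for a cut with k classes and a sample of size |S|, a leaf probability P(n) = P(C)/|C|, and that classes with zero frequency contribute nothing to the data cost.

```python
import math

def description_length(classes, sample_size):
    """Return (parameter_dl, data_dl, total_dl) for one tree cut.

    classes: list of (f_C, size_C) pairs, one per class in the cut, where
             f_C is the class frequency and size_C the number of leaves under it.
    sample_size: total number of observations |S|.
    Hypothetical helper; formulas are inferred from Table 3.
    """
    k = len(classes)
    # Parameter description length: (k - 1) / 2 * log2 |S|
    param_dl = (k - 1) / 2 * math.log2(sample_size)

    data_dl = 0.0
    for freq, n_leaves in classes:
        if freq == 0:
            continue  # assumption: classes with no observations add no data cost
        p_class = freq / sample_size   # P(C)
        p_leaf = p_class / n_leaves    # P(n), shared equally among the leaves of C
        data_dl -= freq * math.log2(p_leaf)

    return param_dl, data_dl, param_dl + data_dl

# The cut [BIRD, bug, bee, insect] from Table 3, as (f(C), |C|) pairs.
cut = [(8, 4), (0, 1), (2, 1), (0, 1)]
param_dl, data_dl, total_dl = description_length(cut, sample_size=10)
print(round(param_dl, 2), round(data_dl, 2), round(total_dl, 2))
```

Under these assumptions the three values agree with the 4.98, 23.22 and 28.20 reported for this cut. Passing `[(10, 7)]` or `[(8, 4), (2, 3)]` in the same way covers the coarser cuts [ANIMAL] and [BIRD, INSECT].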

 

Table 4. Description length of the five tree cut models.

T                                                  L(α|T)    L(S|T, α)    L'(T)
[ANIMAL]                                           0         28.07        28.07
[BIRD, INSECT]                                     1.66      26.39        28.05
[BIRD, bug, bee, insect]                           4.98      23.22        28.20
[swallow, crow, eagle, bird, INSECT]               6.64      22.39        29.03
[swallow, crow, eagle, bird, bug, bee, insect]     9.97      19.22        29.19
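Under the MDL principle, the preferred model is the one with the smallest total description length, which in Table 4 is [BIRD, INSECT] at 28.05. The selection step can be sketched as follows. The dictionary `table4` and the helper `total_length` are hypothetical names introduced for illustration, and the numbers are copied from the table rather than recomputed.

```python
# Rows of Table 4: cut -> (L(alpha|T), L(S|T, alpha)), values copied from the table.
table4 = {
    ("ANIMAL",): (0.0, 28.07),
    ("BIRD", "INSECT"): (1.66, 26.39),
    ("BIRD", "bug", "bee", "insect"): (4.98, 23.22),
    ("swallow", "crow", "eagle", "bird", "INSECT"): (6.64, 22.39),
    ("swallow", "crow", "eagle", "bird", "bug", "bee", "insect"): (9.97, 19.22),
}

def total_length(lengths):
    """Total description length L'(T) = parameter cost + data cost."""
    param_dl, data_dl = lengths
    return param_dl + data_dl

# MDL selection: keep the cut whose total description length is smallest.
best_cut = min(table4, key=lambda cut: total_length(table4[cut]))
print(list(best_cut), round(total_length(table4[best_cut]), 2))
```

The table also shows the trade-off behind the choice: finer cuts fit the data better (smaller L(S|T, α)) but cost more parameters (larger L(α|T)), and the minimum of the sum falls at an intermediate cut.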

 
