Modularity maximization

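This article discusses modularity in network science, a measure of how clearly a network divides into communities. By comparing the number of edges inside communities in the actual network with the number expected in a random network, modularity helps us identify groups of tightly connected nodes. The goal is to find the division that maximizes modularity. Although this is a computationally hard problem, heuristic algorithms give reasonably good results in a reasonable amount of time.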

Our goal is to find a measure that quantifies how many edges lie within groups in our network relative to the number of such edges expected on the basis of chance. A good division of nodes into communities is one that maximizes such a measure. Equivalently, we want a measure that quantifies how many edges lie between groups in our network relative to the expected number of such links. A good division of nodes into communities is one that minimizes such a measure. We will concentrate on the former measure of modularity of a network.

Let us focus on undirected multi-graphs, that is, graphs that allow self-edges (edges involving the same node) and multi-edges (more than one simple edge between two vertices). A measure of modularity of a network is the number of edges that run between vertices of the same community minus the number of such edges we would expect to find if the configuration model is assumed, that is, if edges were positioned at random while preserving the vertex degrees. Let us denote by $c_i$ the community of vertex $i$, and let $\delta(c_i, c_j) = 1$ if $c_i = c_j$ and $\delta(c_i, c_j) = 0$ otherwise. Hence, the number of edges that run between vertices of the same group is:

$$\sum_{(i,j) \in E} \delta(c_i, c_j) = \frac{1}{2} \sum_{i,j} A_{i,j} \, \delta(c_i, c_j)$$
where $E$ is the set of edges of the graph and $A_{i,j}$ is the actual number of edges between $i$ and $j$, which is zero or more (notice that each undirected edge is represented by two pairs in the second sum, hence the factor one-half).
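The identity above can be checked numerically. The following is a minimal sketch, assuming a small hypothetical network (two triangles joined by one edge) and a hypothetical community assignment `c`; neither comes from the original text. It counts same-group edges once from the edge list and once from the adjacency matrix.

```python
import numpy as np

# Hypothetical example network: two triangles (0-1-2 and 3-4-5) joined by edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
# Assumed community label c_i for each node i.
c = np.array([0, 0, 0, 1, 1, 1])

n = len(c)
A = np.zeros((n, n), dtype=int)
for i, j in edges:
    # Each undirected edge contributes to both A[i, j] and A[j, i];
    # a self-edge (i == j) would therefore add 2 to A[i, i].
    A[i, j] += 1
    A[j, i] += 1

# delta[i, j] = 1 if nodes i and j are in the same community, else 0.
delta = (c[:, None] == c[None, :]).astype(int)

# Left-hand side: sum over the edge list.
within_from_edges = sum(1 for i, j in edges if c[i] == c[j])
# Right-hand side: half the sum over all ordered pairs (i, j).
within_from_matrix = (A * delta).sum() / 2

print(within_from_edges, within_from_matrix)
```

Both counts agree because every undirected edge appears twice in the double sum over $i,j$.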
The expected number of edges that run between vertices of the same group is:

$$\frac{1}{2} \sum_{i,j} \frac{k_i k_j}{2m} \, \delta(c_i, c_j)$$
where $k_i$ and $k_j$ are the degrees of $i$ and $j$, while $m$ is the number of edges of the graph. Notice that $k_i k_j / 2m$ is the expected number of edges between vertices $i$ and $j$ under the configuration model. Indeed, consider a particular edge attached to vertex $i$. The probability that this edge goes to node $j$ is $k_j / 2m$, since the number of edge ends attached to $j$ is $k_j$ and the total number of edge ends in the network is $2m$ (the sum of all node degrees). Since node $i$ has $k_i$ edges attached to it, the expected number of edges between $i$ and $j$ is $k_i k_j / 2m$.
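As a sanity check on this argument, the sketch below builds the matrix of expected edge counts $k_i k_j / 2m$ for a small hypothetical network (two triangles joined by one edge, an assumption made only for illustration). Each row of that matrix should sum to the degree of the corresponding node, which is what "preserving the vertex degrees" means in expectation.

```python
import numpy as np

# Hypothetical example network: two triangles joined by edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
m = len(edges)  # number of edges

# Degree k_i = number of edge ends attached to node i.
k = np.zeros(n)
for i, j in edges:
    k[i] += 1
    k[j] += 1

# P[i, j] = k_i * k_j / (2m): expected number of edges between i and j.
P = np.outer(k, k) / (2 * m)

print("degrees:          ", k)
print("row sums of P:    ", P.sum(axis=1))
print("expected edges between nodes 2 and 3:", P[2, 3])
```

Here nodes 2 and 3 both have degree 3 and $m = 7$, so the expected count between them is $9/14$.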
Hence the difference between the actual and expected number of edges connecting nodes of the same group, expressed as a fraction of the total number of edges $m$, is called modularity, and is given by:

$$Q = \frac{1}{2m} \sum_{i,j} \left(A_{i,j} - \frac{k_i k_j}{2m}\right) \delta(c_i, c_j) = \frac{1}{2m} \sum_{i,j} B_{i,j} \, \delta(c_i, c_j)$$
where $B_{i,j} = A_{i,j} - \frac{k_i k_j}{2m}$ and $B$ is called the modularity matrix.
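The formula for $Q$ translates almost directly into code. The following is a minimal sketch, not a reference implementation: the function name `modularity`, the example network, and the community labels are all assumptions introduced for illustration.

```python
import numpy as np

def modularity(A, c):
    """Modularity Q of the division c for a symmetric adjacency matrix A."""
    k = A.sum(axis=1)                    # degrees k_i
    two_m = k.sum()                      # 2m = sum of all degrees
    B = A - np.outer(k, k) / two_m       # modularity matrix B
    delta = c[:, None] == c[None, :]     # delta(c_i, c_j) as a boolean mask
    return (B * delta).sum() / two_m

# Hypothetical example network: two triangles joined by edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] += 1
    A[j, i] += 1

q_triangles = modularity(A, np.array([0, 0, 0, 1, 1, 1]))  # one group per triangle
q_single = modularity(A, np.zeros(n, dtype=int))           # all nodes in one group

print(q_triangles, q_single)
```

For this example, splitting along the two triangles gives $Q = 5/14 \approx 0.357$, while placing every node in a single group gives $Q = 0$, since the actual and expected edge counts then both total $m$.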
The modularity $Q$ takes positive values if there are more edges between same-group vertices than expected, and negative values if there are fewer. Our goal is to find the partition of network nodes into communities such that the modularity of the division is maximum. Unfortunately, this is a computationally hard problem. It is believed that the only algorithms capable of always finding the division with maximum modularity take exponentially long to run and hence are useless for all but the smallest of networks. Instead, therefore, we turn to heuristic algorithms, that is, algorithms that attempt to maximize the modularity in an intelligent way and give reasonably good results quickly.
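To make the cost of exact maximization concrete, here is a sketch of exhaustive search on a tiny hypothetical network (two triangles joined by one edge). It is deliberately not a heuristic: it enumerates every partition of the nodes and keeps the one with the largest $Q$. The helper names `set_partitions` and `modularity` are assumptions introduced for this illustration.

```python
import numpy as np

def modularity(A, c):
    """Modularity Q of the division c for a symmetric adjacency matrix A."""
    k = A.sum(axis=1)
    two_m = k.sum()
    B = A - np.outer(k, k) / two_m
    delta = c[:, None] == c[None, :]
    return (B * delta).sum() / two_m

def set_partitions(n):
    """Yield every partition of n nodes exactly once, as a tuple of labels.

    Labels follow the 'restricted growth' rule: a node may reuse an
    existing label or open the next unused one, so relabelings of the
    same partition are never generated twice.
    """
    def extend(labels, n_used):
        if len(labels) == n:
            yield tuple(labels)
            return
        for lab in range(n_used + 1):
            labels.append(lab)
            yield from extend(labels, max(n_used, lab + 1))
            labels.pop()
    yield from extend([], 0)

# Hypothetical example network: two triangles joined by edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] += 1
    A[j, i] += 1

partitions = list(set_partitions(n))
best = max(partitions, key=lambda labels: modularity(A, np.array(labels)))
best_q = modularity(A, np.array(best))

print(len(partitions), best, best_q)
```

Six nodes already give 203 partitions to evaluate, and the count (the Bell number of $n$) grows faster than exponentially with $n$, which is why this approach is only workable for very small networks.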
