Degree and Closeness Centrality



The Main Question

How to find important nodes in a network?


Node Importance

  • Degree
  • Average proximity to other nodes
  • Fraction of shortest paths that pass through the node

At this point, these three definitions of importance are informal. The goal of this video and the following ones is to arrive at more precise definitions of how to measure importance in a network.

Network Centrality

More generally, these measures **allow us to find nodes that prevent the network from breaking up.**

Centrality Measures:

  • Degree centrality
  • Closeness centrality
  • Betweenness centrality
  • Load centrality
  • Page Rank
  • Katz centrality
  • Percolation centrality

Degree centrality

Assumption: important nodes have many connections.

The most basic measure of centrality: number of neighbors

  • Undirected networks: use degree

$$C_{deg}(v)=\frac{d_v}{|N|-1}$$

where $N$ is the set of nodes in the network (so $|N|$ is the total number of nodes) and $d_v$ is the degree of node $v$.

The value lies in $[0,1]$.

It is 0 when $v$ is an isolated node.

It is 1 when $v$ is connected to every other node.

import networkx as nx

# Zachary's karate club graph, relabelled so that node ids start at 1
G=nx.karate_club_graph()
G=nx.convert_node_labels_to_integers(G,first_label=1)

degCent=nx.degree_centrality(G)
print(type(degCent))
#>> <class 'dict'>  (the return value is a dictionary keyed by node)

print(degCent[34])
#>>0.5151515151515151  17/33

print(degCent[33])
#>> 0.36363636363636365  12/33
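
To make the formula concrete, here is a minimal sketch that computes degree centrality by hand, without NetworkX. The star-shaped toy graph and the helper name `degree_centrality_by_hand` are assumptions made for illustration; they are not part of the original notes.

```python
# Hypothetical toy graph (a star), stored as adjacency sets.
adj = {
    "a": {"b", "c", "d", "e"},  # hub: connected to every other node
    "b": {"a"},
    "c": {"a"},
    "d": {"a"},
    "e": {"a"},
}

def degree_centrality_by_hand(adj):
    """Return d_v / (|N| - 1) for every node v."""
    n = len(adj)
    return {v: len(neighbors) / (n - 1) for v, neighbors in adj.items()}

cent = degree_centrality_by_hand(adj)
print(cent["a"])  # 4/4 = 1.0, the hub touches all other nodes
print(cent["b"])  # 1/4 = 0.25
```

The hub reaches the maximum value of 1 because its degree equals $|N|-1$; a node with no neighbors would get 0.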
  • Directed networks: use in-degree or out-degree

$$C_{indeg}(v)=\frac{d_v^{in}}{|N|-1}$$

where $N$ is the set of nodes in the network and $d_v^{in}$ is the in-degree of node $v$.

# G is assumed here to be the directed example graph (a DiGraph with nodes 'A', 'C', ...)
indegCent=nx.in_degree_centrality(G)
indegCent['A']
indegCent['C']

$$C_{outdeg}(v)=\frac{d_v^{out}}{|N|-1}$$

where $N$ is the set of nodes in the network and $d_v^{out}$ is the out-degree of node $v$.

# Same directed example graph G as above
outdegCent=nx.out_degree_centrality(G)
outdegCent['A']
outdegCent['C']
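
The directed example graph itself is not reproduced in these notes, so the sketch below uses a small hypothetical edge list instead. It counts in-degrees and out-degrees directly and divides by $|N|-1$; the results should agree with what `nx.in_degree_centrality` and `nx.out_degree_centrality` return for the same graph.

```python
# Hypothetical directed graph given as (source, target) pairs.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "C")]

nodes = {node for edge in edges for node in edge}
n = len(nodes)

in_deg = {v: 0 for v in nodes}
out_deg = {v: 0 for v in nodes}
for source, target in edges:
    out_deg[source] += 1   # edge leaves `source`
    in_deg[target] += 1    # edge enters `target`

in_cent = {v: in_deg[v] / (n - 1) for v in nodes}
out_cent = {v: out_deg[v] / (n - 1) for v in nodes}

print(in_cent["C"])   # 3/3 = 1.0, every other node points to C
print(out_cent["A"])  # 2/3
print(in_cent["A"])   # 0.0, nothing points to A
```

A node can rank high on one measure and low on the other: here C has the highest in-degree centrality but an out-degree centrality of 0.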

Closeness Centrality

Assumption: important nodes are close to other nodes
$$C_{close}(v)=\frac{|N|-1}{\sum_{u\in N}d(v,u)}$$

where $N$ is the set of nodes in the graph and $d(v,u)$ is the length of the shortest path from $v$ to $u$.

closeCent=nx.closeness_centrality(G)
print(type(closeCent))
#<class 'dict'>
print(closeCent[32])
#>>0.5409836065573771

print(sum(nx.shortest_path_length(G,32).values()))
#>>61

print((len(G.nodes())-1)/61)
#>>0.5409836065573771

Disconnected Nodes

How to measure the closeness centrality of a node when it cannot reach all other nodes?

Option 1

Only consider nodes that L can reach:
$$C_{close}(L)=\frac{|R(L)|}{\sum_{u\in R(L)}d(L,u)}$$

where $R(L)$ is the set of nodes that $L$ can reach.

Going back to the directed graph, $L$ can only reach node $M$:

$$C_{close}(L)=\frac{1}{1}=1$$

Problem: a centrality of 1 is too high for a node that can only reach one other node!

Option 2

Consider only nodes that L can reach and normalize by the fraction of nodes L can reach:
$$C_{close}(L)=\left[\frac{|R(L)|}{|N|-1}\right]\frac{|R(L)|}{\sum_{u\in R(L)}d(L,u)}$$

$$C_{close}(L)=\left[\frac{1}{14}\right]\frac{1}{1}\approx 0.071$$

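One thing to note about this normalized definition: if the graph is connected (strongly connected, in the directed case), every node can reach all $|N|-1$ other nodes, so the extra factor equals 1 and the original definition is unchanged.

If the graph has several connected components, or a directed graph is not strongly connected, the normalization does change the result and is needed.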

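# Option 2 (the default): also scale by the fraction of nodes that can be reached
closeCent=nx.closeness_centrality(G,wf_improved=True)
# Option 1: only consider reachable nodes, with no extra scaling
closeCent=nx.closeness_centrality(G,wf_improved=False)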
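
The difference between the two options can be checked by hand. The sketch below is pure Python with a breadth-first search. The 15-node graph is hypothetical: it is built only so that node L reaches a single node M, matching the 1/14 factor in the worked example. The helper names are likewise illustrative.

```python
from collections import deque

def reachable_distances(succ, source):
    """BFS along out-edges: hop distance from source to each node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in succ.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    del dist[source]  # R(source) does not include the source itself
    return dist

def closeness(succ, v, scaled):
    """Option 1 when scaled is False, Option 2 when scaled is True."""
    n = len(succ)
    dist = reachable_distances(succ, v)
    if not dist:
        return 0.0  # v reaches nothing
    value = len(dist) / sum(dist.values())
    if scaled:
        value *= len(dist) / (n - 1)  # fraction of other nodes v can reach
    return value

# Hypothetical graph: L -> M, plus a separate 13-node cycle 0 -> 1 -> ... -> 12 -> 0.
succ = {"L": ["M"], "M": []}
succ.update({i: [(i + 1) % 13] for i in range(13)})

option1 = closeness(succ, "L", scaled=False)
option2 = closeness(succ, "L", scaled=True)
print(option1)            # 1.0
print(round(option2, 3))  # 0.071
```

When comparing against NetworkX, note that recent releases of `closeness_centrality` measure incoming distances on directed graphs, so reproducing these outward-distance numbers would mean passing `G.reverse()`.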