Normalized Mutual Information 的Python 实现 (NMI.py)

hxy    2018-11-19 20:38

        NMI是Normalized Mutual Information的简称,用于比较社团划分结果与ground-truth之间的接近程度,取值范围为[0, 1],出自2006年Danon 的论文 [1]。 有两种计算方法,为了方便大家检测结果,写了一个通用的Python版计算函数,当然也可以直接调用库函数计算。代码如下:
# -*- coding: utf-8 -*-
import math

def NMI(c1, c2):
    '''
    Calculate Normalized Mutual Information (NMI) of community detection result c1 and ground_truth c2.
    Format:
        c1 = [[1, 2, 3], [4, 5, 6]]
        c2 = [[1, 2 ,3], [4, 5, 6]]
        nmi = NMI(c1, c2)
        print(nmi)
    Return: NMI value in range [0, 1]
    Reference: Danon L, Diazguilera A, Arenas A. Effect of size heterogeneity
               on community identification in complex networks[J]. Arxiv Physics, 2006, 2006(11):11010.
    '''
    nmi = 0
    u = 0
    d = 0
    n1 = 0
    n2 = 0

    for i in c1:
        n1 += len(i)
    for j in c2:
        n2 += len(j)
    assert n1 == n2, 'ERROR: Size of c1 is not equal to c2'
    
    s = n1

    for i in c1:
        ni = len(i)
        for j in c2:
            nj = len(j)
            nij = len(set(i) & set(j))
            if nij == 0:
                continue
            logt = math.log((nij * s)/(ni*nj))
            u += nij * logt
    u *= -2

    for i in c1:
        ni = len(i)
        d += ni * math.log(ni / s)
    for j in c2:
        nj = len(j)
        d += nj * math.log(nj / s)

    if d != 0:
        nmi = u / d

    return nmi


def get_ground_truth(G, key):
    '''
    根据网络节点的key值获取社团划分的标准结构
        例如:

        输入为 G, G 中包含节点的实际社团标签,如果用label表示
        G.node[1]['label'] = 1 # 表示G中的节点1属于1社团
        G.node[2]['label'] = 1 # 表示G中的节点2属于1社团
        G.node[3]['label'] = 2 # 表示G中的节点3属于1社团

        输出结果为:
        [[1,2], [3]]
    '''
    community = []
    groups = set()
    for n in G.nodes():
        groups.add(G.nodes[n][key])
    for g in groups:
        community.append([n for n in G.nodes() if G.nodes[n][key]==g])
    return community


if __name__ == '__main__':
    c1 = [[1, 2, 3], [4, 5, 6]]
    c2 = [[1, 2 ,3], [4, 5, 6]]
    nmi = NMI(c1, c2)
    print(nmi)

    c1 = [[1, 2, 3], [4, 5, 6]]
    c2 = [[1, 2],[3], [4], [5, 6]]
    nmi = NMI(c1, c2)
    print(nmi)

    c1 = [[1, 2, 3, 4, 5, 6]]
    c2 = [[1, 2],[3], [4, 5, 6]]
    nmi = NMI(c1, c2)
    print(nmi)

    c1 = [[1, 2, 3], [4, 5, 6]]
    c2 = [[1, 2],[3], [4, 5, 6], [7]]
    nmi = NMI(c1, c2)
    print(nmi)
参考资料:
  1. Danon L, Diazguilera A, Arenas A. Effect of size heterogeneity on community identification in complex networks[J]. Arxiv Physics, 2006, 2006(11):11010.
Last Modified: 2020-03-24 08:52
Views: 4.6K

[[total]] comments

Post your comment
  1. [[item.time]]
    [[item.user.username]] [[item.floor]]Floor
  2. Click to load more...
  3. Post your comment