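Crawled from Baidu Scholar (百度学术): 37416 cells in total. Of these, 1803 contain unusable characters, 20516 contain a name that was matched uniquely, 455 contain an ambiguous name (393 attributed, 62 unresolved), and 14702 are other nodes (authors who are students or other teachers; a few have irregularly formatted names).
Previously provided documents: 5650 cells in total. Of these, 3316 contain unusable characters, 834 contain a name that was matched uniquely, 35 contain an ambiguous name (none resolved), and 1465 are other nodes (authors who are students or other teachers; a few have irregularly formatted names).
Total: 223 teacher nodes and somewhat more than 6000 edges. Edges lost because of irregularly formatted names are not counted, and unresolved name conflicts cost roughly another 100 edges. After deduplication, where the weight of a merged edge is the number of raw edges it replaces, about 600 edges remain.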
Source | Target | Value |
---|---|---|
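The deduplication step described above (one weighted edge per pair of teachers, with the weight equal to the number of raw co-authorship edges) can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used for the crawl: the function name `build_weighted_edges`, the input layout (one author list per paper), and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_weighted_edges(papers, teachers):
    """Merge raw co-authorship edges into weighted (Source, Target) pairs.

    papers   -- iterable of author-name lists, one list per paper (assumed layout)
    teachers -- set of names that are teacher nodes
    """
    weights = Counter()
    for authors in papers:
        # Keep only known teacher nodes; sorting makes (A, B) and (B, A) the same key.
        present = sorted({name for name in authors if name in teachers})
        for source, target in combinations(present, 2):
            weights[(source, target)] += 1  # Value = number of raw edges merged
    return weights


# Hypothetical toy data, not taken from the crawl.
teachers = {"A", "B", "C"}
papers = [
    ["A", "B", "student1"],
    ["B", "A"],
    ["A", "C", "student2"],
    ["C"],
]

edges = build_weighted_edges(papers, teachers)
for (source, target), value in sorted(edges.items()):
    print(source, target, value)
```

Students and unmatched names are simply dropped here, which mirrors the note that edges involving irregular or unresolved names are lost rather than guessed.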
Group | Size | Nodes |
---|---|---|
N and E are the numbers of nodes and links. ⟨k⟩ and ⟨d⟩ are the average degree and the average distance, respectively. ⟨C⟩ and r are the average clustering coefficient and the assortativity coefficient. H is the degree heterogeneity. βc is the epidemic threshold of the SIR model.

Metric | Value |
---|---|
N | 223 |
E | 562 |
⟨k⟩ | 5.0404 |
⟨d⟩ | 2.2113 |
⟨C⟩ | 0.3436 |
r | -0.1161 |
H | 0.5096 |
βc | 0.0939 |
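As a quick consistency check, the reported average degree matches 2E/N = 2 × 562 / 223 ≈ 5.0404. The sketch below shows one way the first five quantities could be computed from an undirected edge list using only the standard library. It is an illustration, not the code that produced the table: the function name `network_summary` and the toy graph are hypothetical, edge weights are ignored, nodes without any edge are not counted, and the average distance is taken over connected pairs only. r, H and βc are left out because the notes do not say which definitions were used.

```python
from collections import deque


def network_summary(edges):
    """Return N, E, average degree, average distance and average clustering."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    n = len(adj)
    e = sum(len(nb) for nb in adj.values()) // 2
    avg_degree = 2 * e / n

    # Average distance: breadth-first search from every node,
    # averaged over all ordered pairs that are connected.
    total, pairs = 0, 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    avg_distance = total / pairs if pairs else 0.0

    # Local clustering: fraction of a node's neighbour pairs that are linked.
    clustering = []
    for u, nb in adj.items():
        k = len(nb)
        if k < 2:
            clustering.append(0.0)
            continue
        links = sum(len(adj[v] & nb) for v in nb) // 2
        clustering.append(2 * links / (k * (k - 1)))
    avg_clustering = sum(clustering) / n

    return {"N": n, "E": e, "k": avg_degree, "d": avg_distance, "C": avg_clustering}


# Hypothetical toy graph: a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
summary = network_summary(toy_edges)
print(summary)
```

For a graph of this size (223 nodes) the all-pairs breadth-first search is cheap; a graph library would be the more usual choice for larger networks.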
Communities:
Modularity (Q):
Runtime (s):
AUC:
Precision:
Recall Rate:
F-value:
Accuracy:
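
The evaluation fields above are listed without values, and the notes do not say what is being evaluated. For reference, the sketch below gives the standard confusion-matrix definitions of precision, recall, F-value (assumed here to mean the F1 score) and accuracy. The function name `classification_scores` and the counts are hypothetical; AUC is omitted because it needs ranked scores rather than counts.

```python
def classification_scores(tp, fp, fn, tn):
    """Standard scores from true/false positive and negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-value is assumed to be F1: the harmonic mean of precision and recall.
    if precision + recall:
        f_value = 2 * precision * recall / (precision + recall)
    else:
        f_value = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_value": f_value,
        "accuracy": accuracy,
    }


# Hypothetical counts, for illustration only.
scores = classification_scores(tp=8, fp=2, fn=4, tn=86)
print(scores)
```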