CS224W (Machine Learning with Graphs) Study Notes 1: Introduction and the Bowtie Structure of the Web

Churcee    2022-11-24 10:13

  1. Course preliminaries
    1. Proof techniques: proof by contrapositive, proof by contradiction, proof by cases, proof by induction (mathematical induction)
    2. Mathematical prerequisites:
      1. Calculus: \(e^x=\lim_{n\to\infty}(1+x/n)^n\); special cases: \(e=\lim_{n\to\infty}(1+1/n)^n\), \(1/e=\lim_{n\to\infty}(1-1/n)^n\) (a quick numerical check of this limit is sketched right after this list)
      2. Probability: conditional probability, random variables, expectation and variance
      3. Linear algebra and matrix operations
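As a quick sanity check of the limit definition of \(e^x\) above, here is a small Python sketch (my own, not from the course; the values of x and n are arbitrary illustration choices):

```python
# Numerically check that (1 + x/n)^n approaches e^x as n grows.
import math

x = 2.0
for n in (10, 1_000, 100_000, 10_000_000):
    approx = (1 + x / n) ** n
    print(f"n={n:>10,}: (1+x/n)^n = {approx:.8f}   e^x = {math.exp(x):.8f}")
```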
  2. Why graphs
    1. Why use graphs: graphs are a general language for describing and analyzing entities with relations/interactions. Rather than treating entities as a set of isolated points, graphs model the relations among them, which makes them a good way to encode domain knowledge.
    2. Networks vs. graphs
      1. Networks / natural graphs: domains that are naturally represented as graphs
        1. Social networks: Society is a collection of 7+ billion individuals
        2. Communication and transactions: Electronic devices, phone calls, financial transactions
        3. Biomedicine: Interactions between genes/proteins regulate life (roughly, gene/protein interactions regulate physiological processes)
        4. Brain connections: Our thoughts are hidden in the connections between billions of neurons
      2. Graphs: as a representation
        1. Information/knowledge are organized and linked
        2. Software can be represented as a graph
        3. Similarity networks: Connect similar data points
        4. Relational structures: Molecules, Scene graphs, 3D shapes, Particle-based physics simulations
      3. Sometimes the distinction between networks and graphs is blurred
      4. Complex domains have rich relational structure that can be represented as a relational graph; by explicitly modeling the relations, we can achieve better performance
      5. However, modern deep learning toolboxes are designed for simple sequences (data with a linear structure, such as text and speech) and grids (images are translation-invariant and can be represented as fixed-size grids). These traditional tools are hard to apply to graphs, and the difficulty lies in the complexity of networks:
        1. Arbitrary size and complex topological structure (i.e., no spatial locality like grids)
        2. No fixed reference point, no fixed ordering of nodes, and no notion of up/down/left/right directions (see the adjacency-matrix sketch right after this list)
        3. Networks are often dynamic and have multimodal features
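To make the point about the lack of a fixed node ordering concrete, here is a minimal sketch (my own illustration, not lecture code): the same undirected path graph A-B-C produces different adjacency matrices under different node numberings, so a model that reads the matrix row by row is sensitive to an arbitrary labeling choice.

```python
# Same graph, two node orderings, two different adjacency matrices.
import numpy as np

edges = [("A", "B"), ("B", "C")]  # a 3-node path graph A - B - C

def adjacency(order):
    """Adjacency matrix of the undirected graph under a given node ordering."""
    idx = {node: i for i, node in enumerate(order)}
    A = np.zeros((len(order), len(order)), dtype=int)
    for u, v in edges:
        A[idx[u], idx[v]] = 1
        A[idx[v], idx[u]] = 1
    return A

print(adjacency(["A", "B", "C"]))  # [[0 1 0], [1 0 1], [0 1 0]]
print(adjacency(["B", "A", "C"]))  # [[0 1 1], [1 0 0], [1 0 0]]: same graph, different matrix
```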
  3. Main question: how do we take advantage of relational structure for better, more accurate prediction? The approach is to extend neural network models to graphs (via embeddings).
    machine learning with graphs
  4. Supervised learning pipeline
    1.  In the traditional machine learning pipeline, we need to perform feature engineering on the raw data (e.g., manually extracting features). With representation learning, we instead learn features of the data automatically and feed them directly into downstream machine learning tasks for better prediction.
      supervised machine learning lifecycle
    2. Representation learning on graphs: roughly, represent the raw nodes (or links, or whole graphs) as vectors (embeddings); nodes that are similar in the graph should be embedded close together (entities that are similar in node space should also be similar in vector space; a toy sketch follows below).
      machine learning with embeddings
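A toy sketch of the embedding idea (my own, not from the lecture): each node is mapped to a d-dimensional vector, and nodes that are similar in the graph should have similar vectors, e.g., a high cosine similarity. In practice the vectors are learned by methods such as DeepWalk, node2vec, or GNNs; here they are hard-coded just to show how an embedding table is used.

```python
# Hypothetical learned embedding table: node id -> 4-dimensional vector.
import numpy as np

embeddings = {
    "u": np.array([0.9, 0.1, 0.0, 0.2]),
    "v": np.array([0.8, 0.2, 0.1, 0.3]),  # "v" is similar to "u" in the graph
    "w": np.array([0.0, 0.9, 0.8, 0.1]),  # "w" is dissimilar to "u"
}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print("sim(u, v) =", round(cosine(embeddings["u"], embeddings["v"]), 3))  # close to 1
print("sim(u, w) =", round(cosine(embeddings["u"], embeddings["w"]), 3))  # close to 0
```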
  5. Course focus: machine learning and representation learning for graph-structured data
    1. Traditional methods: Graphlets, Graph Kernels
    2. Methods for node embeddings: DeepWalk, Node2Vec
    3. Graph Neural Networks: GCN (graph convolutional network), GraphSAGE, GAT (graph attention network), theory of GNNs
    4. Knowledge graphs and reasoning: TransE, BetaE
    5. Deep generative models for graphs
    6. Applications to Biomedicine, Science, and Industry
the outline of the course
 
Last Modified: 2022-11-24 19:06