{ "cells": [ { "cell_type": "markdown", "id": "f6f3e475", "metadata": {}, "source": [ "# Facebook 网络分析\n", "本笔记本主要使用 NetworkX 库进行社交网络分析。具体来说,将研究十个人的 Facebook 圈子(好友列表),并从中提取各种有价值的信息。数据集可在此链接找到:[斯坦福 Facebook 数据集](http://snap.stanford.edu/data/ego-Facebook.html)。此外,众所周知,Facebook 网络是无向的,且没有权重,因为一个用户只能与另一个用户成为一次好友。从图分析的角度来看:\n", "* 每个节点代表一个匿名的 Facebook 用户,该用户属于这十个好友列表之一。\n", "* 每条边对应于属于该网络的两个 Facebook 用户之间的友谊。换句话说,两个用户必须在 Facebook 上成为好友,才能在特定网络中连接。\n", "\n", "注意:节点 $0, 107, 348, 414, 686, 698, 1684, 1912, 3437, 3980$ 是将被研究的好友列表节点。这意味着它们是本次分析的重点。这些节点被视为 `重点节点`。" ] }, { "cell_type": "markdown", "id": "f400971b", "metadata": {}, "source": [ "## 导入包" ] }, { "cell_type": "code", "execution_count": 1, "id": "a17d6843", "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import networkx as nx\n", "import matplotlib.pyplot as plt\n", "from random import randint\n", "\n", "%matplotlib inline" ] }, { "cell_type": "markdown", "id": "0bbe7cd2", "metadata": {}, "source": [ "## 分析\n", "从 `data` 文件夹加载边数据,并保存到一个数据框中。每条边对应一行,每条边包含 `start_node` 和 `end_node` 两列。" ] }, { "cell_type": "code", "execution_count": 2, "id": "63c37e74", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | start_node | \n", "end_node | \n", "
---|---|---|
0 | \n", "0 | \n", "1 | \n", "
1 | \n", "0 | \n", "2 | \n", "
2 | \n", "0 | \n", "3 | \n", "
3 | \n", "0 | \n", "4 | \n", "
4 | \n", "0 | \n", "5 | \n", "
... | \n", "... | \n", "... | \n", "
88229 | \n", "4026 | \n", "4030 | \n", "
88230 | \n", "4027 | \n", "4031 | \n", "
88231 | \n", "4027 | \n", "4032 | \n", "
88232 | \n", "4027 | \n", "4038 | \n", "
88233 | \n", "4031 | \n", "4038 | \n", "
88234 rows × 2 columns
\n", "