03-搭建图的框架

选择背景色：黄橙洋红淡粉水蓝草绿白色选择字体：宋体黑体微软雅黑楷体选择字体大小：小中大特恢复默认

4.2　搭建图的框架

Python可以用多种不同的风格进行编程，但从本质上说，Python是一种面向对象的编程语言。本节将定义两种不同类型的图：无权图（unweighted graph）和加权图（weighted graph）。本章稍后会讨论加权图，即为每条边关联一个权重（即读数，如示例中的长度）。

这里将采用继承模型，其为Python面向对象的类层次结构之基础，因此不用重复编写代码。本数据模型中的加权类将是对应无权类的子类，这样无权类的大部分功能就能得以继承，只要稍加调整就能让加权图与无权图有所区别了。

这个图的框架应该尽可能保持灵活，以便能尽可能多地表示各种不同的问题。为了实现这一目标，这里将用泛型抽象出顶点类型。每个顶点最终都会被赋予一个整数索引，但将被存储为用户定义的泛型类型。

下面就从定义 Edge 类开始搭建框架，该类是此图框架中最简单的部分。具体代码如代码清单4-1所示。

代码清单4-1　edge.py

from __future__ import annotations
from dataclasses import dataclass
@dataclass
class Edge:
    u: int # the "from" vertex
    v: int # the "to" vertex
    def reversed(self) -> Edge:
        return Edge(self.v, self.u)
    def __str__(self) -> str:
        return f"{self.u} -> {self.v}"

Edge 被定义为两个顶点之间的连接，每个顶点由整数索引表示。这里按惯例用 u 表示第1个顶点， v 表示第2个顶点。也可以将 u 视为“起点”而 v 视为“终点”。本章仅处理无向图（图的边允许双向行进），但是在有向图中，边也可以是单向的。 reversed() 方法应该返回与当前边反向的 Edge 。

注意　 Edge 类用到了Python 3.7中的新特性dataclass。标有装饰器 @dataclass 的类通过自动创建 __init__() 方法来保存一些零碎数据，该方法将会实例化类中所有声明时带有类型注解（type annotation）的变量。dataclass特性还可以自动为类创建其他的特殊方法。可以用装饰器配置需要自动创建的特殊方法。要获得详细信息，请参阅Python的dataclass文档。简而言之，采用dataclass特性可以节省一些录入的时间。

Graph 类重点关注图的基本用途：将顶点与边关联起来。同样，顶点的实际类型仍然应该是使用框架的用户所期望的任意类型，这样无须构建把各种类型的数据聚在一起的中间数据结构，就能让本框架应用于大量不同的问题。例如，在类似于超级高铁线路的图中，顶点的类型可以定义为 str ，因为我们会用到“New York”和“Los Angeles”这种字符串作为顶点。下面开始介绍 Graph 类，具体代码如代码清单4-2所示。

代码清单4-2　graph.py

from typing import TypeVar, Generic, List, Optional
from edge import Edge
V = TypeVar('V') # type of the vertices in the graph
class Graph(Generic[V]):
    def __init__(self, vertices: List[V] = []) -> None:
        self._vertices: List[V] = vertices
        self._edges: List[List[Edge]] = [[] for _ in vertices]

列表 _vertices 是 Graph 类的核心。每个顶点都会存储于该列表中，稍后我们会通过它们在列表中的整数索引来引用它们。顶点本身可以是复杂的数据类型，但它的索引肯定是 int 类型，以便于使用。从另一个层面来说，通过在图算法和 _vertices 数组之间架设的这个索引，同一张图中可以出现两个相同值的顶点。不妨想象一张以某国城市作为顶点的图，该国有多个名为“Springfield”的城市。即便顶点的值相同，它们也可以有不同的整数索引。

图的数据结构可以有多种实现方案，最常见的两种就是采用顶点矩阵（vertex matrix）或邻接表（adjacency list）。在顶点矩阵中，矩阵的每个元素表示图中两个顶点是否相连，元素的值表示顶点间的连通度（或无连接）。此处图的数据结构采用邻接表方案。在这种图表示方式中，每个顶点都有一个与其连接的顶点列表。这里采用由边的列表组成的列表，因此每个顶点都带有一个多条边组成的列表，顶点通过该列表与其他顶点相连， _edges 就是这个列表的列表。

下面将给出 Graph 类的其余部分。请注意这里的方法都很简短，大部分都只有一行代码，并且都带有详细而清晰的方法名称，这使得 Graph 类的其余部分应该在很大程度上做到了不言自明，不过为了彻底消除误解，还是加上了简短的注释。具体代码如代码清单4-3所示。

代码清单4-3　graph.py（续）

@property
def vertex_count(self) -> int:
    return len(self._vertices) # Number of vertices
@property
def edge_count(self) -> int:
    return sum(map(len, self._edges)) # Number of edges
# Add a vertex to the graph and return its index
def add_vertex(self, vertex: V) -> int:
    self._vertices.append(vertex)
    self._edges.append([]) # Add empty list for containing edges
    return self.vertex_count - 1 # Return index of added vertex
# This is an undirected graph,
# so we always add edges in both directions
def add_edge(self, edge: Edge) -> None:
    self._edges[edge.u].append(edge)
    self._edges[edge.v].append(edge.reversed())
# Add an edge using vertex indices (convenience method)
def add_edge_by_indices(self, u: int, v: int) -> None:
    edge: Edge = Edge(u, v)
    self.add_edge(edge)
# Add an edge by looking up vertex indices (convenience method)
def add_edge_by_vertices(self, first: V, second: V) -> None:
    u: int = self._vertices.index(first)
    v: int = self._vertices.index(second)
    self.add_edge_by_indices(u, v)
# Find the vertex at a specific index
def vertex_at(self, index: int) -> V:
    return self._vertices[index]
# Find the index of a vertex in the graph
def index_of(self, vertex: V) -> int:
    return self._vertices.index(vertex)
# Find the vertices that a vertex at some index is connected to
def neighbors_for_index(self, index: int) -> List[V]:
    return list(map(self.vertex_at, [e.v for e in self._edges[index]]))
# Look up a vertice's index and find its neighbors (convenience method)
def neighbors_for_vertex(self, vertex: V) -> List[V]:
    return self.neighbors_for_index(self.index_of(vertex))
# Return all of the edges associated with a vertex at some index
def edges_for_index(self, index: int) -> List[Edge]:
    return self._edges[index]
# Look up the index of a vertex and return its edges (convenience method)
def edges_for_vertex(self, vertex: V) -> List[Edge]:
    return self.edges_for_index(self.index_of(vertex))
# Make it easy to pretty-print a Graph
def __str__(self) -> str:
    desc: str = ""
    for i in range(self.vertex_count):
        desc += f"{self.vertex_at(i)} -> {self.neighbors_for_index(i)}\n"
    return desc

不妨回头看一下，为什么这个类的大多数方法有两个版本呢？从类的定义可以得知， _vertices 是由 V 类型的元素构成的列表， V 可以是任意Python类。于是就有 V 类型的顶点存储在 _vertices 列表中。但在后续要检索或操作这些顶点的时候，就需要知道它们在该列表中的存储位置。因此，每个顶点在数组中都有一个与之关联的索引（整数）。如果我们不知道顶点的索引，就需要遍历 _vertices 来查找。这就是每种方法都有两个版本的原因。一个是基于 int 索引进行操作，另一个是基于 V 本身进行操作。基于 V 操作的方法会搜索其关联的索引并调用基于索引的函数。因此，基于 V 的方法可被视为快捷方法。

大多数方法都是无须解释的，但 neighbors_for_index() 值得做一些解析，它会返回顶点的所有邻居。顶点的邻居是指通过某条边直接连接到该顶点的所有其他顶点。例如，在图4-2中，New York和Washington是Philadelphia的唯一的共同邻居。通过查看由某个顶点发出的所有边的末端（ v ），就能找到该顶点的所有邻居。

def neighbors_for_index(self, index: int) -> List[V]:
    return list(map(self.vertex_at, [e.v for e in self._edges[index]]))

_edges[index] 就是邻接表，当前顶点通过该列表中的边与其他顶点相连。在传递给 map() 调用的列表推导式中， e 代表某条边， e.v 代表该边所连接的邻居的索引。 map() 将返回所有顶点对象（而不仅仅是它们的索引），因为 map() 对每个 e.v 都会调用 vertex_at() 方法。

还有一个重点需要注意，就是 add_edge() 的工作方式。 add_edge() 首先把某条边添加到“起点”顶点（ u ）的邻接表中，然后将这条边的逆向边添加到“终点”顶点（ v ）的邻接表中。因为该图是无向图，所以这里的第2步是必需的。我们希望对每条边都添加两个方向，这意味着 u 是 v 的邻居，同样 v 也是 u 的邻居。如果这有助于记住每条边都能双向通行，那么可以将无向图视为“双向”图。

def add_edge(self, edge: Edge) -> None:
    self._edges[edge.u].append(edge)
    self._edges[edge.v].append(edge.reversed())

如前所述，我们在本章中只处理无向图。除无向图和有向图之外，图还可以是无权图或加权图。加权图带有一些与其每条边关联的可供比较的值（通常为数字值）。在超级高铁网络中，可以将站点之间的距离视为权重。但这里将只处理该图的无权版。不带权的边只是两个顶点之间的连接，因此 Edge 类和 Graph 类都是无权的。换一种方式来说，在无权图中我们只知道哪些顶点是相连的，而在加权图中我们不仅知道哪些顶点是连通的，还知道这些连接的某些属性。

Previous 上一篇本节目录 Next 下一篇

03-搭建图的框架

4.2 搭建图的框架

4.2　搭建图的框架