Data Structure Chapter 6 Graph Study Notes

Contents

1. Graph

6.1 Definition of graph

6.2 Basic terminology of graphs

6.3 Graph storage structure

6.3.1 Adjacency matrix

6.3.2 Adjacency List

6.4 Traversal of graphs

6.4.1 Depth-First Search (DFS)

6.4.2 Breadth-First Search (BFS)

6.5 Applications of graphs

6.5.1 Minimum Spanning Tree

6.5.2 Shortest path

6.5.3 Topological sort

6.5.4 Critical Path Algorithm

2. Summary

1. Graph

6.1 Definition of graph

A graph G consists of two sets V and E, denoted as G=(V,E), where V is a finite non-empty set of vertices and E is a finite set of pairs of vertices in V; these pairs of vertices are called edges. V(G) and E(G) usually denote the vertex set and the edge set of graph G, respectively. E(G) can be an empty set; if E(G) is empty, then graph G has only vertices and no edges.

For a graph G, if the edge set E(G) is a set of directed edges, the graph is called a directed graph; if E(G) is a set of undirected edges, the graph is called an undirected graph.

6.2 Basic terminology of graphs

1) Subgraph: Suppose there are two graphs G1=(V1,E1) and G2=(V2,E2). If V2 ⊆ V1 and E2 ⊆ E1, then G2 is called a subgraph of G1.

2) Undirected complete graph and directed complete graph: For an undirected graph, if there is an edge between every pair of distinct vertices (i.e., it has n(n-1)/2 edges), it is called an undirected complete graph. For a directed graph, if there is an arc between every ordered pair of distinct vertices (i.e., it has n(n-1) arcs), it is called a directed complete graph.

3) Sparse graph and dense graph: A graph with few edges is called a sparse graph; otherwise it is called a dense graph.

4) Weight and network: In practical applications, each edge can be labelled with a value that carries some meaning, called its weight; a graph with weighted edges is called a network.

5) Adjacent points: For an undirected graph G, if (v, v') is an edge of the graph, then v and v' are said to be adjacent to each other.

6) Degree, in-degree and out-degree: The degree of a vertex v is the number of edges incident to v, denoted TD(v). For a directed graph, the degree of vertex v is divided into out-degree and in-degree: the in-degree is the number of arcs with vertex v as the head, denoted ID(v); the out-degree is the number of arcs with vertex v as the tail, denoted OD(v); and TD(v) = ID(v) + OD(v).

7) Path and path length: In an undirected graph G, a path from v to v' is a sequence of vertices. If G is a directed graph, the path is also directed. The length of a path is the number of edges or arcs on the path.

8) Loops and rings: A path whose first vertex and last vertex are the same is called a loop or a ring (cycle).

9) Simple paths and simple loops: A path in which no vertex appears more than once is called a simple path. A loop in which no vertices repeat except that the first and last vertices coincide is called a simple loop (simple cycle).

10) Connectivity, connected graphs and connected components: In an undirected graph G, if there is a path from v to v', the two vertices are said to be connected. If any two vertices of the graph are connected, the graph is called a connected graph. A connected component is a maximal connected subgraph of an undirected graph.

11) Strongly connected graph and strongly connected components: In a directed graph G, if for any pair of vertices there are paths between them in both directions, the graph is called a strongly connected graph. A strongly connected component is a maximal strongly connected subgraph of a directed graph.

12) Spanning tree of a connected graph: a minimal connected subgraph that contains all the vertices of the graph but only n-1 edges, just enough to form a tree; such a connected subgraph is called a spanning tree of the connected graph.

13) Directed trees and spanning forests: A directed graph in which exactly one vertex has in-degree 0 and every other vertex has in-degree 1 is called a directed tree. A spanning forest of a directed graph consists of several mutually disjoint directed trees that together contain all the vertices of the graph, using only enough arcs to form those trees.

6.3 Graph storage structure

Since the logical structure of a graph is many-to-many, sequential storage alone cannot represent all the relationships between vertices.

6.3.1 Adjacency matrix

It is suitable for dense graphs (for sparse graphs it wastes a lot of space). If there is an edge from vertex vi to vertex vj, then in an unweighted graph A[i][j] = 1 indicates that the edge exists (and 0 otherwise), while in a weighted graph A[i][j] stores the weight of the edge. It is intuitive but wastes storage space.

Note: For an undirected graph, we can treat it as a directed graph in which every edge is stored in both directions.
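
As a small illustration (not from the original notes; names and INF convention are assumptions), adjacency-matrix storage of a weighted directed graph might be sketched like this:

#include <bits/stdc++.h>
using namespace std;

const int N = 510, INF = 0x3f3f3f3f;   // INF means "no edge"
int n, m;
int A[N][N];                           // A[i][j] = weight of edge i -> j, INF if absent

int main() {
    cin >> n >> m;
    memset(A, 0x3f, sizeof A);         // initialise every entry to INF
    for (int i = 1; i <= n; i++) A[i][i] = 0;
    while (m--) {
        int u, v, w;
        cin >> u >> v >> w;
        A[u][v] = w;                   // for an undirected graph, also set A[v][u] = w
    }
    // an edge query is O(1): A[u][v] != INF means the edge exists
}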

6.3.2 Adjacency List

It is suitable for storing sparse graphs and is a linked (chained) storage structure. It consists mainly of a header-node table and edge tables.

1) Header-node table: it is stored sequentially, one header node per vertex, and the vertices adjacent to each header are kept in a linked list attached to it. A header node consists of a data field and a chain field: the data field stores the name of the vertex, and the chain field points to the first node of that vertex's edge table.

2) Edge table: the linked list attached to each header node. Each edge node has three fields: an adjacency-point field, a data field and a chain field. The adjacency-point field stores the position (index) of the adjacent vertex in the graph, the data field stores information such as the edge weight, and the chain field points to the next edge node.

Note: The adjacency-list representation of a graph is not unique, whereas the adjacency-matrix representation is unique.
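
A minimal sketch (illustrative names, not from the original notes) of the header-node table / edge-table layout described above:

#include <bits/stdc++.h>
using namespace std;

const int N = 1e5 + 10;

struct EdgeNode {            // one node of an edge table
    int adjvex;              // adjacency-point field: index of the neighbour
    int weight;              // data field: weight of the edge
    EdgeNode *next;          // chain field: next edge of the same vertex
};

struct VertexNode {          // one entry of the header-node table
    char data;               // data field: name of the vertex
    EdgeNode *first;         // chain field: first node of its edge table
};

VertexNode head[N];          // header-node table, stored sequentially

// insert a directed edge u -> v with weight w at the front of u's edge table
void addEdge(int u, int v, int w) {
    head[u].first = new EdgeNode{v, w, head[u].first};
}

int main() {
    head[1].data = 'A'; head[2].data = 'B';
    addEdge(1, 2, 5);        // A -> B with weight 5
    for (EdgeNode *e = head[1].first; e; e = e->next)
        cout << head[1].data << " -> " << head[e->adjvex].data << " (w=" << e->weight << ")\n";
}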

6.4 Traversal of graphs

6.4.1 Depth-First Search (DFS)

Simply put, DFS keeps following one path as deep as it can go, and backtracks only after hitting a dead end.

Here is an example.

Code:

#include <bits/stdc++.h>
#define int long long
#define endl '\n'
using namespace std;
const int N = 1e6 + 100;
const int mod = 1e9 + 7;
int ma[1005][1005];                 // ma[x][y] = 1 means the cell is blocked
int l[4] = {1, 0, -1, 0};           // row offsets of the four directions
int r[4] = {0, 1, 0, -1};           // column offsets of the four directions
int ex, ey;                         // end point (renamed from x2/y2 to avoid clashing with math.h's y2)
int book[1005][1005];               // book[x][y] = 1 means the cell is on the current path
int ans;
int a, b;                           // grid bounds: x ranges over [0, b), y over [0, a) (as in the original check)
void dfs(int x, int y, int sum)
{
	if (x == ex && y == ey)         // reached the end: record the best length so far
	{
		ans = min(ans, sum);
		return;
	}
	for (int i = 0; i <= 3; i++)
	{
		int nx = x + l[i], ny = y + r[i];
		// check the bounds first, then the cell itself
		if (nx >= 0 && nx < b && ny >= 0 && ny < a && ma[nx][ny] == 0 && book[nx][ny] == 0)
		{
			book[nx][ny] = 1;       // mark, go deeper, then unmark (backtracking)
			dfs(nx, ny, sum + 1);
			book[nx][ny] = 0;
		}
	}
}
signed main()
{
	int t;
	cin >> t;
	while (t--)
	{
		ans = 1e9 + 10;
		memset(book, 0, sizeof(book));
		memset(ma, 0, sizeof(ma));
		int c, sx, sy;
		cin >> a >> b >> c;
		for (int i = 1; i <= c; i++)  // c blocked cells
		{
			int x, y;
			cin >> x >> y;
			ma[x][y] = 1;
		}
		cin >> sx >> sy >> ex >> ey;
		book[sx][sy] = 1;
		dfs(sx, sy, 0);
		if (ans != 1e9 + 10)
			cout << ans << '\n';
		else
			cout << "Not arrive" << '\n';
	}
}
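
The example above searches a grid; for a general graph stored as an adjacency list, the same depth-first idea can be sketched as follows (illustrative names, not code from the original post):

#include <bits/stdc++.h>
using namespace std;

const int N = 1e5 + 10;
vector<int> adj[N];     // adjacency list of the graph
bool vis[N];            // visited marks

void dfs(int u) {
    vis[u] = true;
    cout << u << ' ';                // "visit" the vertex
    for (int v : adj[u])             // go as deep as possible along each edge
        if (!vis[v]) dfs(v);
}

int main() {
    int n, m;
    cin >> n >> m;                   // n vertices, m undirected edges
    for (int i = 0; i < m; i++) {
        int a, b;
        cin >> a >> b;
        adj[a].push_back(b);
        adj[b].push_back(a);
    }
    for (int u = 1; u <= n; u++)     // cover disconnected components too
        if (!vis[u]) dfs(u);
}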

6.4.2 Breadth-First Search (BFS)

BFS casts a wide net: starting from the source, it visits every vertex reachable at the current level before moving on to the next level.

Here is another example.

Code:

// The example (apparently Luogu P1141): a 0/1 grid where you may step between
// side-adjacent cells holding different digits; each query asks how many cells are reachable.
#include<iostream>
using namespace std;
char grid[1001][1001];                        // renamed from `map` to avoid clashing with std::map
int n, m, sx, sy, book[1001][1001];
int dir[4][2] = {1, 0, 0, 1, -1, 0, 0, -1};   // renamed from `next` to avoid clashing with std::next
struct p1141 { int x, y; } p[1000001];        // the BFS queue, stored in a plain array
void bfs(int x, int y)
{
	int ans = 1, h = 0, t = 1, tx, ty;        // h = queue head, t = queue tail
	p[h].x = x;
	p[h].y = y;
	book[x][y] = 1;
	while (h < t)
	{
		for (int k = 0; k <= 3; k++)          // expand the current cell in all four directions
		{
			tx = p[h].x + dir[k][0];
			ty = p[h].y + dir[k][1];
			// skip out-of-range cells, visited cells, and cells holding the same digit
			if (tx < 1 || ty < 1 || tx > n || ty > n || book[tx][ty] != 0 || grid[p[h].x][p[h].y] == grid[tx][ty]) continue;
			p[t].x = tx;
			p[t].y = ty;
			book[tx][ty] = 1;
			ans++;
			t++;
		}
		h++;
	}
	for (int j = 0; j < t; j++)               // every cell of this component has the same answer
		book[p[j].x][p[j].y] = ans;
	return;
}
int main()
{
	cin >> n >> m;
	for (int i = 1; i <= n; i++)
		for (int j = 1; j <= n; j++)
			cin >> grid[i][j];                // read the grid
	for (int i = 1; i <= m; i++)
	{
		cin >> sx >> sy;                      // one query
		if (book[sx][sy] != 0) cout << book[sx][sy] << endl;   // already computed
		else
		{
			bfs(sx, sy);
			cout << book[sx][sy] << endl;
		}
	}
	return 0;
}
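
The example above works on a grid; on a graph stored as an adjacency list, the same level-by-level idea can be sketched as follows (illustrative names, not code from the original post):

#include <bits/stdc++.h>
using namespace std;

const int N = 1e5 + 10;
vector<int> adj[N];
bool vis[N];

void bfs(int s) {
    queue<int> q;
    q.push(s);
    vis[s] = true;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        cout << u << ' ';            // visit u
        for (int v : adj[u])         // enqueue every unvisited neighbour: the whole
            if (!vis[v]) {           // current layer is processed before any vertex
                vis[v] = true;       // of the next layer
                q.push(v);
            }
    }
}

int main() {
    int n, m;
    cin >> n >> m;                   // n vertices, m undirected edges
    for (int i = 0; i < m; i++) {
        int a, b;
        cin >> a >> b;
        adj[a].push_back(b);
        adj[b].push_back(a);
    }
    for (int u = 1; u <= n; u++)
        if (!vis[u]) bfs(u);
}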

6.5 Applications of graphs

6.5.1 Minimum Spanning Tree

Among all the spanning trees of a connected network, the one whose edge weights have the smallest total cost is called the minimum-cost spanning tree, or minimum spanning tree (MST) for short.

An important property of the minimum spanning tree (MST property): suppose N = (V, E) is a connected network and U is a non-empty proper subset of the vertex set V. If (u, v) is a minimum-weight edge with u ∈ U and v ∈ V − U, then there must exist a minimum spanning tree that contains the edge (u, v).

Two algorithms are mainly introduced below.

1) Prim's algorithm

In essence it is similar to Dijkstra's algorithm; the only difference is that dist[] keeps not the shortest distance from each vertex to the source, but the shortest distance from each vertex to the set of vertices already in the tree. The correctness proof of this algorithm is somewhat involved and is not expanded here.

#include <bits/stdc++.h>
using namespace std;
typedef pair<int, int> PII;
const int N = 510, INF = 0x3f3f3f3f;
int n, cnt;                 // n vertices; cnt = vertices already pulled into the tree
int mp[N][N];               // adjacency matrix, assumed pre-filled with edge weights (INF = no edge)
int dist[N];                // current distance from each vertex to the growing tree
bool st[N];                 // whether the vertex is already in the tree

int prim() {
	memset(dist, 0x3f, sizeof dist);
	priority_queue<PII, vector<PII>, greater<PII>> q;   // {distance to the tree, vertex}
	dist[1] = 0;
	int res = 0;
	q.push({0, 1});
	while (q.size() && cnt < n) {
		auto it = q.top();
		q.pop();
		int ver = it.second, d = it.first;
		if (st[ver]) continue;          // stale heap entry: the vertex was already taken
		st[ver] = 1;
		res += d;                       // add the edge that attaches ver to the tree
		cnt++;
		for (int i = 1; i <= n; i++) {  // update distances to the enlarged tree
			if (mp[ver][i] < dist[i]) {
				dist[i] = mp[ver][i];
				q.push({mp[ver][i], i});
			}
		}
	}
	return res;                         // if cnt < n afterwards, the graph is not connected
}

2) Kruskal's algorithm

This algorithm requires the union-find (disjoint-set) structure. Roughly speaking: first sort all edges by weight in ascending order, then go through them and add an edge whenever its two endpoints are not yet in the same set.

#include <bits/stdc++.h>
using namespace std;
typedef pair<int, int> PII;
const int N = 1e5 + 10;
vector<pair<int, PII>> v;   // edges stored as {weight, {u, v}}; sort(v.begin(), v.end()) before calling kruskal()
int mp[N];                  // union-find parent array; initialise mp[i] = i for every vertex beforehand
int cnt;                    // number of edges accepted into the tree
int find(int no) {          // find the set representative, with path compression
	if (mp[no] == no) return no;
	else {
		mp[no] = find(mp[no]);
		return mp[no];
	}
}
int kruskal() {
	int res = 0;
	for (auto it : v) {     // edges in ascending order of weight
		if (find(mp[it.second.first]) == find(mp[it.second.second])) {
			continue;       // both endpoints already in the same set: the edge would create a cycle
		}
		else {
			res += it.first;
			// merge the two sets
			mp[find(max(it.second.first, it.second.second))] = mp[find(min(it.second.first, it.second.second))];
			cnt++;
		}
	}
	return res;             // a spanning tree exists only if cnt == n - 1 afterwards
}

Note: The complexity of the naive Prim algorithm is O(n^2); the heap-optimized version has the same order of complexity as Kruskal, roughly O(m log n). For sparse graphs we therefore tend to use Kruskal, while for dense graphs Prim is usually preferred.

6.5.2 Shortest path

Single source shortest path

If we only want the shortest path from one vertex to another, we usually use Dijkstra's algorithm. Its idea: maintain a set of vertices whose shortest distance has already been determined; among the undetermined vertices, repeatedly pick the one with the smallest distance from the source and add it to the set; repeat until the destination is determined. The distance obtained is the shortest-path length.

Dijkstra's correctness proof is somewhat involved and is not given here. Moreover, it is exactly the assumptions required by that proof that restrict Dijkstra to graphs without negative edge weights; graphs with negative weights need SPFA (or Bellman-Ford) instead.

In the naive version of Dijkstra's algorithm the main cost is finding the closest undetermined vertex in each round, and this is exactly where a heap can be used, reducing the complexity from O(n^2) to O(m log n).

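Naive version code: a minimal sketch of the O(n^2) variant (names and input format are illustrative assumptions, not the original post's code; adjacency matrix g filled with INF for absent edges, vertices 1..n, source 1, sink n):

#include <bits/stdc++.h>
using namespace std;
const int N = 510, INF = 0x3f3f3f3f;
int n, m, g[N][N], dist[N];
bool vis[N];

int dijkstra(int s, int e) {
    memset(dist, 0x3f, sizeof dist);
    dist[s] = 0;
    for (int iter = 0; iter < n; iter++) {
        int u = -1;
        // pick the unvisited vertex with the smallest tentative distance
        for (int v = 1; v <= n; v++)
            if (!vis[v] && (u == -1 || dist[v] < dist[u])) u = v;
        vis[u] = true;
        // relax all outgoing edges of u
        for (int v = 1; v <= n; v++)
            dist[v] = min(dist[v], dist[u] + g[u][v]);
    }
    return dist[e] == INF ? -1 : dist[e];
}

int main() {
    cin >> n >> m;
    memset(g, 0x3f, sizeof g);
    while (m--) {
        int a, b, w;
        cin >> a >> b >> w;
        g[a][b] = min(g[a][b], w);   // keep only the smallest parallel edge
    }
    cout << dijkstra(1, n) << '\n';
}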

Heap-optimized code

#include <bits/stdc++.h>
#define endl '\n'
#define IOS ios::sync_with_stdio(false);cin.tie(0);cout.tie(0)
using namespace std;
const int N = 1e5 + 100;
int n, m, s, e;                     // n vertices, m edges, source s, sink e
vector<pair<int, int>> v[N];        // adjacency list: v[a] holds {b, weight}
bool vis[N];                        // whether a vertex's shortest distance is settled
int dist[N];
void dijkstra() {
	memset(dist, 0x3f3f3f3f, sizeof dist);
	dist[s] = 0;
	// min-heap keyed on the tentative distance: {distance, vertex}
	priority_queue<pair<int, int>, vector<pair<int, int>>, greater<pair<int, int>>> h;
	h.push({0, s});
	while (!h.empty()) {
		auto i = h.top();
		h.pop();
		int ve = i.second, q = i.first;
		if (vis[ve]) continue;      // stale heap entry: the vertex is already settled
		vis[ve] = 1;
		for (int j = 0; j < v[ve].size(); j++) {   // relax all outgoing edges of ve
			if (dist[v[ve][j].first] > q + v[ve][j].second) {
				dist[v[ve][j].first] = q + v[ve][j].second;
				h.push({dist[v[ve][j].first], v[ve][j].first});
			}
		}
	}
	if (dist[e] != 0x3f3f3f3f) cout << dist[e];
	else cout << -1;
}
void solve() {
	cin >> n >> m;
	s = 1;
	while (m--) {
		int a, b, k;
		cin >> a >> b >> k;
		v[a].push_back({b, k});     // directed edge a -> b with weight k
	}
	e = n;
	dijkstra();
}
signed main() {
	IOS;
	int t = 1;
	while (t--) {
		solve();
	}
}

Multi-source shortest path

Of course the multi-source shortest path problem could also be solved by running Dijkstra n times, but the amount of code is clearly larger, so there is the Floyd algorithm. Its idea is dynamic programming: write down the state-transition equation d[i][j] = min(d[i][j], d[i][k] + d[k][j]), letting k run over the vertices allowed as intermediate points.

#include <iostream>
using namespace std;

const int N = 210, INF = 1e9;

int n, m, k, x, y, z;
int d[N][N];                       // d[i][j] = current shortest distance from i to j

void floyd() {
    for (int k = 1; k <= n; k++)           // k = intermediate vertex allowed so far
        for (int i = 1; i <= n; i++)
            for (int j = 1; j <= n; j++)
                d[i][j] = min(d[i][j], d[i][k] + d[k][j]);
}

int main() {
    cin >> n >> m >> k;
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= n; j++)
            if (i == j) d[i][j] = 0;
            else d[i][j] = INF;
    while (m--) {
        cin >> x >> y >> z;
        d[x][y] = min(d[x][y], z);
        // take care to keep only the smallest parallel edge
    }
    floyd();
    while (k--) {
        cin >> x >> y;
        if (d[x][y] > INF / 2) cout << "impossible" << endl;   // with negative edges, "unreachable" may end up slightly below INF
        else cout << d[x][y] << endl;
    }
    return 0;
}

The time complexity of this algorithm is O(n^3); its advantages are that the code is short and that it computes all pairwise shortest paths in one go.

6.5.3 Topological sort

Topological sorting is an algorithm that is particularly useful for determining whether a graph contains a cycle. In textbooks it is usually applied to ordering a set of activities: first build a directed graph from the precedence relations among the activities, then repeatedly remove (and output) the vertices whose in-degree is 0; the sequence produced is an order in which the activities can be carried out. A graph used this way is called an AOV network (Activity On Vertex network).

#include<bits/stdc++.h>
#define int long long
#define endl '\n'
#define IOS ios::sync_with_stdio(0),cin.tie(0),cout.tie(0)
using namespace std;
const int INF = 0x3f3f3f3f3f3f3f3f;
const int N = 1e6 + 10;
const int mod = 1e9 + 7;
typedef pair<int, int> PII;

int n,m;                 // n vertices, m directed edges
int in[N];               // in-degree of each vertex
bool st[N];              // whether the vertex has already been pushed into the queue
int cnt;                 // number of vertices output so far
vector<int>v[N];         // adjacency list
queue<int>res;           // the topological order produced
bool top_sort() {        // returns true if the graph contains a cycle (no topological order)
	queue<int>q;
	for(int i=1;i<=n;i++) {
		if(in[i]==0) {
			q.push(i);
			st[i]=1;
		}
	}
	while(q.size()) {
		int ver=q.front();
		cnt++;
		res.push(ver);
		q.pop();
		for(auto it:v[ver]) {
			if(!st[it]) {
				in[it]--;
				if(in[it]==0) {
					q.push(it);
					st[it]=1;
				}
			}
		}
	}
	if(cnt==n) return 0;   // every vertex was output: acyclic
	else return 1;         // some vertices never reached in-degree 0: there is a cycle
}
void solve() {
	cin>>n>>m;
	for(int i=1;i<=m;i++) {
		int x,y;
		cin>>x>>y;
		in[y]++;
		v[x].push_back(y);
	}
	if(top_sort()) {
		cout<<-1<<endl;
	}
	else {
		while(res.size()) {
			cout<<res.front()<<' ';
			res.pop();
		}
	} 
}
signed main() {
	IOS;
	int t = 1;
//	cin >> t;
	for (int i = 1; i <= t; i++) {
		solve();
	}
}

The time complexity of the algorithm is O(n+m)

6.5.4 Critical Path Algorithm

Also known as the AOE network (Activity On Edge network), the critical path is another application of topological sorting. Following the topological order we compute the earliest occurrence time of each event, and following the reverse topological order we compute the latest occurrence time. When the earliest and latest times of an activity are equal, that activity has no slack and is critical: it cannot be delayed without delaying the whole project.
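
As a rough illustration of the procedure just described (names and input format are assumptions, not from the original notes), the earliest event times ve[] can be propagated along a topological order and the latest times vl[] along the reverse order; activities with zero slack form the critical path:

#include <bits/stdc++.h>
using namespace std;

int main() {
    // n events (vertices 1..n), m activities given as (u, v, w): activity u -> v with duration w;
    // a single source and a single sink are assumed
    int n, m;
    cin >> n >> m;
    vector<vector<pair<int,int>>> g(n + 1);      // g[u] = {v, w}
    vector<int> in(n + 1, 0);
    vector<array<int,3>> edges;
    for (int i = 0; i < m; i++) {
        int u, v, w;
        cin >> u >> v >> w;
        g[u].push_back({v, w});
        edges.push_back({u, v, w});
        in[v]++;
    }

    // topological order + earliest event times ve[]
    vector<int> ve(n + 1, 0), order;
    queue<int> q;
    for (int i = 1; i <= n; i++) if (in[i] == 0) q.push(i);
    while (!q.empty()) {
        int u = q.front(); q.pop();
        order.push_back(u);
        for (auto [v, w] : g[u]) {
            ve[v] = max(ve[v], ve[u] + w);
            if (--in[v] == 0) q.push(v);
        }
    }
    if ((int)order.size() < n) { cout << "cycle, no critical path\n"; return 0; }

    // latest event times vl[], filled in reverse topological order
    int total = *max_element(ve.begin() + 1, ve.end());   // project finish time
    vector<int> vl(n + 1, total);
    for (int i = (int)order.size() - 1; i >= 0; i--) {
        int u = order[i];
        for (auto [v, w] : g[u]) vl[u] = min(vl[u], vl[v] - w);
    }

    // an activity (u, v, w) is critical when its earliest start ve[u]
    // equals its latest start vl[v] - w (zero slack)
    cout << "project length: " << total << '\n';
    for (auto [u, v, w] : edges)
        if (ve[u] == vl[v] - w)
            cout << "critical activity: " << u << " -> " << v << '\n';
}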

2. Summary

The main emphasis is on the comparison between the adjacency list and the adjacency matrix.

1) Space: for an undirected graph the adjacency matrix is symmetric and can be compressed to n(n+1)/2 units, while for a directed graph it is not symmetric and needs n^2 units. An adjacency list stores n + 2m cells for an undirected graph (each edge appears in two edge lists) and n + m cells for a directed graph.

2) Finding the degree of a vertex i: with an adjacency matrix, scan the row corresponding to i, O(n); for a directed graph, out-degree/in-degree means scanning the row/column corresponding to i, O(n). With an adjacency list, scan the edge list of i, worst case O(n); for a directed graph the out-degree needs only i's edge list, worst O(n), but the in-degree needs a scan of all edge lists, worst O(n + m).

3) Finding the number of edges: with an adjacency matrix, scan the whole matrix, O(n^2). With an adjacency list, scan all edge lists via the vertex table: O(n + 2m) for an undirected graph, O(n + m) for a directed graph.

4) Deciding whether an edge (i, j) exists: with an adjacency matrix, check the value of A[i][j] directly, O(1). With an adjacency list, scan the edge list of i, worst case O(n).

5) Applicability: the adjacency matrix suits dense graphs, the adjacency list suits sparse graphs.
