This section covers the various shortest path algorithms in graph theory, examining their core ideas from different angles along with their similarities and differences.
In the previous sections we covered graph representations (adjacency matrices and the "various" adjacency lists), graph traversal (DFS and BFS), and some fundamental graph algorithms (DFS-based topological sorting of DAGs, strongly connected components of directed graphs, the Prim and Kruskal algorithms for minimum spanning trees, and so on). What remains is the family of shortest path algorithms, which is the subject of this section.
The shortest path problem comes in several varieties. For example, you can find shortest paths (just like any other kind of path) in both directed and undirected graphs. The most important distinctions, though, stem from the starting points and destinations: do we want the shortest paths from one node to all others (single source)? From one node to another (single pair, one to one, point to point)? From all nodes to one (single destination)? Or between every pair of nodes (all pairs)?

Single source and all pairs are the most important of these, and the other variants largely reduce to them. Although there are some tricks for the single pair problem (see "Meeting in the middle" and "Knowing where you're going" below), there are no guarantees that let us solve it any faster than the general single source problem, so we may as well attack it with a single source algorithm. The single destination problem is equivalent to single source: just flip the edges in the directed case. The all pairs problem can be tackled by running a single source algorithm from every node (and we will), but there are also special-purpose algorithms for it.
Before introducing the algorithms themselves, the author states a few important conclusions and properties; the original text is quoted below.
Assume that we start in node s and that we initialize D[s] to zero, while all other distance estimates are set to infinity. Let d(u,v) be the length of the shortest path from u to v.
• d(s,v) <= d(s,u) + W[u,v]. This is an example of the triangle inequality.
• d(s,v) <= D[v]. For v other than s, D[v] is initially infinite, and we reduce it only when we find actual shortcuts. We never “cheat,” so it remains an upper bound.
• If there is no path to node v, then relaxing will never get D[v] below infinity. That’s because we’ll never find any shortcuts to improve D[v].
• Assume a shortest path to v is formed by a path from s to u and an edge from u to v. Now, if D[u] is correct at any time before relaxing the edge from u to v, then D[v] is correct at all times afterward. The path defined by P[v] will also be correct.
• Let [s, a, b, … , z, v] be a shortest path from s to v. Assume all the edges (s,a), (a,b), … , (z,v) in the path have been relaxed in order. Then D[v] and P[v] will be correct. It doesn’t matter if other relax operations have been performed in between.
[This last one is the path relaxation property, and it is the core of the Bellman-Ford algorithm described below.]
If every edge has the same weight (that is, all edges are equally long), the BFS we saw earlier already solves the problem (covered in section 5 on traversal); and if the graph is a directed acyclic graph, we can use the DAG shortest path algorithm from the dynamic programming material (covered in section 8). Real-world graphs, however, tend to have cycles, edge weights that differ, and possibly even negative weights, so we need other algorithms!
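As a quick refresher, here is a minimal sketch of that unit-weight case (my own recap of the traversal material, not code from the book): in a plain BFS, the first time a node is discovered is also the moment its shortest distance is known.

from collections import deque

def bfs_distances(G, s):                  # G: adjacency dict (node -> iterable)
    D = {s: 0}                            # Distance 0 to the source itself
    Q = deque([s])                        # FIFO queue drives the traversal
    while Q:
        u = Q.popleft()
        for v in G[u]:
            if v not in D:                # First discovery = shortest distance
                D[v] = D[u] + 1
                Q.append(v)
    return D

G = {0: {1, 2}, 1: {3}, 2: {3}, 3: {4}, 4: set()}
print(bfs_distances(G, 0))                # {0: 0, 1: 1, 2: 1, 3: 2, 4: 3}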
First, let's implement the relaxation technique we learned earlier. In the code, D stores each node's distance estimate from the source (an upper bound), P stores each node's predecessor on its shortest path, and W stores the edge weights, with nonexistent edges having weight inf. Relaxation works like this: suppose nodes u and v each already have a shortest-distance estimate (7 and 13 in the test code). To relax the edge (u,v), that is, to reach node v from node u via edge (u,v), we compare the distance estimate this route gives node v (7+3=10) with v's current estimate (13). If the former is smaller, we abandon the previously chosen shortest path from the source to v in favor of going from the source to u and then from u to v, which is shorter. That is one relaxation! (In the test code, 10 < 13, so the relaxation happens: D[v] becomes 10 and v's predecessor becomes u.)
# relaxation
inf = float('inf')

def relax(W, u, v, D, P):
    d = D.get(u, inf) + W[u][v]     # Possible shortcut estimate
    if d < D.get(v, inf):           # Is it really a shortcut?
        D[v], P[v] = d, u           # Update estimate and parent
        return True                 # There was a change!

# Test code
u = 0; v = 1
D, W, P = {}, {u: {v: 3}}, {}
D[u] = 7
D[v] = 13
print(D[u])                         # 7
print(D[v])                         # 13
print(W[u][v])                      # 3
print(relax(W, u, v, D, P))         # True
print(D[v])                         # 10
D[v] = 8
print(relax(W, u, v, D, P))         # None: 8 < 7 + 3, so no change
print(D[v])                         # 8
Clearly, if you relax edges at random, the distance estimates of the nodes involved slowly become more accurate, and these improvements propagate through the graph. If we keep relaxing until no node's distance value changes anymore, we end up with the shortest path values from the source to every node.
Each relaxation can be seen as one "step" toward the final solution. Naturally we want as few relaxations as possible, so the key is deciding how many times to relax and in what order (a good relaxation order lets us march straight toward the optimum and cuts the running time). The Bellman-Ford algorithm, Dijkstra's algorithm, and the DAG shortest path algorithm discussed below all differ precisely in this respect.
Now consider a question: what happens if we relax every edge in the graph once? Some nodes' distance estimates may decrease, right? And if we then relax every edge once more? Some estimates may drop again. So when do the improvements stop? When can we quit?
Think of it this way. Suppose the shortest path from the source s to node v is p = <v0, v1, v2, ..., vk>, with v0 = s and vk = v. Besides the source s, this path passes through k other vertices, and k is certainly at most V-1; in other words, the path from s to v has at most V-1 edges. Since each pass relaxes every edge in the graph, it certainly relaxes every edge of path p, so in the worst case we may assume that it is the i-th pass that relaxes edge (v_{i-1}, v_i). By the path relaxation property, after k passes we are guaranteed to have v's correct shortest path value, and since the path has at most V-1 edges, at most V-1 passes over all the edges give every node its shortest path value! That idea is exactly the Bellman-Ford algorithm, with time complexity O(VE).
Below is the illustration of the Bellman-Ford algorithm from Introduction to Algorithms.
[Explanation of the figure above. Note that a different relaxation order can produce different intermediate results, but the final result is always the same: The execution of the Bellman-Ford algorithm. The source is vertex s. The d values are shown within the vertices, and shaded edges indicate predecessor values: if edge (u, v) is shaded, then π[v] = u. In this particular example, each pass relaxes the edges in the order (t, x), (t, y), (t, z), (x, t), (y, x), (y, z), (z, x), (z, s), (s, t), (s, y). (a) The situation just before the first pass over the edges. (b)-(e) The situation after each successive pass over the edges. The d and π values in part (e) are the final values. The Bellman-Ford algorithm returns TRUE in this example.]
The analysis above is fine, but it overlooks a key issue: if the graph contains a negative-weight cycle, some nodes' shortest path values will keep decreasing no matter how many passes we make. So after the V-1 relaxation passes we make one more pass; if any node's value still decreases, the graph contains a negative-weight cycle! This gives Bellman-Ford an important second job: detecting whether a graph contains a negative-weight cycle.
# Bellman-Ford algorithm
def bellman_ford(G, s):
    D, P = {s: 0}, {}                          # Zero-dist to s; no parents
    for rnd in G:                              # n = len(G) rounds
        changed = False                        # No changes in round so far
        for u in G:                            # For every from-node...
            for v in G[u]:                     # ... and its to-nodes...
                if relax(G, u, v, D, P):       # Shortcut to v from u?
                    changed = True             # Yes! So something changed
        if not changed: break                  # No change in round: Done
    else:                                      # Not done before round n?
        raise ValueError('negative cycle')     # Negative cycle detected
    return D, P                                # Otherwise: D and P correct

# Test code
s, t, x, y, z = range(5)
W = {
    s: {t: 6, y: 7},
    t: {x: 5, y: 8, z: -4},
    x: {t: -2},
    y: {x: -3, z: 9},
    z: {s: 2, x: 7}
}
D, P = bellman_ford(W, s)
print([D[v] for v in [s, t, x, y, z]])              # [0, 2, 4, 7, -2]
print(s not in P)                                   # True
print([P[v] for v in [t, x, y, z]] == [x, y, s, t]) # True
W[s][t] = -100
print(bellman_ford(W, s))
# Traceback (most recent call last):
#   ...
# ValueError: negative cycle
Earlier, in the dynamic programming section, we saw a shortest path algorithm for DAGs with time complexity O(V+E). Let's quickly revisit its iterative version from the relaxation point of view. Because it topologically sorts the vertices first, it is a textbook example of speeding things up by changing the order in which edges are relaxed: instead of relaxing at random, or relaxing every edge in repeated passes, we follow the topological order of the nodes and, on reaching each node, relax its out-edges. Why does this work?
Again suppose the shortest path from the source s to node v is p = <v0, v1, v2, ..., vk>, with v0 = s and vk = v. When we reach node v, every node between s and v on this path has already been processed (the nodes are in topological order, remember), and all of their out-edges have already been relaxed. By the path relaxation property, the moment we reach node v we already have the correct shortest path value from s to v.
[Explanation of the figure above: The execution of the algorithm for shortest paths in a directed acyclic graph. The vertices are topologically sorted from left to right. The source vertex is s. The d values are shown within the vertices, and shaded edges indicate the π values. (a) The situation before the first iteration of the for loop of lines 3-5. (b)-(g) The situation after each iteration of the for loop of lines 3-5. The newly blackened vertex in each iteration was used as u in that iteration. The values shown in part (g) are the final values.]
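For reference, here is a compact iterative sketch of that algorithm (my own reconstruction in the spirit of the Chapter 8 version, with a small DFS-based topsort helper; the names are mine, not the book's exact code).

def topsort(G):                            # Topological sort via DFS finish times
    seen, order = set(), []
    def dfs(u):
        seen.add(u)
        for v in G[u]:
            if v not in seen: dfs(v)
        order.append(u)                    # All of u's descendants are done
    for u in G:
        if u not in seen: dfs(u)
    return reversed(order)                 # Reverse finish order = topological order

def dag_sp(W, s):                          # Shortest paths from s in a DAG
    D = {u: float('inf') for u in W}       # Distance estimates (upper bounds)
    D[s] = 0
    for u in topsort(W):                   # Visit nodes in topological order...
        for v in W[u]:                     # ... and relax each one's out-edges
            D[v] = min(D[v], D[u] + W[u][v])
    return D

W = {0: {1: 2, 2: 6}, 1: {2: 3}, 2: {3: 1}, 3: {}}
print(dag_sp(W, 0))                        # {0: 0, 1: 2, 2: 5, 3: 6}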
Next up is Dijkstra's algorithm. It looks a lot like Prim's algorithm and is likewise greedy: at each step it picks the "fringe node" with the smallest distance estimate (the edge's other endpoint already being inside the included set) and relaxes along that edge. Why does this work too? Because Introduction to Algorithms gives a complete proof; go look it up if you don't believe me! Just kidding; if pointing at a proof were enough, there would be no need for this article. The real reason is that Dijkstra's algorithm has a DAG shortest path algorithm hidden inside it, and we covered shortest paths on DAGs above. On reflection the difference isn't hard to spot: they relax edges in different orders. The DAG algorithm topologically sorts first and then relaxes, while Dijkstra greedily picks the next edge to relax as it goes. So why does Dijkstra's algorithm hide a DAG?
[I thought for a long time about how to explain this, but the original text is simply too good, and I doubt my limited skill would make it any clearer, so I quote it here. The author first explains why the edge relaxation order in the DAG shortest path algorithm is tied to topological sorting, then explains that (in Dijkstra's algorithm) the next node to be added (to the set of included nodes) must already have a correct distance estimate, and finally explains that this node must be the one with the smallest distance estimate. Everything falls into place, under one crucial precondition: no negative edge weights!]
Figure 9-1, which the author's explanation below refers to:
To get things started, we can imagine that we already know the distances from the start node to each of the others. We don't, of course, but this imaginary situation can help our reasoning. Imagine ordering the nodes, left to right, based on their distance. What happens? For the general case—not much. However, we're assuming that we have no negative edge weights, and that makes all the difference.
Because all edges are positive, the only nodes that can contribute to a node’s solution will lie to its left in our hypothetical ordering. It will be impossible to locate a node to the right that will help us find a shortcut, because this node is further away, and could only give us a shortcut if it had a negative back edge. The positive back edges are completely useless to us, and aren’t part of the problem structure. What remains, then, is a DAG, and the topological ordering we’d like to use is exactly the hypothetical ordering we started with: nodes sorted by their actual distance. See Figure 9-1 for an illustration of this structure. (I’ll get back to the question marks in a minute.)
Predictably enough, we now hit the major gap in the solution: it’s totally circular. In uncovering the basic problem structure (decomposing into subproblems or finding the hidden DAG), we’ve assumed that we’ve already solved the problem. The reasoning has still been useful, though, because we now have something specific to look for. We want to find the ordering—and we can find it with our trusty workhorse, induction!
Consider, again, Figure 9-1. Assume that the highlighted node is the one we’re trying to identify in our inductive step (meaning that the earlier ones have been identified and already have correct distance estimates). Just like in the ordinary DAG shortest path problem, we’ll be relaxing all out-edges for each node, as soon as we’ve identified it and determined its correct distance. That means that we’ve relaxed the edges out of all earlier nodes. We haven’t relaxed the out-edges of later nodes, but as discussed, they can’t matter: the distance estimates of these later nodes are upper bounds, and the back-edges have positive weights, so there’s no way they can contribute to a shortcut.
This means (by the earlier relaxation properties or the discussion of the DAG shortest path algorithm in Chapter 8) that the next node must have a correct distance estimate. That is, the highlighted node in Figure 9-1 must by now have received its correct distance estimate, because we’ve relaxed all edges out of the first three nodes. This is very good news, and all that remains is to figure out which node it is. We still don’t really know what the ordering is, remember? We’re figuring out the topological sorting as we go along, step by step.
There is only one node that could possibly be the next one, of course: the one with the lowest distance estimate. We know it's next in the sorted order, and we know it has a correct estimate; because these estimates are upper bounds, none of the later nodes could possibly have lower estimates. Cool, no? And now, by induction, we've solved the problem. We just relax all the out-edges of each node in distance order—which means always taking the one with the lowest estimate next.
Below is the illustration of Dijkstra's algorithm from Introduction to Algorithms, for reference.
[Explanation of the figure above: The execution of Dijkstra's algorithm. The source s is the leftmost vertex. The shortest-path estimates are shown within the vertices, and shaded edges indicate predecessor values. Black vertices are in the set S, and white vertices are in the min-priority queue Q = V – S. (a) The situation just before the first iteration of the while loop of lines 4-8. The shaded vertex has the minimum d value and is chosen as vertex u in line 5. (b)-(f) The situation after each successive iteration of the while loop. The shaded vertex in each part is chosen as vertex u in line 5 of the next iteration. The d and π values shown in part (f) are the final values.]
Here is an implementation of Dijkstra's algorithm.
# Dijkstra's algorithm
from heapq import heappush, heappop

def dijkstra(G, s):
    D, P, Q, S = {s: 0}, {}, [(0, s)], set()   # Est., tree, queue, visited
    while Q:                                   # Still unprocessed nodes?
        _, u = heappop(Q)                      # Node with lowest estimate
        if u in S: continue                    # Already visited? Skip it
        S.add(u)                               # We've visited it now
        for v in G[u]:                         # Go through all its neighbors
            relax(G, u, v, D, P)               # Relax the out-edge
            heappush(Q, (D[v], v))             # Add to queue, w/est. as pri
    return D, P                                # Final D and P returned

# Test code
s, t, x, y, z = range(5)
W = {
    s: {t: 10, y: 5},
    t: {x: 1, y: 2},
    x: {z: 4},
    y: {t: 3, x: 9, z: 2},
    z: {x: 6, s: 7}
}
D, P = dijkstra(W, s)
print([D[v] for v in [s, t, x, y, z]])              # [0, 8, 9, 5, 7]
print(s not in P)                                   # True
print([P[v] for v in [t, x, y, z]] == [y, t, s, y]) # True
Dijkstra's algorithm looks much like Prim's algorithm, and much like BFS too. In fact, if we imagine each edge (u,v) of weight w as having (w-1) extra nodes between u and v, forming a path of unit-weight edges, then BFS behaves essentially like Dijkstra's algorithm; the sketch below plays this idea out. Dijkstra's time complexity depends on the priority queue used. The implementation above uses a binary min-heap, so the running time is O(m lg n), where m is the number of edges and n is the number of nodes.
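To make the edge-subdivision picture concrete, here is a little experiment of my own (not from the book; it assumes positive integer weights): blow each edge of weight w up into a chain of w unit edges and let plain BFS reproduce Dijkstra's answers on the test graph above.

from collections import deque

def bfs_unit_distances(G, s):
    U = {}                                    # Unit-weight expansion of G
    for u in G:
        for v, w in G[u].items():
            prev = u
            for i in range(1, w):             # Insert w - 1 dummy nodes
                dummy = (u, v, i)
                U.setdefault(prev, set()).add(dummy)
                prev = dummy
            U.setdefault(prev, set()).add(v)  # Last hop reaches the real v
    D, Q = {s: 0}, deque([s])                 # Plain BFS over the unit graph
    while Q:
        u = Q.popleft()
        for v in U.get(u, ()):
            if v not in D:
                D[v] = D[u] + 1
                Q.append(v)
    return {u: D[u] for u in G if u in D}     # Keep only the real nodes

s, t, x, y, z = range(5)
W = {s: {t: 10, y: 5}, t: {x: 1, y: 2}, x: {z: 4},
     y: {t: 3, x: 9, z: 2}, z: {x: 6, s: 7}}
print(bfs_unit_distances(W, s))               # {0: 0, 1: 8, 2: 9, 3: 5, 4: 7}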
Next, let's look at the all pairs shortest path problem.
For all pairs shortest paths, the first idea that comes to mind is simply to run Dijkstra's algorithm once from every node. But Dijkstra has a precondition: all edge weights must be nonnegative. What about graphs that contain negative edges? We can preprocess the graph so that every edge weight becomes nonnegative. How? Take another look at the triangle inequality property quoted earlier:
d(s,v) <= d(s,u) + W[u,v]. This is an example of the triangle inequality.
Let h(u) = d(s,u) and h(v) = d(s,v), and suppose we reweight edge (u,v) as w'(u,v) = w(u,v) + h(u) - h(v). By the triangle inequality, w'(u,v) is guaranteed to be nonnegative; for example (with numbers of my own choosing), if h(u) = 2, h(v) = 5 and w(u,v) = 4, then w'(u,v) = 4 + 2 - 5 = 1, and since h(v) <= h(u) + w(u,v) the result can never drop below zero. The edges of the new graph thus satisfy Dijkstra's precondition. But how do we obtain each node's shortest path value d(s,v) in the first place?
That part is easy: the Bellman-Ford algorithm introduced earlier does exactly this job. But what is the source s? The solution here is rather neat: add a new vertex s to the graph, connect it to every other node with edges of weight 0, and then run Bellman-Ford from s on the new graph; this yields every node's shortest path value d(s,v). But now another question arises: is this modification really safe? Are the resulting shortest paths correct?
The explanation here is even more interesting. Consider any path p = <s, v1, v2, v3, ..., u, v> from the source s to node v, and add up the reweighted edge weights along it. Something nice happens (a telescoping sum): the h values of all intermediate nodes cancel out:

w'(p) = [w(s,v1) + h(s) - h(v1)] + [w(v1,v2) + h(v1) - h(v2)] + ... + [w(u,v) + h(u) - h(v)] = w(p) + h(s) - h(v)

So every path from s to v has its weight shifted by the same amount, h(s) - h(v). A shortest path in the new graph therefore corresponds exactly to a shortest path in the original graph, only with its weight shifted, which shows that the strategy above yields correct shortest paths.
Let's now gather the threads. First we run Bellman-Ford once to obtain every node's shortest path value h(v); then we use these values to reweight the graph's edges; finally we run Dijkstra from every node, and the all pairs shortest path problem is solved. (If the original edge weights are already all nonnegative, we can of course skip straight to running Dijkstra.) This is Johnson's algorithm, a clever combination of Bellman-Ford and Dijkstra for the all pairs shortest path problem. It is particularly well suited to sparse graphs, where its O(mn lg n) running time beats the Floyd-Warshall algorithm described below.
One more thing needs patching up: after the Dijkstra runs, we must adjust the results to recover the true shortest path weights. From the equations above, the shortest path D'(u,v) between nodes u and v in the reweighted graph relates to the original shortest path D(u,v) by the left-hand equation below, so the simple rearrangement on the right recovers the original values:

D'(u,v) = D(u,v) + h(u) - h(v)  ==>  D(u,v) = D'(u,v) - h(u) + h(v)
Based on this line of thought, we arrive at the following implementation of Johnson's algorithm.
# Johnson's algorithm
from copy import deepcopy

def johnson(G):                                # All pairs shortest paths
    G = deepcopy(G)                            # Don't want to break original
    s = object()                               # Guaranteed unique node
    G[s] = {v: 0 for v in G}                   # Edges from s have zero wgt
    h, _ = bellman_ford(G, s)                  # h[v]: Shortest dist from s
    del G[s]                                   # No more need for s
    for u in G:                                # The weight from u...
        for v in G[u]:                         # ... to v...
            G[u][v] += h[u] - h[v]             # ... is adjusted (nonneg.)
    D, P = {}, {}                              # D[u][v] and P[u][v]
    for u in G:                                # From every u...
        D[u], P[u] = dijkstra(G, u)            # ... find the shortest paths
        for v in G:                            # For each destination...
            D[u][v] += h[v] - h[u]             # ... readjust the distance
    return D, P                                # These are two-dimensional

# Test code
a, b, c, d, e = range(5)
W = {
    a: {c: 1, d: 7},
    b: {a: 4},
    c: {b: -5, e: 2},
    d: {c: 6},
    e: {a: 3, b: 8, d: -4}
}
D, P = johnson(W)
print([D[a][v] for v in [a, b, c, d, e]])      # [0, -4, 1, -1, 3]
print([D[b][v] for v in [a, b, c, d, e]])      # [4, 0, 5, 3, 7]
print([D[c][v] for v in [a, b, c, d, e]])      # [-1, -5, 0, -2, 2]
print([D[d][v] for v in [a, b, c, d, e]])      # [5, 1, 6, 0, 8]
print([D[e][v] for v in [a, b, c, d, e]])      # [1, -3, 2, -4, 0]
Next we look at the Floyd-Warshall algorithm, a dynamic programming algorithm with time complexity O(n³), where n is the number of nodes in the graph.
Assume all nodes are numbered starting from 1. We reduce the original problem to subproblems with three parameters: the start node u, the end node v, and the largest node number k that may be used along the way; that is, the shortest path from u to v that may only pass through nodes numbered (1, 2, 3, ..., k) as intermediates. (The original formulation is as follows.)
Let d(u, v, k) be the length of the shortest path that exists from node u to node v if you’re only allowed to use the k first nodes as intermediate nodes.
How do we handle this subproblem? With the familiar use-it-or-skip-it strategy from dynamic programming. If we choose not to pass through node k, the problem becomes the shortest path from u to v using only nodes (1, 2, 3, ..., k-1) as intermediates; if we do pass through node k, it becomes the sum of two subproblems, the shortest path from u to k and the shortest path from k to v, each using only nodes (1, 2, 3, ..., k-1) as intermediates, as illustrated in the figure below.
This analysis yields the following recurrence:
d(u,v,k) = min(d(u,v,k-1), d(u,k,k-1) + d(k,v,k-1))
From this formula we can quickly write down the recursive implementation below.
# Recursive Floyd-Warshall
from functools import wraps

def memo(func):
    cache = {}                                 # Stored subproblem solutions
    @wraps(func)                               # Make wrap look like func
    def wrap(*args):                           # The memoized wrapper
        if args not in cache:                  # Not already computed?
            cache[args] = func(*args)          # Compute & cache the solution
        return cache[args]                     # Return the cached solution
    return wrap                                # Return the wrapper

def rec_floyd_warshall(G):                     # All shortest paths
    @memo                                      # Store subsolutions
    def d(u, v, k):                            # u to v via 1..k
        if k == 0: return G[u][v]              # Assumes v in G[u]
        return min(d(u, v, k-1), d(u, k, k-1) + d(k, v, k-1))  # Use k or not?
    return {(u, v): d(u, v, len(G)) for u in G for v in G}     # D[u,v] = d(u,v,n)

# Test code
a, b, c, d, e = range(1, 6)                    # One-based
W = {
    a: {c: 1, d: 7},
    b: {a: 4},
    c: {b: -5, e: 2},
    d: {c: 6},
    e: {a: 3, b: 8, d: -4}
}
for u in W:
    for v in W:
        if u == v: W[u][v] = 0
        if v not in W[u]: W[u][v] = inf
D = rec_floyd_warshall(W)
print([D[a, v] for v in [a, b, c, d, e]])      # [0, -4, 1, -1, 3]
print([D[b, v] for v in [a, b, c, d, e]])      # [4, 0, 5, 3, 7]
print([D[c, v] for v in [a, b, c, d, e]])      # [-1, -5, 0, -2, 2]
print([D[d, v] for v in [a, b, c, d, e]])      # [5, 1, 6, 0, 8]
print([D[e, v] for v in [a, b, c, d, e]])      # [1, -3, 2, -4, 0]
Looking closely, this solution is strikingly similar to the longest common subsequence problem from the dynamic programming material; if you haven't read it yet, see the article on the five implementations of the LCS problem. With LCS in mind, it's easy to see that Floyd-Warshall admits the same kind of space reduction, once we first turn the recursive version into the better-performing iterative one.
The recurrence for the iterative Floyd-Warshall algorithm is

D_k[u][v] = min(D_{k-1}[u][v], D_{k-1}[u][k] + D_{k-1}[k][v])

It shows that round k only needs the results of round k-1, so if the application doesn't need the intermediate results, two n×n matrices suffice: one holding the current round's results D_k, and one holding the previous round's D_{k-1}, copied over once the round finishes. This brings the space down to O(n²). (In fact, the implementation below gets away with a single matrix updated in place: during round k the entries D[u][k] and D[k][v] never change, because D[k][k] = 0.)
# Space-optimized Floyd-Warshall
def floyd_warshall1(G):
    D = deepcopy(G)                            # No intermediates yet
    for k in G:                                # Look for shortcuts with k
        for u in G:
            for v in G:
                D[u][v] = min(D[u][v], D[u][k] + D[k][v])
    return D

# Test code
a, b, c, d, e = range(1, 6)                    # One-based
W = {
    a: {c: 1, d: 7},
    b: {a: 4},
    c: {b: -5, e: 2},
    d: {c: 6},
    e: {a: 3, b: 8, d: -4}
}
for u in W:
    for v in W:
        if u == v: W[u][v] = 0
        if v not in W[u]: W[u][v] = inf
D = floyd_warshall1(W)
print([D[a][v] for v in [a, b, c, d, e]])      # [0, -4, 1, -1, 3]
print([D[b][v] for v in [a, b, c, d, e]])      # [4, 0, 5, 3, 7]
print([D[c][v] for v in [a, b, c, d, e]])      # [-1, -5, 0, -2, 2]
print([D[d][v] for v in [a, b, c, d, e]])      # [5, 1, 6, 0, 8]
print([D[e][v] for v in [a, b, c, d, e]])      # [1, -3, 2, -4, 0]
Of course, in practice we usually want the shortest paths themselves, not just their lengths. For that, we only need to record a predecessor whenever we choose a shortcut.
# Final Floyd-Warshall, with predecessors
def floyd_warshall(G):
    D, P = deepcopy(G), {}
    for u in G:
        for v in G:
            if u == v or G[u][v] == inf:
                P[u, v] = None                 # No (useful) edge from u to v
            else:
                P[u, v] = u                    # Direct edge: u precedes v
    for k in G:
        for u in G:
            for v in G:
                shortcut = D[u][k] + D[k][v]
                if shortcut < D[u][v]:
                    D[u][v] = shortcut
                    P[u, v] = P[k, v]          # v's predecessor on the k-path
    return D, P

# Test code
a, b, c, d, e = range(5)
W = {
    a: {c: 1, d: 7},
    b: {a: 4},
    c: {b: -5, e: 2},
    d: {c: 6},
    e: {a: 3, b: 8, d: -4}
}
for u in W:
    for v in W:
        if u == v: W[u][v] = 0
        if v not in W[u]: W[u][v] = inf
D, P = floyd_warshall(W)
print([D[a][v] for v in [a, b, c, d, e]])      # [0, -4, 1, -1, 3]
print([D[b][v] for v in [a, b, c, d, e]])      # [4, 0, 5, 3, 7]
print([D[c][v] for v in [a, b, c, d, e]])      # [-1, -5, 0, -2, 2]
print([D[d][v] for v in [a, b, c, d, e]])      # [5, 1, 6, 0, 8]
print([D[e][v] for v in [a, b, c, d, e]])      # [1, -3, 2, -4, 0]
print([P[a, v] for v in [a, b, c, d, e]])      # [None, 2, 0, 4, 2]
print([P[b, v] for v in [a, b, c, d, e]])      # [1, None, 0, 4, 2]
print([P[c, v] for v in [a, b, c, d, e]])      # [1, 2, None, 4, 2]
print([P[d, v] for v in [a, b, c, d, e]])      # [1, 2, 3, None, 2]
print([P[e, v] for v in [a, b, c, d, e]])      # [1, 2, 3, 4, None]
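P stores, for each pair (u, v), the predecessor of v on a shortest path from u. A small helper of my own (not part of the book's code) turns that into an explicit path:

def reconstruct_path(P, u, v):            # Walk predecessors back from v
    if P[u, v] is None and u != v:        # No path from u to v at all
        return None
    path = [v]
    while v != u:
        v = P[u, v]                       # Step to v's predecessor
        path.append(v)
    return path[::-1]                     # Reverse: runs from u to v

print(reconstruct_path(P, a, d))          # [0, 2, 4, 3], i.e. a -> c -> e -> d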
[When introducing the all pairs shortest path problem, Introduction to Algorithms first presents another dynamic programming solution. Its time complexity is higher, and even with the repeated squaring technique it remains comparatively poor, so it is not covered here. Interestingly, though, the book compares that algorithm with matrix multiplication and finds the two strikingly similar; in the same vein, the Bellman-Ford algorithm we started with has much in common with matrix-vector multiplication. Explore this yourself, or read Introduction to Algorithms, if you're interested.]
At the end of the chapter, the author offers two more tricks for attacking shortest path problems: "Meeting in the middle" and "Knowing where you're going". Both are fairly hard to translate and to digest, so reading the original text is recommended.
(1) Meeting in the middle
In short, search from both ends at once. Dijkstra's algorithm sets out from node u looking for the shortest path to node v; if instead the search proceeds from both nodes simultaneously, a path is found once the two searches reach a common node. This tends to be more efficient than searching from one direction only; the figure below gives an illustration.
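The book gives no code for this trick, so the following is a rough sketch of my own of bidirectional Dijkstra (assuming nonnegative weights). It grows a forward search from s and a backward search from t over the edge-reversed graph, and uses the standard stopping rule: halt once the two frontiers' smallest estimates together can no longer beat the best meeting point found so far.

from heapq import heappush, heappop
inf = float('inf')

def bidirectional_dijkstra(G, s, t):
    G_rev = {u: {} for u in G}                    # Edge-reversed copy of G
    for u in G:
        for v, w in G[u].items():
            G_rev[v][u] = w
    graphs = [G, G_rev]                           # Forward / backward graphs
    D = [{s: 0}, {t: 0}]                          # Distance estimates per side
    Q = [[(0, s)], [(0, t)]]                      # One priority queue per side
    done = [set(), set()]                         # Settled nodes per side
    best = inf                                    # Best s-t length so far
    while Q[0] and Q[1]:
        if Q[0][0][0] + Q[1][0][0] >= best:       # Frontiers can't beat best
            break
        i = 0 if Q[0][0][0] <= Q[1][0][0] else 1  # Expand the cheaper side
        d, u = heappop(Q[i])
        if u in done[i]: continue                 # Skip stale heap entries
        done[i].add(u)
        for v, w in graphs[i][u].items():
            if d + w < D[i].get(v, inf):          # The usual relaxation
                D[i][v] = d + w
                heappush(Q[i], (d + w, v))
            if v in D[1 - i]:                     # The two searches meet at v
                best = min(best, D[i][v] + D[1 - i][v])
    return best                                   # inf if t is unreachable

s, t, x, y, z = range(5)
W = {s: {t: 10, y: 5}, t: {x: 1, y: 2}, x: {z: 4},
     y: {t: 3, x: 9, z: 2}, z: {x: 6, s: 7}}
print(bidirectional_dijkstra(W, s, x))            # 9, matching Dijkstra above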
(2) Knowing where you're going
Here the author introduces the famous A* algorithm, which is essentially the best-first search used in the branch and bound strategy.
By now you’ve seen that the basic idea of traversal is pretty versatile, and by simply using different queues, you get several useful algorithms. For example, for FIFO and LIFO queues, you get BFS and DFS, and with the appropriate priorities, you get the core of Prim’s and Dijkstra’s algorithms. The algorithm described in this section, called A*, extends Dijkstra’s, by tweaking the priority once again.
As mentioned earlier, the A* algorithm uses an idea similar to Johnson’s algorithm, although for a different purpose. Johnson’s algorithm transforms all edge weights to ensure they’re positive, while ensuring that the shortest paths are still shortest. In A*, we want to modify the edges in a similar fashion, but this time the goal isn’t to make the edges positive—we’re assuming they already are (as we’re building on Dijkstra’s algorithm). No, what we want is to guide the traversal in the right direction, by using information of where we’re going: we want to make edges moving away from our target node more expensive than those that take us closer to it.
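As a rough sketch of my own (not the book's code): run Dijkstra, but order the queue by D[u] + h(u), where h(u) is an estimate of the remaining distance to the target t that never overestimates (and, for the early exit below to be safe, is consistent: h(u) <= w(u, v) + h(v)).

from heapq import heappush, heappop
inf = float('inf')

def a_star(G, s, t, h):                       # h(v): optimistic dist. to t
    D, Q, S = {s: 0}, [(h(s), s)], set()      # Priority = D[u] + h(u)
    while Q:
        _, u = heappop(Q)
        if u == t: return D[t]                # Target settled: we're done
        if u in S: continue                   # Skip stale heap entries
        S.add(u)
        for v, w in G[u].items():
            if D[u] + w < D.get(v, inf):      # The usual Dijkstra relaxation
                D[v] = D[u] + w
                heappush(Q, (D[v] + h(v), v)) # ... but biased toward t
    return inf                                # t unreachable

# Toy grid-like example: nodes are (x, y) points, edges step right/up,
# and the Manhattan distance to t is a valid (never overestimating) h.
G = {(0, 0): {(1, 0): 1, (0, 1): 3},
     (1, 0): {(1, 1): 1},
     (0, 1): {(1, 1): 1},
     (1, 1): {}}
t = (1, 1)
h = lambda v: abs(t[0] - v[0]) + abs(t[1] - v[1])
print(a_star(G, (0, 0), t, h))                # 2, via (0,0) -> (1,0) -> (1,1)

With h = 0 everywhere this degenerates to plain Dijkstra; the better h approximates the true remaining distance, the fewer nodes get expanded.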
Exercise: the currency exchange problem, from Introduction to Algorithms, problem 24-3.
In short: given the exchange rates between various currencies, does there exist a cycle of exchanges that ends up with a profit? [Hint: the Bellman-Ford algorithm.]
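A sketch of the intended reduction (my reading of the hint, not CLRS's worked solution): a sequence of exchanges is profitable exactly when the product of its rates exceeds 1, and taking weights w(u, v) = -log r(u, v) turns that into a negative-weight cycle, which the bellman_ford above already detects.

from math import log

def arbitrage_exists(R):                       # R[u][v]: exchange rate u -> v
    G = {u: {v: -log(r) for v, r in R[u].items()} for u in R}
    source = object()                          # Super-source reaching all nodes
    G[source] = {v: 0 for v in R}
    try:
        bellman_ford(G, source)                # Raises on a negative cycle...
        return False
    except ValueError:                         # ... i.e. a profitable loop
        return True

usd, eur, gbp = range(3)
R = {usd: {eur: 0.9, gbp: 0.8},
     eur: {gbp: 0.9, usd: 1.1},
     gbp: {usd: 1.3, eur: 1.1}}
print(arbitrage_exists(R))                     # True: usd -> gbp -> usd = 1.04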