



Turing machine model: A Turing machine is a simple (abstract) device that can read from, write to, and move along an infinitely long strip of paper. The actual behavior of the machines varies. Each is a so-called finite state machine: it has a finite set of states (some of which indicate that it has finished), and every symbol it reads potentially triggers reading and/or writing and switching to a different state. You can think of this machinery as a set of rules. (“If I am in state 4 and see an X, I move one step to the left, write a Y, and switch to state 9.”)
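To make the "set of rules" idea concrete, here is a minimal sketch of a rule-driven machine. The function name `run_turing_machine`, the rule format `(state, symbol) -> (symbol to write, head movement, next state)`, and the bit-flipping example are all assumptions made for illustration; this sketch also writes before it moves, which is one common convention.

```python
from collections import defaultdict


def run_turing_machine(rules, tape, state, halt_states, blank="_", max_steps=1000):
    """Run a rule table on a tape; return (final state, tape contents)."""
    # The tape is unbounded in both directions: unseen cells read as blank.
    cells = defaultdict(lambda: blank, enumerate(tape))
    pos = 0
    for _ in range(max_steps):  # finite cap, so a bad rule table cannot loop forever
        if state in halt_states:
            break
        symbol = cells[pos]
        write, move, state = rules[state, symbol]
        cells[pos] = write
        pos += move  # -1 = left, +1 = right, 0 = stay
    if not cells:
        return state, ""
    contents = "".join(cells[i] for i in range(min(cells), max(cells) + 1))
    return state, contents.strip(blank)


# Hypothetical rule table: flip every bit, halt at the first blank cell.
flip_rules = {
    ("scan", "0"): ("1", +1, "scan"),
    ("scan", "1"): ("0", +1, "scan"),
    ("scan", "_"): ("_", 0, "done"),
}

final_state, output = run_turing_machine(flip_rules, "0110", "scan", {"done"})
print(final_state, output)  # done 1001
```

Each dictionary entry plays the role of one "if I am in state s and see x" rule from the description above.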

RAM model (random-access machine): a standard single-core computer, which has roughly the following three properties:

• We don’t have access to any form of concurrent execution; the machine simply executes one instruction after the other.


• Standard, basic operations (such as arithmetic, comparisons, and memory access) all take constant (although possibly different) amounts of time. There are no more complicated basic operations (such as sorting).


• One computer word (the size of a value that we can work with in constant time) is not unlimited but is big enough to address all the memory locations used to represent our problem, plus an extra percentage for our variables.


The essence of an algorithm: An algorithm is a procedure, consisting of a finite set of steps (possibly including loops and conditionals) that solves a given problem in finite time.

The notion of running time complexity (as described in the next section) is based on knowing how big a problem instance is, and that size is simply the amount of memory needed to encode it.




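As Introduction to Algorithms explains, the three symbols can be read roughly as follows: O = ≤, Ω = ≥, Θ = =. That is, O gives an asymptotic upper bound, Ω an asymptotic lower bound, and Θ a tight bound.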


Several common running times, with example algorithms (see also the Wikipedia article on time complexity)


(1)Tip 1: If possible, don’t worry about it.


(2)Tip 2: For timing things, use timeit.
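As a small illustration of this tip, the sketch below times a made-up function (`sum_squares` is a hypothetical stand-in for whatever you want to measure) using `timeit.repeat` from the standard library.

```python
import timeit


def sum_squares(n=1000):
    """Hypothetical workload: any callable you want to time goes here."""
    return sum(i * i for i in range(n))


# Run the callable 200 times per measurement, and take 5 measurements.
calls_per_run = 200
timings = timeit.repeat(sum_squares, number=calls_per_run, repeat=5)

# The minimum is usually the least noisy figure; higher values mostly
# reflect interference from other processes.
best = min(timings)
print(f"best of 5: {best / calls_per_run * 1e6:.2f} microseconds per call")
```

The same module can also be run from the command line with `python -m timeit`.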


(3)Tip 3: To find bottlenecks, use a profiler.

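Use the cProfile module to get more detail about how the program executes, which helps you locate the bottleneck. If cProfile is not available on your system, use the profile module instead. For more on both, see "The Python Profilers" in the Python standard library documentation.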
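A minimal sketch of profiling with cProfile follows. The functions `slow_part`, `fast_part`, and `main` are made up for the example; only `cProfile.Profile` and `pstats.Stats` come from the standard library.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Deliberately heavier work: build and sort many strings.
    return sorted(str(i) for i in range(n))


def fast_part(n):
    return n * 2


def main():
    slow_part(20000)
    fast_part(20000)


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Collect the report in a string, sorted by cumulative time per function.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

In the printed table, the function with the largest cumulative time (here presumably `slow_part`) is the first candidate for optimization.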

(4)Tip 4: Plot your results.


(5)Tip 5: Be careful when drawing conclusions based on timing comparisons.


First, any differences you observe may be because of random variations.


Second, there are issues when comparing averages.


At the very least, you should stick to comparing averages of actual timings. A common practice to get more meaningful numbers when performing timing experiments is to normalize the running time of each program, dividing it by the running time of some standard, simple algorithm. This can indeed be useful but can in some cases make your results less than meaningful. See the paper “How not to lie with statistics: The correct way to summarize benchmark results” by Fleming and Wallace for a few pointers. For some other perspectives, you could read Bast and Weber’s “Don’t compare averages,” or the more recent paper by Citron et al., “The harmonic or geometric mean: does it really matter?”
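To see how normalization can mislead, consider the sketch below. The timings are made up purely for illustration. Averaging normalized running times with the arithmetic mean makes each program look about five times slower than the other, depending on which one is used as the baseline; the geometric mean does not have this dependence.

```python
from statistics import geometric_mean, mean

# Made-up timings in seconds for two programs on two benchmarks.
times_a = [1.0, 100.0]
times_b = [10.0, 10.0]

# Normalize each program by the other one.
b_relative_to_a = [b / a for a, b in zip(times_a, times_b)]
a_relative_to_b = [a / b for a, b in zip(times_a, times_b)]

# Arithmetic mean of the ratios: both come out around 5.05,
# so each program "loses" when the other is the baseline.
print(mean(b_relative_to_a), mean(a_relative_to_b))

# Geometric mean of the ratios: both come out around 1.0,
# regardless of the choice of baseline.
print(geometric_mean(b_relative_to_a), geometric_mean(a_relative_to_b))
```

`statistics.geometric_mean` requires Python 3.8 or later.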

Third, your conclusions may not generalize.


(6)Tip 6: Be careful when drawing conclusions about asymptotics from experiments.

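Be careful when drawing conclusions about asymptotic running time from experiments. Experiments only support the theory: an experiment can refute a hypothesis about an asymptotic running time, but in general it cannot prove one. [The author's explanation follows.]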

If you want to say something conclusively about the asymptotic behavior of an algorithm, you need to analyze it, as described earlier in this chapter. Experiments can give you hints, but they are by their nature finite, and asymptotics deal with what happens for arbitrarily large data sizes. On the other hand, unless you’re working in theoretical computer science, the purpose of asymptotic analysis is to say something about the behavior of the algorithm when implemented and run on actual problem instances, meaning that experiments should be relevant.


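Python uses hashing in many places; the search section of the earlier post on Python data structures already covered hashing. Python provides a built-in hash function: for example, hash("Hello, world!") returns -943387004357456228 (your result may differ). Both dict and set in Python are built on hashing, so looking up an element in either takes constant time on average.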
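A short sketch of these points (the dictionary contents are made up). String hashes are randomized per process by default, which is why the value differs between runs, but within one run equal objects always hash equally.

```python
# Equal objects hash to the same value within a single run.
h1 = hash("Hello, world!")
h2 = hash("Hello, world!")
print(h1 == h2)  # True

# dict and set lookups go through the hash, so they do not scan all elements.
ages = {"alice": 30, "bob": 25}
names = set(ages)
print("alice" in ages, "carol" in names)  # True False

# Mutable built-ins such as list are not hashable, so they cannot be
# dict keys or set members.
try:
    hash([1, 2, 3])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)  # False
```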

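(1)Graph representation: the two most common representations are adjacency lists and adjacency matrices. [Suppose we want to represent the graph below.]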

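Adjacency Lists: for historical reasons, the term usually implies linked lists, but other containers work as well; in Python, for example, each neighbor collection can also be a set or a dict. Each variant has its own strengths and weaknesses: the way you test whether two nodes are connected and find a node's degree differs between them, and so does the performance of those two operations.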

① The adjacency lists representation
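The original listing is not available here, so the following is only a sketch on a small made-up directed graph with four nodes (not the graph from the figure). Nodes are numbered from 0, and `N[v]` holds the neighbors of `v` as a plain list.

```python
a, b, c, d = range(4)  # node names are just integer indices

# N[v] is the list of nodes that v has an edge to.
N = [
    [b, c],  # a
    [c, d],  # b
    [d],     # c
    [],      # d
]

print(b in N[a])  # True: is there an edge a -> b? (linear scan of N[a])
print(len(N[b]))  # 2: out-degree of b
```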

② The adjacency sets representation
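As a sketch, again on a small made-up directed graph rather than the one from the figure, the only change from the list-based form is that each neighbor collection is a set.

```python
a, b, c, d = range(4)  # node names are just integer indices

# N[v] is the set of nodes that v has an edge to.
N = [
    {b, c},  # a
    {c, d},  # b
    {d},     # c
    set(),   # d  (note: {} would create an empty dict, not a set)
]

print(b in N[a])  # True: membership test via hashing
print(len(N[b]))  # 2: out-degree of b
```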

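Basically the same as the adjacency lists representation, right? However, testing whether an element is in a list takes linear time, O(|N(v)|) for the neighbors N(v) of node v, whereas in a set it takes constant time, O(1), on average. For dense graphs, adjacency sets are therefore more efficient.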

③ The adjacency dicts representation
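A sketch of the dict-based form on a small made-up directed graph (not the one from the figure). The extra benefit is that each neighbor can carry a value; the edge weights below are invented for the example.

```python
a, b, c, d = range(4)  # node names are just integer indices

# N[v] maps each neighbor of v to the weight of the edge v -> neighbor.
N = [
    {b: 2, c: 1},  # a
    {c: 3, d: 4},  # b
    {d: 5},        # c
    {},            # d
]

print(b in N[a])  # True: membership test on the dict keys
print(len(N[b]))  # 2: out-degree of b
print(N[a][b])    # 2: weight of the edge a -> b
```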



Adjacency Matrix
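A sketch using nested lists on a small made-up directed graph (not the one from the figure): `N[u][v]` is 1 if there is an edge from `u` to `v` and 0 otherwise.

```python
a, b, c, d = range(4)  # node names are just integer indices

#     a  b  c  d
N = [
    [0, 1, 1, 0],  # a
    [0, 0, 1, 1],  # b
    [0, 0, 0, 1],  # c
    [0, 0, 0, 0],  # d
]

print(N[a][b])    # 1: there is an edge a -> b (constant-time lookup)
print(sum(N[b]))  # 2: out-degree of b (requires scanning the whole row)
```

The lookup is constant time, but the matrix needs space quadratic in the number of nodes, and finding a degree means scanning a full row.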




(2)Tree representation [Suppose we want to represent the tree below.]
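The original listings and figure are not available here. As a rough sketch on a made-up tree, two simple options are nested lists, where each inner list is a subtree, and a small node class; the `Tree` class below is a hypothetical minimal version for binary trees.

```python
# Option 1: nested lists. Each list is an internal node, each string a leaf.
T = [["a", "b"], ["c"], ["d", ["e", "f"]]]
print(T[0][1])     # b
print(T[2][1][0])  # e


# Option 2: a minimal binary tree class (hypothetical, for illustration).
class Tree:
    def __init__(self, left, right=None):
        self.left = left
        self.right = right


t = Tree(Tree("a", "b"), Tree("c", "d"))
print(t.right.left)  # c
```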




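[Bunch Pattern]: Interestingly, the implementation above uses a design pattern that is common in Python, called the Bunch pattern, which apparently comes from the classic Python Cookbook. The original book introduces it as follows: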


When prototyping (or even finalizing) data structures such as trees, it can be useful to have a flexible class that will allow you to specify arbitrary attributes in the constructor. In these cases, the “Bunch” pattern (named by Alex Martelli in the Python Cookbook) can come in handy. There are many ways of implementing it, but the gist of it is the following:
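The listing itself is missing here. A commonly seen way to write the pattern, offered as a sketch rather than as the exact code from the book, is a dict subclass whose attribute dictionary is the dict itself:

```python
class Bunch(dict):
    """A dict whose keys are also accessible as attributes."""

    def __init__(self, *args, **kwds):
        super().__init__(*args, **kwds)
        # Point the instance's attribute dictionary at the dict itself,
        # so x.key and x["key"] refer to the same entry.
        self.__dict__ = self


# Made-up attribute names and values, for illustration only.
x = Bunch(name="example", size=3)
print(x.name)     # example
print(x["size"])  # 3
```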

There are several useful aspects to this pattern. First, it lets you create and set arbitrary attributes by supplying them as keyword arguments:

Second, by subclassing dict, you get lots of functionality for free, such as iterating over the keys/attributes or easily checking whether an attribute is present. Here’s an example:
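The example did not survive here; the sketch below shows the kind of thing the text describes, using a made-up tree. The `Bunch` class is repeated (in a commonly seen form, not necessarily the book's exact code) so that the snippet runs on its own.

```python
class Bunch(dict):
    def __init__(self, *args, **kwds):
        super().__init__(*args, **kwds)
        self.__dict__ = self  # attributes and keys share one namespace


T = Bunch  # short alias for building tree nodes
t = T(left=T(left="a", right="b"), right=T(left="c"))

print(t.left.right)        # b: attribute access
print(t["left"]["right"])  # b: the same data through the dict interface
print("left" in t.right)   # True: t.right has a left child
print("right" in t.right)  # False: t.right has no right child
print(sorted(t))           # ['left', 'right']: iterating gives the attribute names
```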

This pattern isn’t useful only when building trees, of course. You could use it for any situation where you’d want a flexible object whose attributes you could set in the constructor.


• NetworkX:

• python-graph:

• Graphine:

• Pygr: a graph database

• Gato: a graph animation toolbox

• PADS: a collection of graph algorithms


In general, the more important your program, the more you should mistrust such black boxes and seek to find out what’s going on under the cover.


(1)Hidden Squares: hidden quadratic running time
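One way such a hidden square can arise, shown here as an illustrative sketch with made-up function names: a membership test on a list looks like a single cheap operation, but it is a linear scan, so putting it inside a loop gives quadratic time overall.

```python
def unique_quadratic(items):
    """Remove duplicates, keeping order. Quadratic in the worst case."""
    seen = []
    for x in items:
        if x not in seen:  # linear scan of a list, repeated for every item
            seen.append(x)
    return seen


def unique_linear(items):
    """Same result, but membership is tested against a set."""
    seen = set()
    result = []
    for x in items:
        if x not in seen:  # hash lookup, constant time on average
            seen.add(x)
            result.append(x)
    return result


data = [3, 1, 3, 2, 1, 5]
print(unique_quadratic(data))  # [3, 1, 2, 5]
print(unique_linear(data))     # [3, 1, 2, 5]
```

Both functions look almost identical, which is exactly why the quadratic one is easy to write by accident.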


(2)The Trouble with Floats: the headaches caused by limited precision
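A short sketch of the kind of surprise involved: 0.1 has no exact binary representation, so adding it ten times does not give exactly 1.0. Comparing with a tolerance, or using the decimal module when exact decimal arithmetic matters, are two standard-library ways around it.

```python
import math
from decimal import Decimal

total = sum(0.1 for _ in range(10))
print(total == 1.0)              # False: rounding errors accumulate
print(math.isclose(total, 1.0))  # True: compare with a tolerance instead

# Decimal represents 0.1 exactly when constructed from a string.
exact = sum(Decimal("0.1") for _ in range(10))
print(exact == Decimal("1.0"))   # True
```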





For more on floating-point numbers in Python, see "Floating Point Arithmetic: Issues and Limitations" in the Python tutorial.

Exercise 2-12. (Graph representation)

Consider the following graph representation: you use a dictionary and let each key be a pair (tuple) of two nodes, with the corresponding value set to the edge weight. For example W[u, v] = 42. What would be the advantages and disadvantages of this representation? Could you supplement it to mitigate the downsides?

The advantages and disadvantages depend on what you’re using it for. It works well for looking up edge weights efficiently but less well for iterating over the graph’s nodes or a node’s neighbors, for example. You could improve that part by using some extra structures (for example, a global list of nodes, if that’s what you need or a simple adjacency list structure, if that’s required).
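The answer can be sketched in code as follows. The node names and weights are made up, and `nodes` and `neighbors` are hypothetical names for the extra structures the answer mentions.

```python
from collections import defaultdict

# Edge weights keyed by (u, v) pairs, as in the exercise.
W = {("a", "b"): 42, ("a", "c"): 7, ("b", "c"): 3}

print(W["a", "b"])    # 42: looking up a weight is a single dict access
print(("c", "a") in W)  # False: the edge c -> a was never stored

# Supplementary structures, built once, to make iteration cheap.
nodes = set()
neighbors = defaultdict(set)
for u, v in W:
    nodes.update((u, v))
    neighbors[u].add(v)

print(sorted(nodes))           # ['a', 'b', 'c']
print(sorted(neighbors["a"]))  # ['b', 'c']
```

Without `neighbors`, finding the neighbors of one node would require scanning every key in `W`; the cost of the extra structures is that they must be kept in sync whenever an edge is added or removed.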
