AI Study Group · Beginners' Track | Session 9: Philosophical Reflections on Intelligence

Posted by user d8a171 on 2017-07-09

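In recent years, artificial intelligence has made dramatic breakthroughs, drawing strong interest from many students and practitioners in science and technology. For anyone getting started in AI, The Master Algorithm is arguably required reading. Its author, Pedro Domingos, appears to give only a broad-brush introduction to the mainstream ideas of machine learning, yet he touches on nearly every important application, both those that already exist and those yet to appear. The book lets beginners survey the field from a high level, and it plants many leads for readers who want to study a particular technical problem in depth, which makes it a rare guide for newcomers. Its witty style also makes it fun to read.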

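Built around this book, the 機器之心 (Synced) "AI Study Group · Beginners' Track" will officially start soon!

How to join

We invite all beginners interested in artificial intelligence and machine learning to join us. By reading and discussing The Master Algorithm, we will build a broad, high-level understanding of the history and technical principles of AI.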


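Chapter 9 Review

Chapter Summary


Chapter 9 builds a logical argument. It begins with the principle behind scientific discovery: many different phenomena point to the same underlying principle and can be explained by it, so a unifier that brings everything together matters. How can the different algorithms be combined? The author offers two approaches. The first is metalearning, such as stacking, bagging and boosting, but it is not deep enough and is computationally expensive. The second is the master algorithm, a single learner that unifies the different algorithms. The chapter summarizes the key components of each tribe's learner: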


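Tribe          | Representation   | Evaluation            | Optimization
---------------+------------------+-----------------------+-------------------------
Symbolists     | Logic            | Accuracy              | Inverse deduction
Connectionists | Neural networks  | Squared error         | Gradient descent
Evolutionaries | Genetic programs | Fitness               | Genetic search
Bayesians      | Graphical models | Posterior probability | Probabilistic inference
Analogizers    | Support vectors  | Margin                | Constrained optimization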

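Representation is the formal language in which the learner expresses its models. Evaluation is a scoring function that says how good a model is. Optimization is the algorithm that searches for the highest-scoring model and returns it. After this exploration, the unified learner we arrive at uses Markov logic networks (MLNs) as the representation, posterior probability as the evaluation function, and genetic search coupled with gradient descent as the optimizer. If we want, we can replace the posterior probability with some other accuracy measure, or genetic search with hill climbing.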
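To make the three components concrete, here is a minimal sketch in Python. It is not the book's unified learner: the one-threshold model, the toy data and all function names are assumptions chosen for illustration. It uses accuracy as the evaluation function and hill climbing as the optimizer, the two substitutions the summary mentions.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, int]  # (feature value, class label 0 or 1)

# Representation (assumed for this sketch): a model is one threshold t,
# and it predicts class 1 when x >= t.
def predict(threshold: float, x: float) -> int:
    return 1 if x >= threshold else 0

# Evaluation: a scoring function that says how good a model is (accuracy here).
def accuracy(threshold: float, data: List[Example]) -> float:
    return sum(predict(threshold, x) == y for x, y in data) / len(data)

# Optimization: hill climbing over the threshold; returns the best model found.
def hill_climb(score: Callable[[float, List[Example]], float],
               data: List[Example], start: float = 0.0,
               step: float = 0.5, max_iters: int = 100) -> float:
    current = start
    for _ in range(max_iters):
        best = max((current - step, current, current + step),
                   key=lambda t: score(t, data))
        if score(best, data) <= score(current, data):
            break  # no neighbour improves the score: a local optimum
        current = best
    return current

# Toy data (hypothetical): small values are class 0, large values are class 1.
data = [(0.2, 0), (0.9, 0), (1.4, 0), (2.1, 1), (2.8, 1), (3.5, 1)]
best_threshold = hill_climb(accuracy, data)
```

Because `hill_climb` takes the scoring function as a parameter, swapping the evaluation component means passing a different function, while the representation and the optimizer stay unchanged.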



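By this point we seem to have a first version of the master algorithm, which the chapter calls Alchemy. Converting between Alchemy and the other algorithms is easy. However, it is still immature and has many shortcomings, for example in large-scale applications. Finally, the author describes CanceRx, a program that takes a cancer's genome as input and proposes a drug to kill it. It is one of the representative applications with a promising future.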



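Week 8 Q&A Collection



What are the differences among stacking, bagging and boosting?

  • a. Stacking replaces the attributes of each original example with the learners' predictions, learns weights over them, and so favors the learners that often predict the correct class.
  • b. Bagging generates random variations of the training set by resampling, applies the same learner to each one, and combines the results by voting.
  • c. Boosting: instead of combining different learners, boosting repeatedly applies the same classifier to the data, using each new model to correct the previous ones' mistakes. It does this by assigning weights to the training examples; the weight of each misclassified example is increased after each round of learning, causing later rounds to focus more on it.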
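The bagging and boosting answers above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the threshold "stump" base learner, the toy data, and the fixed weight factor of 2.0 are all hypothetical (real boosting algorithms such as AdaBoost derive the factor from the error rate). Stacking is left out to keep the sketch short.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Example = Tuple[float, int]      # (feature value, label 0 or 1)
Model = Callable[[float], int]

def train_stump(data: Sequence[Example]) -> Model:
    """Hypothetical base learner: the single-threshold rule with the fewest errors."""
    def errors(t: float) -> int:
        return sum((1 if x >= t else 0) != y for x, y in data)
    best = min(sorted({x for x, _ in data}), key=errors)
    return lambda x: 1 if x >= best else 0

def bagging(data: Sequence[Example], n_models: int = 25, seed: int = 0) -> Model:
    """Train the same learner on bootstrap resamples and combine them by voting."""
    rng = random.Random(seed)
    models = [train_stump([rng.choice(data) for _ in data])
              for _ in range(n_models)]
    def vote(x: float) -> int:
        return Counter(m(x) for m in models).most_common(1)[0][0]
    return vote

def reweight(weights: List[float], misclassified: List[bool],
             factor: float = 2.0) -> List[float]:
    """Boosting-style step: raise misclassified examples' weights, then renormalise.

    The fixed factor is an assumption made for illustration only.
    """
    raised = [w * factor if wrong else w
              for w, wrong in zip(weights, misclassified)]
    total = sum(raised)
    return [w / total for w in raised]

# Toy data (hypothetical): small values are class 0, large values are class 1.
data = [(0.2, 0), (0.9, 0), (1.4, 0), (2.1, 1), (2.8, 1), (3.5, 1)]
bagged = bagging(data)
new_weights = reweight([0.25, 0.25, 0.25, 0.25], [False, False, False, True])
```

In `new_weights`, the one misclassified example ends up with a larger share of the total weight, so a learner trained on the reweighted data in the next round would pay more attention to it.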

How can logic and probability be combined?

  • Markov logic networks (MLNs)
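As a brief reminder of how an MLN combines the two, the standard definition attaches a real-valued weight to each first-order formula and turns the weighted formulas into a probability distribution over possible worlds. The symbols below follow the usual presentation of Markov logic and do not appear in the summary itself: w_i is the weight of formula F_i, n_i(x) is the number of true groundings of F_i in world x, and Z is the normalizing constant.

```latex
P(X = x) = \frac{1}{Z} \exp\!\Big( \sum_{i} w_i \, n_i(x) \Big),
\qquad
Z = \sum_{x'} \exp\!\Big( \sum_{i} w_i \, n_i(x') \Big)
```

A world that violates a formula becomes less probable rather than impossible, and as a formula's weight grows it behaves more and more like a hard logical constraint.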

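What are the challenges facing Alchemy?

  • It does not yet scale to truly big data, and anyone without a PhD in machine learning will find it hard to use.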



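Chapter 10 Preview


Chapter Summary



As the author says at the beginning of the chapter, “this chapter will help you make the most of it in your life and be ready for what comes next.” If the master algorithm lets users plug in a score function (what you think the learner's goals are or, more precisely, its owner's) and data (what you think it knows) to build their own model, what model would you want? What data would you feed it? The author goes on to discuss how intelligent models work, how people use them, and how society will be affected. He then turns to data and data privacy, explaining what kinds of data users have and how and where they can share it. After that, he explores several controversial topics related to machine learning: How might intelligent machines steal my job? What should we do about robot warfare? Does the threat of AI still exist? How might intelligent machines influence human evolution?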



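Important Sections



Sex, lies, and machine learning

  • The author uses online dating as an example to show how narrow the communication channel between users and learners is in today's machine learning applications. It comes down to “the question of how good a model of you a learner can have and what you'd want to do with the model.”

The digital mirror

  • This short section discusses the self-improvement that a digital model of “you” can bring. For example, it can read and reply to your emails, filter what you want to read, find you a date, and manage your personal life.

A society of models

  • The author points out that if a model can do so much for its user, such as job hunting and company interviews, then it will become a model of the world in which the user lives.

To share or not to share, and how and where

  • The main questions the author explores in this section are: What data should we share? How should we share it? Where can we share it?

A neural network stole my job

  • The author first points out that “narrowly defined tasks are easily learned from data [and so are taken over by robots], but tasks that require a broad combination of skills and knowledge aren't.” The author also argues that, since automation takes away some jobs while creating others, humans should make the most of machines rather than compete with them.

War is not for humans

  • This section gives two reasons why humans may not yet be ready for robot warfare. First, teaching robots to recognize the relevant concepts and to learn ethics by observing humans may not be the best option, since humans sometimes violate their own ethical principles, which could confuse the robots. Second, robots change the ethics of war, because people no longer need to go to the battlefield or face life-and-death situations.

Google + Master Algorithm = Skynet?

  • On the threat that AI poses to humans, the author says that as long as humans stay in charge of the evaluation component, AI will remain our best working partner rather than a threat.

Evolution, part 2

  • The author holds that nature has already gone through three stages: evolution, the brain, and culture. Machine learning is part of the next stage, which suggests that humans may be able to direct evolution in the future.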

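Key Concepts

  • Theory of mind: the computer's theory of your mind
  • Turning point: the point at which machine intelligence exceeds human intelligence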

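Quiz

  1. What is the author's view on which jobs will be replaced by automation?
  2. What are the three major worries people have about intelligent machines?
  3. What is theory of mind?
  4. What is the turning point?
  5. What are the four kinds of data a user will have?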

Chapter #9 Review

【Chapter Summary】

Chapter 9 builds a logical argument. It begins with the principle behind scientific discovery: many different phenomena point to the same underlying principle and can be explained by it, so a unifier that brings everything together matters. How can the different algorithms be combined? The author offers two approaches. The first is metalearning, such as stacking, bagging and boosting, but it is not deep enough and is computationally expensive. The second is the master algorithm, a single learner that unifies the different algorithms. The chapter also summarizes each tribe's learner in terms of three components: representation, evaluation and optimization.

Representation here is the formal language in which the learner expresses its models. The evaluation component is a scoring function that says how good a model is. Optimization is the algorithm that searches for the highest-scoring model and returns it. After the exploration of the virtual city, the unified learner we’ve arrived at uses MLNs as the representation, posterior probability as the evaluation function, and genetic search coupled with gradient descent as the optimizer. If we want, we can easily replace the posterior by some other accuracy measure, or genetic search by hill climbing.

So far it seems we already have a first version of the Master Algorithm, which this chapter calls Alchemy. Converting between Alchemy and the other algorithms is easy. However, it is still underdeveloped and has many shortcomings, for example in large-scale applications. Finally, the author describes CanceRx, a program that is fed a cancer's genome and proposes a drug to kill it. It is one of the representative applications with a bright future.

Week 8 Q & A Collection

What are the differences among stacking, bagging and boosting?

  • a. Stacking is to “replace the attributes of each original example by the learners' predictions to learn the weights and then choose the learners that often predict the correct class”.
  • b. Bagging “generates random variations of the training set by resampling, applies the same learner to each one, and combines the results by voting”.
  • c. Boosting: Instead of combining different learners, boosting repeatedly applies the same classifier to the data, using each new model to correct the previous ones’ mistakes. It does this by assigning weights to the training examples; the weight of each misclassified example is increased after each round of learning, causing later rounds to focus more on it.

How can logic and probability be combined?

  • Markov logic network (MLN)

What are the challenges of Alchemy?

  • “It does not yet scale to truly big data, and someone without a PhD in machine learning will find it hard to use.”


Chapter #10 Preview

【Chapter Summary】

As the author mentions at the beginning of the chapter, “this chapter will help you make the most of it in your life and be ready for what comes next.” If the Master Algorithm allows users to plug in the score function (what you think the learner’s goals are, or, more precisely, its owner’s) and the data (what you think it knows) to build their own model, what model would you want to have? What data would you feed it? The author opens a discussion about how intelligent models work, how people use them and how society will be affected. Then the author dives into data and data privacy, explaining what kinds of data users have and how and where they can share it. After that, the author explores several controversial topics related to machine learning: How might intelligent machines steal my job? What if we have a robot war? Does the threat of AI still exist? How might intelligent machines influence human evolution?

【Important Sections】

Sex, lies, and machine learning

  • The author uses online dating as an example to show how narrow the communication channel between users and learners is in today's machine learning applications. It comes down to “the question of how good a model of you a learner can have and what you'd want to do with the model.”

The digital mirror

  • This short section talks about the self-improvement that a digital model of “you” can bring. For example, it can read and reply to your emails, filter what you want to read next, find you a date, and manage your personal life.

A society of models

  • The author points out that if a model can do so much for its user, such as job hunting and company interviews, then it will become a model of the world in which the user lives.

To share or not to share, and how and where

  • The main questions the author explores in this section are: What data should we share? How should we share it? Where can we share it?

A neural network stole my job

  • The author first points out that “narrowly defined tasks are easily learned from data [are replaced by robots], but tasks that require a broad combination of skills and knowledge aren't.” The author also thinks that, since automation takes away some jobs while creating others, humans should make the best of machines rather than compete with them.

War is not for humans

  • This section gives two main reasons why humans might not be ready for robot war yet. First, teaching robots to recognize relevant concepts and learn ethics by observing humans might not be the best option, since humans sometimes violate their own ethical principles, and this might confuse the robots. Second, robots change the ethics of war, since people no longer need to go to the battlefield or face life-and-death situations.

Google + Master Algorithm = Skynet?

  • On the threat that AI poses to humans, the author says that as long as humans take care of the evaluation part, AI will always be the best working partner we have rather than a threat.

Evolution, part 2

  • The author thinks that nature has already been through three phases (evolution, the brain, and culture) and that machine learning is the next stage of this progression, which suggests that human-directed evolution might be possible in the future.

【Key Concepts】

  • Theory of mind: the computer's theory of your mind.
  • Turning point: the point at which machine intelligence exceeds human intelligence.

【Quiz】

  1. What is the author's view on which jobs will be replaced by automation?
  2. What are the three worries people have about intelligent machines?
  3. What is theory of mind?
  4. What is the turning point?
  5. What are the four kinds of data a user will have?
