128 Papers, 21 Major Areas: The Deep Learning Resources Most Worth Reading (with One-Click Download)

Published by AI科技大本營 on 2017-07-04

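From the big picture to the branches, from the classics to the frontier, from theory to applications, plus the latest research: whatever you need, whatever you don't need now but will need later, whatever you don't need but your friends do, it is all here. Help yourself, no thanks needed, only at AI科技大本營.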


Compiled by | AI科技大本營 (rgznai100)

Reference - https://zhuanlan.zhihu.com/p/23080129

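For most people who want to get started with deep learning, the question is: "Which paper should I read first?"

It is a perennial question.

For those who have already gotten started, a look at papers from different directions also comes in handy from time to time.

Is there a complete guide to deep learning papers in which everyone can find what they are looking for?

Yes!

Today we share the most impressive curated collection of deep learning papers to date. It gives readers a fairly complete picture of the whole deep learning field and each of its branches.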

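The reading list is organized on the following principles:

  • From the big picture to the branches: from survey-style, field-wide articles to specific papers in individual subfields.

  • From the classics to the cutting edge: the papers under each topic are arranged in chronological order, which makes the development of each direction clear.

  • From general theory to specific applications: some papers address the general theory of deep learning, while others target specific application areas.

  • A focus on state-of-the-art research: many recent papers are included to keep the reading list current.

Of course, each topic here includes only a few of the most representative papers; in-depth study calls for further reading.

Given the influence of these papers, you will find that many recently published papers are also worth reading. The reading list is also continuously updated on its original page and is worth checking back on regularly.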

github.com/songrotek/D…

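Want to download all the papers in one go? No problem: AI科技大本營 has prepared a shortcut for you. Reply "路徑" in the official account chat to get the PDFs of all the papers in this article.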

1. Deep learning basics and history

1.0 Books

[0] The deep learning bible ★★★★★

Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. "Deep learning." An MIT Press book. (2015).

https://github.com/HFTrader/DeepLearningBook/raw/master/DeepLearningBook.pdf

1.1 Survey

[1] Survey by the three giants ★★★★★

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444.

http://www.cs.toronto.edu/%7Ehinton/absps/NatureDeepReview.pdf

1.2 Deep belief networks (DBN)

[2] Milestone on the eve of deep learning

★★★

Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18.7 (2006): 1527-1554.

http://www.cs.toronto.edu/%7Ehinton/absps/ncfast.pdf

[3] Milestone showing the promise of deep learning

★★★

Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.

http://www.cs.toronto.edu/%7Ehinton/science.pdf

1.3 The ImageNet revolution (the deep learning big bang)

[4] AlexNet, the deep learning breakthrough

★★★

Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.

http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

[5] VGGNet, the arrival of very deep neural networks

★★★

Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).

https://arxiv.org/pdf/1409.1556.pdf

[6] GoogLeNet

★★★

Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.

http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

[7] ResNet, extremely deep neural networks, CVPR best paper

★★★★★

He, Kaiming, et al. "Deep residual learning for image recognition." arXiv preprint arXiv:1512.03385 (2015).

https://arxiv.org/pdf/1512.03385.pdf

1.4 The speech recognition revolution

[8] Speech recognition breakthrough

★★★★

Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." IEEE Signal Processing Magazine 29.6 (2012): 82-97.

http://cs224d.stanford.edu/papers/maas_paper.pdf

[9] RNN paper

★★★

Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. "Speech recognition with deep recurrent neural networks." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013.

http://arxiv.org/pdf/1303.5778.pdf

[10] End-to-end RNN speech recognition

★★★

Graves, Alex, and Navdeep Jaitly. "Towards End-To-End Speech Recognition with Recurrent Neural Networks." ICML. Vol. 14. 2014.

http://www.jmlr.org/proceedings/papers/v32/graves14.pdf

[11] Google's speech recognition system paper

★★★

Sak, Haşim, et al. "Fast and accurate recurrent neural network acoustic models for speech recognition." arXiv preprint arXiv:1507.06947 (2015).

http://arxiv.org/pdf/1507.06947

[12] Baidu's speech recognition system paper

★★★★

Amodei, Dario, et al. "Deep speech 2: End-to-end speech recognition in english and mandarin." arXiv preprint arXiv:1512.02595 (2015).

https://arxiv.org/pdf/1512.02595.pdf

[13] State-of-the-art speech recognition paper from Microsoft

★★★★

W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig "Achieving Human Parity in Conversational Speech Recognition." arXiv preprint arXiv:1610.05256 (2016).

https://arxiv.org/pdf/1610.05256v1

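After reading the papers above, you will have a basic understanding of the history of deep learning and of the basic architectures of deep learning models (CNN, RNN, LSTM, etc.), and you will understand how deep learning tackles image and speech recognition problems. The papers that follow take you deeper into deep learning methods and into applications of deep learning in different frontier areas. Pick what to read according to your own interests and research direction:

2. Deep learning methods

2.1 Models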

[14] Dropout

★★★

Hinton, Geoffrey E., et al. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012).

https://arxiv.org/pdf/1207.0580.pdf

[15] Overfitting

★★★

Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." Journal of Machine Learning Research 15.1 (2014): 1929-1958.

http://www.jmlr.org/papers/volume15/srivastava14a.old/source/srivastava14a.pdf

[16] Batch normalization, an outstanding result of 2015

★★★★

Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015).

http://arxiv.org/pdf/1502.03167

[17] An update to batch normalization

★★★★

Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016).

https://arxiv.org/pdf/1607.06450.pdf

[18] A new model that trains quickly

★★★

Courbariaux, Matthieu, et al. "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1."

https://pdfs.semanticscholar.org/f832/b16cb367802609d91d400085eb87d630212a.pdf

[19] Innovation in training methods

★★★★★

Jaderberg, Max, et al. "Decoupled neural interfaces using synthetic gradients." arXiv preprint arXiv:1608.05343 (2016).

https://arxiv.org/pdf/1608.05343

[20] Modifying a previously trained network to reduce training time

★★★

Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. "Net2net: Accelerating learning via knowledge transfer." arXiv preprint arXiv:1511.05641 (2015).

https://arxiv.org/abs/1511.05641

[21] Modifying a previously trained network to reduce training time

★★★

Wei, Tao, et al. "Network Morphism." arXiv preprint arXiv:1603.01670 (2016).

https://arxiv.org/abs/1603.01670

2.2 Optimization

[22] Momentum optimizer

★★

Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." ICML (3) 28 (2013): 1139-1147.

http://www.jmlr.org/proceedings/papers/v28/sutskever13.pdf

[23] Probably the most widely used stochastic optimizer today

★★★

Kingma, Diederik, and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).

http://arxiv.org/pdf/1412.6980

[24] Neural optimizer

★★★★★

Andrychowicz, Marcin, et al. "Learning to learn by gradient descent by gradient descent." arXiv preprint arXiv:1606.04474 (2016).

https://arxiv.org/pdf/1606.04474

[25] ICLR best paper, a new direction for making neural networks run faster ★★★★★

Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding." CoRR, abs/1510.00149 (2015).

https://pdfs.semanticscholar.org/5b6c/9dda1d88095fa4aac1507348e498a1f2e863.pdf

[26] Another new direction for optimizing neural networks

★★★★

Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size." arXiv preprint arXiv:1602.07360 (2016).

http://arxiv.org/pdf/1602.07360

2.3 Unsupervised learning / deep generative models

[27] Milestone Google Brain "cat-finding" paper, Andrew Ng

★★★★

Le, Quoc V. "Building high-level features using large scale unsupervised learning." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013.

http://arxiv.org/pdf/1112.6209.pdf

[28] Variational autoencoder (VAE)

★★★★

Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013).

http://arxiv.org/pdf/1312.6114

[29] Generative adversarial network (GAN)

★★★★★

Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014.

http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf

[30] Deep convolutional generative adversarial network (DCGAN)

★★★★

Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).

http://arxiv.org/pdf/1511.06434

[31] Variational autoencoder with an attention mechanism

★★★★★

Gregor, Karol, et al. "DRAW: A recurrent neural network for image generation." arXiv preprint arXiv:1502.04623 (2015).

http://jmlr.org/proceedings/papers/v37/gregor15.pdf

[32] PixelRNN

★★★★

Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).

http://arxiv.org/pdf/1601.06759

[33] PixelCNN

★★★★

Oord, Aaron van den, et al. "Conditional image generation with PixelCNN decoders." arXiv preprint arXiv:1606.05328 (2016).

https://arxiv.org/pdf/1606.05328

2.4 RNN / sequence-to-sequence models

[34] Sequence generation with RNNs, LSTM

★★★★

Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013).

http://arxiv.org/pdf/1308.0850

[35] The first sequence-to-sequence paper

★★★★

Cho, Kyunghyun, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).

http://arxiv.org/pdf/1406.1078

[36] Sequence-to-sequence learning with neural networks

★★★★★

Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014.

http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

[37] Neural machine translation

★★★★

Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate." arXiv preprint arXiv:1409.0473 (2014).

https://arxiv.org/pdf/1409.0473v7.pdf

[38] Sequence-to-sequence chatbot

★★★

Vinyals, Oriol, and Quoc Le. "A neural conversational model." arXiv preprint arXiv:1506.05869 (2015).

http://arxiv.org/pdf/1506.05869.pdf

2.5 Neural Turing machines

[39] A basic prototype of the computer of the future

★★★★★

Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014).

http://arxiv.org/pdf/1410.5401.pdf

[40] Reinforcement learning neural Turing machines

★★★

Zaremba, Wojciech, and Ilya Sutskever. "Reinforcement learning neural Turing machines." arXiv preprint arXiv:1505.00521 (2015).

https://pdfs.semanticscholar.org/f10e/071292d593fef939e6ef4a59baf0bb3a6c2b.pdf

[41] Memory networks

★★★

Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014).

http://arxiv.org/pdf/1410.3916

[42] End-to-end memory networks

★★★★

Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015.

http://papers.nips.cc/paper/5846-end-to-end-memory-networks.pdf

[43] Pointer networks

★★★★

Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. "Pointer networks." Advances in Neural Information Processing Systems. 2015.

http://papers.nips.cc/paper/5866-pointer-networks.pdf

[44] Milestone paper that brings together the ideas behind neural Turing machines

★★★★★

Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature (2016).

https://www.dropbox.com/s/0a40xi702grx3dq/2016-graves.pdf

2.6 Deep reinforcement learning

[45] The first paper to carry the name deep reinforcement learning

★★★★

Mnih, Volodymyr, et al. "Playing atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602 (2013).

http://arxiv.org/pdf/1312.5602.pdf

[46] Milestone

★★★★★

Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.

https://storage.googleapis.com/deepmind-data/assets/papers/DeepMindNature14236Paper.pdf


[47] ICLR best paper

★★★★

Wang, Ziyu, Nando de Freitas, and Marc Lanctot. "Dueling network architectures for deep reinforcement learning." arXiv preprint arXiv:1511.06581 (2015).

http://arxiv.org/pdf/1511.06581

[48] The current state-of-the-art deep reinforcement learning method

★★★★★

Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." arXiv preprint arXiv:1602.01783 (2016).

http://arxiv.org/pdf/1602.01783

[49] DDPG

★★★★

Lillicrap, Timothy P., et al. "Continuous control with deep reinforcement learning." arXiv preprint arXiv:1509.02971 (2015).

http://arxiv.org/pdf/1509.02971

[50] NAF

★★★★

Gu, Shixiang, et al. "Continuous Deep Q-Learning with Model-based Acceleration." arXiv preprint arXiv:1603.00748 (2016).

http://arxiv.org/pdf/1603.00748

[51] TRPO

★★★★

Schulman, John, et al. "Trust region policy optimization." CoRR, abs/1502.05477 (2015).

http://www.jmlr.org/proceedings/papers/v37/schulman15.pdf

[52] AlphaGo

★★★★★

Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.

http://willamette.edu/%7Elevenick/cs448/goNature.pdf

2.7 Deep transfer learning / lifelong learning / reinforcement learning

[53] Bengio tutorial

★★★

Bengio, Yoshua. "Deep Learning of Representations for Unsupervised and Transfer Learning." ICML Unsupervised and Transfer Learning 27 (2012): 17-36.

http://www.jmlr.org/proceedings/papers/v27/bengio12a/bengio12a.pdf

[54] A brief discussion of lifelong learning

★★★

Silver, Daniel L., Qiang Yang, and Lianghao Li. "Lifelong Machine Learning Systems: Beyond Learning Algorithms." AAAI Spring Symposium: Lifelong Machine Learning. 2013.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.696.7800&rep=rep1&type=pdf

[55] Work by the luminaries Hinton and Jeff Dean

★★★★

Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015).

http://arxiv.org/pdf/1503.02531

[56] Reinforcement learning policies

★★★

Rusu, Andrei A., et al. "Policy distillation." arXiv preprint arXiv:1511.06295 (2015).

http://arxiv.org/pdf/1511.06295


[57] Multitask deep transfer reinforcement learning

★★★

Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. "Actor-mimic: Deep multitask and transfer reinforcement learning." arXiv preprint arXiv:1511.06342 (2015).

http://arxiv.org/pdf/1511.06342

[58] Progressive neural networks

★★★★★

Rusu, Andrei A., et al. "Progressive neural networks." arXiv preprint arXiv:1606.04671 (2016).

https://arxiv.org/pdf/1606.04671

2.8 One-shot deep learning

[59] Does not involve deep learning, but worth reading

★★★★★

Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. "Human-level concept learning through probabilistic program induction." Science 350.6266 (2015): 1332-1338.

http://clm.utexas.edu/compjclub/wp-content/uploads/2016/02/lake2015.pdf


[60] One-shot image recognition

★★★

Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. "Siamese Neural Networks for One-shot Image Recognition." (2015)

http://www.cs.utoronto.ca/%7Egkoch/files/msc-thesis.pdf

[61] Foundations of one-shot learning

★★★★

Santoro, Adam, et al. "One-shot Learning with Memory-Augmented Neural Networks." arXiv preprint arXiv:1605.06065 (2016).

http://arxiv.org/pdf/1605.06065

[62] Networks for one-shot learning

★★★

Vinyals, Oriol, et al. "Matching Networks for One Shot Learning." arXiv preprint arXiv:1606.04080 (2016).

https://arxiv.org/pdf/1606.04080

[63] Large-scale data

★★★★

Hariharan, Bharath, and Ross Girshick. "Low-shot visual object recognition." arXiv preprint arXiv:1606.02819 (2016).

http://arxiv.org/pdf/1606.02819

3. Applications

3.1 Natural language processing (NLP)

[1]

★★★★

Antoine Bordes, et al. "Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing." AISTATS(2012)

https://www.hds.utc.fr/%7Ebordesan/dokuwiki/lib/exe/fetch.php?id=en%3Apubli&cache=cache&media=en:bordes12aistats.pdf

[2] word2vec

★★★

Mikolov, et al. "Distributed representations of words and phrases and their compositionality." ANIPS(2013): 3111-3119

http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf

[3]

★★★

Sutskever, et al. "Sequence to sequence learning with neural networks." ANIPS(2014)

http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf

[4]

★★★★

Ankit Kumar, et al. "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing." arXiv preprint arXiv:1506.07285(2015)

https://arxiv.org/abs/1506.07285

[5]

★★★★

Yoon Kim, et al. "Character-Aware Neural Language Models." NIPS(2015) arXiv preprint arXiv:1508.06615(2015)

https://arxiv.org/abs/1508.06615

[6] bAbI tasks

★★★

Jason Weston, et al. "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks." arXiv preprint arXiv:1502.05698(2015)

https://arxiv.org/abs/1502.05698

[7] CNN / DailyMail cloze-style questions

★★

Karl Moritz Hermann, et al. "Teaching Machines to Read and Comprehend." arXiv preprint arXiv:1506.03340(2015)

https://arxiv.org/abs/1506.03340


[8] The current state of the art in text classification

★★★

Alexis Conneau, et al. "Very Deep Convolutional Networks for Natural Language Processing." arXiv preprint arXiv:1606.01781(2016)

https://arxiv.org/abs/1606.01781


[9] Slightly behind the state of the art, but much faster

★★★

Armand Joulin, et al. "Bag of Tricks for Efficient Text Classification." arXiv preprint arXiv:1607.01759(2016)

https://arxiv.org/abs/1607.01759

3.2 Object detection

[1]

★★★

Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. "Deep neural networks for object detection." Advances in Neural Information Processing Systems. 2013.

http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf

[2] RCNN

★★★★★

Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.

http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

[3] SPPNet

★★★

He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." European Conference on Computer Vision. Springer International Publishing, 2014.

http://arxiv.org/pdf/1406.4729

[4]

★★★

Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015.

https://pdfs.semanticscholar.org/8f67/64a59f0d17081f2a2a9d06f4ed1cdea1a0ad.pdf

[5]

★★★

Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.

http://papers.nips.cc/paper/5638-analysis-of-variational-bayesian-latent-dirichlet-allocation-weaker-sparsity-than-map.pdf

[6] YOLO, a highly practical project

★★★★★

Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015).

http://homes.cs.washington.edu/%7Eali/papers/YOLO.pdf

[7]

★★★

Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint arXiv:1512.02325 (2015).

http://arxiv.org/pdf/1512.02325

[8]

★★★

Dai, Jifeng, et al. "R-FCN: Object Detection via Region-based Fully Convolutional Networks." arXiv preprint arXiv:1605.06409 (2016).

https://arxiv.org/abs/1605.06409

[9]

★★★

He, Gkioxari, et al. "Mask R-CNN" arXiv preprint arXiv:1703.06870 (2017).

https://arxiv.org/abs/1703.06870

3.3 Visual tracking

[1] The first visual tracking paper to use deep learning, the DLT tracker

★★★

Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." Advances in neural information processing systems. 2013.

http://papers.nips.cc/paper/5192-learning-a-deep-compact-image-representation-for-visual-tracking.pdf


[2] SO-DLT

★★★

Wang, Naiyan, et al. "Transferring rich feature hierarchies for robust visual tracking." arXiv preprint arXiv:1501.04587 (2015).

http://arxiv.org/pdf/1501.04587

[3] FCNT

★★★

Wang, Lijun, et al. "Visual tracking with fully convolutional networks." Proceedings of the IEEE International Conference on Computer Vision. 2015.

http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Wang_Visual_Tracking_With_ICCV_2015_paper.pdf

[4] GOTURN, a very fast deep learning method

★★★

Held, David, Sebastian Thrun, and Silvio Savarese. "Learning to Track at 100 FPS with Deep Regression Networks." arXiv preprint arXiv:1604.01802 (2016).

http://arxiv.org/pdf/1604.01802

[5] SiameseFC, a new state-of-the-art real-time object tracker

★★★

Bertinetto, Luca, et al. "Fully-Convolutional Siamese Networks for Object Tracking." arXiv preprint arXiv:1606.09549 (2016).

https://arxiv.org/pdf/1606.09549

[6] C-COT

★★★

Martin Danelljan, Andreas Robinson, Fahad Khan, Michael Felsberg. "Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking." ECCV (2016)

http://www.cvl.isy.liu.se/research/objrec/visualtracking/conttrack/C-COT_ECCV16.pdf

[7] TCNN, winner of the VOT2016 challenge

★★★

Nam, Hyeonseob, Mooyeol Baek, and Bohyung Han. "Modeling and Propagating CNNs in a Tree Structure for Visual Tracking." arXiv preprint arXiv:1608.07242 (2016).

https://arxiv.org/pdf/1608.07242

3.4 Image captioning

[1]

★★★

Farhadi, Ali, et al. "Every picture tells a story: Generating sentences from images". In Computer Vision - ECCV 2010. Springer Berlin Heidelberg: 15-29, 2010.

https://www.cs.cmu.edu/%7Eafarhadi/papers/sentence.pdf

[2]

★★★

Kulkarni, Girish, et al. "Baby talk: Understanding and generating image descriptions". In Proceedings of the 24th CVPR, 2011.

http://tamaraberg.com/papers/generation_cvpr11.pdf

[3]

★★★

Vinyals, Oriol, et al. "Show and tell: A neural image caption generator". In arXiv preprint arXiv:1411.4555, 2014.

https://arxiv.org/pdf/1411.4555.pdf

[4] RNNs for visual recognition and captioning

Donahue, Jeff, et al. "Long-term recurrent convolutional networks for visual recognition and description". In arXiv preprint arXiv:1411.4389 ,2014.

https://arxiv.org/pdf/1411.4389.pdf

[5] Fei-Fei Li and her student Andrej Karpathy

★★★★★

Karpathy, Andrej, and Li Fei-Fei. "Deep visual-semantic alignments for generating image descriptions". In arXiv preprint arXiv:1412.2306, 2014.

https://cs.stanford.edu/people/karpathy/cvpr2015.pdf

[6] Fei-Fei Li and her student Andrej Karpathy

★★★

Karpathy, Andrej, Armand Joulin, and Fei Fei F. Li. "Deep fragment embeddings for bidirectional image sentence mapping". In Advances in neural information processing systems, 2014.

https://arxiv.org/pdf/1406.5679v1.pdf

[7]

★★★

Fang, Hao, et al. "From captions to visual concepts and back". In arXiv preprint arXiv:1411.4952, 2014.

https://arxiv.org/pdf/1411.4952v3.pdf

[8]

★★★

Chen, Xinlei, and C. Lawrence Zitnick. "Learning a recurrent visual representation for image caption generation". In arXiv preprint arXiv:1411.5654, 2014.

https://arxiv.org/pdf/1411.5654v1.pdf

[9]

★★★

Mao, Junhua, et al. "Deep captioning with multimodal recurrent neural networks (m-rnn)". In arXiv preprint arXiv:1412.6632, 2014.

https://arxiv.org/pdf/1412.6632v5.pdf

[10]

★★★★★

Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention". In arXiv preprint arXiv:1502.03044, 2015.

https://arxiv.org/pdf/1502.03044v3.pdf

3.5 Machine translation

Some of the milestone papers on this topic are listed under topic 2.4, "RNN / sequence-to-sequence models".

[1]

★★★

Luong, Minh-Thang, et al. "Addressing the rare word problem in neural machine translation." arXiv preprint arXiv:1410.8206 (2014).

http://arxiv.org/pdf/1410.8206

[2]

★★★

Sennrich, et al. "Neural Machine Translation of Rare Words with Subword Units". In arXiv preprint arXiv:1508.07909, 2015.

https://arxiv.org/pdf/1508.07909.pdf

[3]

★★★

Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).

http://arxiv.org/pdf/1508.04025

[4]

★★

Chung, et al. "A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation". In arXiv preprint arXiv:1603.06147, 2016.

https://arxiv.org/pdf/1603.06147.pdf

[5]

★★★★★

Lee, et al. "Fully Character-Level Neural Machine Translation without Explicit Segmentation". In arXiv preprint arXiv:1610.03017, 2016.

https://arxiv.org/pdf/1610.03017.pdf


[6] Milestone

★★★

Wu, Schuster, Chen, Le, et al. "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". In arXiv preprint arXiv:1609.08144v2, 2016.

https://arxiv.org/pdf/1609.08144v2.pdf

3.6 Robotics

[1]

★★★

Koutník, Jan, et al. "Evolving large-scale neural networks for vision-based reinforcement learning." Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 2013.

http://repository.supsi.ch/4550/1/koutnik2013gecco.pdf

[2]

★★★★★

Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." Journal of Machine Learning Research 17.39 (2016): 1-40.

http://www.jmlr.org/papers/volume17/15-522/15-522.pdf

[3]

★★★

Pinto, Lerrel, and Abhinav Gupta. "Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours." arXiv preprint arXiv:1509.06825 (2015).

http://arxiv.org/pdf/1509.06825

[4]

★★★

Levine, Sergey, et al. "Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection." arXiv preprint arXiv:1603.02199 (2016).

http://arxiv.org/pdf/1603.02199

[5]

★★★

Zhu, Yuke, et al. "Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning." arXiv preprint arXiv:1609.05143 (2016).

https://arxiv.org/pdf/1609.05143


[6]

★★★

Yahya, Ali, et al. "Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search." arXiv preprint arXiv:1610.00673 (2016).

https://arxiv.org/pdf/1610.00673

[7]

★★★

Gu, Shixiang, et al. "Deep Reinforcement Learning for Robotic Manipulation." arXiv preprint arXiv:1610.00633 (2016).

https://arxiv.org/pdf/1610.00633

[8]

★★★

A Rusu, M Vecerik, Thomas Rothörl, N Heess, R Pascanu, R Hadsell. "Sim-to-Real Robot Learning from Pixels with Progressive Nets." arXiv preprint arXiv:1610.04286 (2016).

https://arxiv.org/pdf/1610.04286.pdf

[9]

★★★

Mirowski, Piotr, et al. "Learning to navigate in complex environments." arXiv preprint arXiv:1611.03673 (2016).

https://arxiv.org/pdf/1611.03673

3.7 Art

[1] Google Deep Dream

★★★

Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "Inceptionism: Going Deeper into Neural Networks". Google Research.

https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html

[2] The most successful artistic style transfer method to date, Prisma

★★★★★

Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "A neural algorithm of artistic style." arXiv preprint arXiv:1508.06576 (2015).

http://arxiv.org/pdf/1508.06576

[3] iGAN ★★★

Zhu, Jun-Yan, et al. "Generative Visual Manipulation on the Natural Image Manifold." European Conference on Computer Vision. Springer International Publishing, 2016.

https://arxiv.org/pdf/1609.03552

[4] Neural Doodle

★★★

Champandard, Alex J. "Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks." arXiv preprint arXiv:1603.01768 (2016).

http://arxiv.org/pdf/1603.01768

[5]

★★★

Zhang, Richard, Phillip Isola, and Alexei A. Efros. "Colorful Image Colorization." arXiv preprint arXiv:1603.08511 (2016).

http://arxiv.org/pdf/1603.08511

[6] Super-resolution, Fei-Fei Li

★★★

Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual losses for real-time style transfer and super-resolution." arXiv preprint arXiv:1603.08155 (2016).

https://arxiv.org/pdf/1603.08155.pdf

[7]

★★★

Vincent Dumoulin, Jonathon Shlens and Manjunath Kudlur. "A learned representation for artistic style." arXiv preprint arXiv:1610.07629 (2016).

https://arxiv.org/pdf/1610.07629v1.pdf

[8] Style transfer based on spatial location, color information and spatial scale

★★★

Gatys, Leon and Ecker, et al. "Controlling Perceptual Factors in Neural Style Transfer." arXiv preprint arXiv:1611.07865 (2016).

https://arxiv.org/pdf/1611.07865.pdf

[9] Texture generation and style transfer

★★★

Ulyanov, Dmitry and Lebedev, Vadim, et al. "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images." arXiv preprint arXiv:1603.03417 (2016).

http://arxiv.org/abs/1603.03417

3.8 Object segmentation

[1]

★★★★★

J. Long, E. Shelhamer, and T. Darrell, “Fully convolutional networks for semantic segmentation.” in CVPR, 2015.

https://arxiv.org/pdf/1411.4038v2.pdf

[2]

★★★★★

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. "Semantic image segmentation with deep convolutional nets and fully connected crfs." In ICLR, 2015.

https://arxiv.org/pdf/1606.00915v1.pdf

[3]

★★★

Pinheiro, P.O., Collobert, R., Dollar, P. "Learning to segment object candidates." In: NIPS. 2015.

https://arxiv.org/pdf/1506.06204v2.pdf

[4]

★★★

Dai, J., He, K., Sun, J. "Instance-aware semantic segmentation via multi-task network cascades." in CVPR. 2016

https://arxiv.org/pdf/1512.04412v1.pdf

[5]

★★★

Dai, J., He, K., Sun, J. "Instance-sensitive Fully Convolutional Networks." arXiv preprint arXiv:1603.08678 (2016).

https://arxiv.org/pdf/1603.08678v1.pdf


