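Notes on the section about image understanding with convolutional neural networks. Since the start of the 21st century, convolutional neural networks (CNNs) have been applied successfully to detection, segmentation and recognition. These successes came mostly in areas where large amounts of labelled data are available.
Pixel-level recognition can be applied in autonomous robots, self-driving cars and many other areas. Other application areas include speech recognition and natural language understanding.
Until 2012, CNNs had not really taken off, but AlexNet changed that. A recent result connects a CNN for image recognition to an RNN for language processing in order to generate descriptions of images.
Advances in software and hardware have cut the training time of networks with a large number of parameters from several weeks to a few hours.
Because of these results, industry has started to invest as well. Many companies are doing research, and because CNNs are easy to implement on FPGAs (field-programmable gate arrays), companies such as NVIDIA and Qualcomm have carried out related work.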
Traffic sign recognition
53. Ciresan, D., Meier, U., Masci, J. & Schmidhuber, J. Multi-column deep neural network for traffic sign classification. Neural Networks 32, 333–338 (2012).
Biological image segmentation
54. Ning, F. et al. Toward automatic phenotyping of developing embryos from videos. IEEE Trans. Image Process. 14, 1360–1371 (2005).
Neural connectivity
55. Turaga, S. C. et al. Convolutional networks can learn to generate affinity graphs for image segmentation. Neural Comput. 22, 511–538 (2010).
Face detection, pedestrian detection, human body detection, etc.
36. Sermanet, P., Kavukcuoglu, K., Chintala, S. & LeCun, Y. Pedestrian detection with unsupervised multi-stage feature learning. In Proc. International Conference on Computer Vision and Pattern Recognition http://arxiv.org/abs/1212.0142 (2013).
50. Vaillant, R., Monrocq, C. & LeCun, Y. Original approach for the localisation of objects in images. In Proc. Vision, Image, and Signal Processing 141, 245–250 (1994).
51. Nowlan, S. & Platt, J. in Neural Information Processing Systems 901–908 (1995).
56. Garcia, C. & Delakis, M. Convolutional face finder: a neural architecture for fast and robust face detection. IEEE Trans. Pattern Anal. Machine Intell. 26, 1408–1423 (2004).
57. Osadchy, M., LeCun, Y. & Miller, M. Synergistic face detection and pose estimation with energy-based models. J. Mach. Learn. Res. 8, 1197–1215 (2007).
58. Tompson, J., Goroshin, R. R., Jain, A., LeCun, Y. Y. & Bregler, C. C. Efficient object localization using convolutional networks. In Proc. Conference on Computer Vision and Pattern Recognition http://arxiv.org/abs/1411.4280 (2014).
Face recognition
59. Taigman, Y., Yang, M., Ranzato, M. & Wolf, L. Deepface: closing the gap to human-level performance in face verification. In Proc. Conference on Computer Vision and Pattern Recognition 1701–1708 (2014).
Self-driving cars using CNNs
60. Hadsell, R. et al. Learning long-range vision for autonomous off-road driving. J. Field Robot. 26, 120–144 (2009).
61. Farabet, C., Couprie, C., Najman, L. & LeCun, Y. Scene parsing with multiscale feature learning, purity trees, and optimal covers. In Proc. International Conference on Machine Learning http://arxiv.org/abs/1202.2160 (2012).
Natural language understanding
14. Collobert, R. et al. Natural language processing (almost) from scratch. J. Mach. Learn. Res. 12, 2493–2537 (2011).
Speech recognition
7. Sainath, T., Mohamed, A.-R., Kingsbury, B. & Ramabhadran, B. Deep convolutional neural networks for LVCSR. In Proc. Acoustics, Speech and Signal Processing 8614–8618 (2013).
AlexNet
1. Krizhevsky, A., Sutskever, I. & Hinton, G. ImageNet classification with deep convolutional neural networks. In Proc. Advances in Neural Information Processing Systems 25 1090–1098 (2012). This report was a breakthrough that used convolutional nets to almost halve the error rate for object recognition, and precipitated the rapid adoption of deep learning by the computer vision community.
dropout
62. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Machine Learning Res. 15, 1929–1958 (2014).
Recognition and detection
4. Szegedy, C. et al. Going deeper with convolutions. Preprint at http://arxiv.org/abs/1409.4842 (2014).
58. Tompson, J., Goroshin, R. R., Jain, A., LeCun, Y. Y. & Bregler, C. C. Efficient object localization using convolutional networks. In Proc. Conference on Computer Vision and Pattern Recognition http://arxiv.org/abs/1411.4280 (2014).
59. Taigman, Y., Yang, M., Ranzato, M. & Wolf, L. Deepface: closing the gap to human-level performance in face verification. In Proc. Conference on Computer Vision and Pattern Recognition 1701–1708 (2014).
63. Sermanet, P. et al. Overfeat: integrated recognition, localization and detection using convolutional networks. In Proc. International Conference on Learning Representations http://arxiv.org/abs/1312.6229 (2014).
64. Girshick, R., Donahue, J., Darrell, T. & Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proc. Conference on Computer Vision and Pattern Recognition 580–587 (2014).
65. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. In Proc. International Conference on Learning Representations http://arxiv.org/abs/1409.1556 (2014).
FPGAs and CNNs
66. Boser, B., Sackinger, E., Bromley, J., LeCun, Y. & Jackel, L. An analog neural network processor with programmable topology. J. Solid State Circuits 26, 2017–2025 (1991).
67. Farabet, C. et al. Large-scale FPGA-based convolutional networks. In Scaling up Machine Learning: Parallel and Distributed Approaches (eds Bekkerman, R., Bilenko, M. & Langford, J.) 399–419 (Cambridge Univ. Press, 2011).
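Feature representations and language processing
Deep-learning theory shows that deep networks have advantages over learning algorithms that do not use distributed representations. These advantages depend on the regularities and structure hidden beneath the data distribution.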
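A network with multiple hidden layers learns representations that make it easier to predict the output from a local input context. Each word is presented as an N-dimensional one-of-N vector, in which a single component is 1 and all the others are 0. In the first layer, each word produces a different pattern of activations, that is, a word vector. In a language model, the other layers of the network learn to convert the input word vectors into an output vector for the predicted word. The network learns word vectors with many active components, each of which can be read as a separate feature of the word. These semantic features are not explicitly present in the input; they are discovered during learning as a way of breaking the relationship between input and output into many small rules. Learning word vectors as feature representations works better and better as the amount of text data grows.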
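The one-of-N input and the first-layer word vectors described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the model from ref. 71: the toy vocabulary, the sizes `D` and `CONTEXT`, and the helper names `one_hot`, `word_vector` and `next_word_probs` are all hypothetical, and the weights are random rather than trained, so the predicted probabilities are meaningless. What it does show is that multiplying a one-of-N vector by the first-layer matrix simply selects that word's row, and that the later layer turns the context word vectors into a distribution over the next word.

```python
import numpy as np

# Hypothetical toy vocabulary; a real model would be trained on a large corpus.
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_id = {w: i for i, w in enumerate(vocab)}
V = len(vocab)   # vocabulary size = length of each one-of-N vector
D = 3            # word-vector dimension (assumed for illustration)
CONTEXT = 2      # number of earlier words used as local context (assumed)

rng = np.random.default_rng(0)
E = rng.normal(size=(V, D))            # first layer: row i is the vector of word i
W = rng.normal(size=(CONTEXT * D, V))  # output layer: features -> one score per word


def one_hot(word):
    """One-of-N encoding: a single component is 1, all the others are 0."""
    v = np.zeros(V)
    v[word_to_id[word]] = 1.0
    return v


def word_vector(word):
    """Multiplying the one-of-N vector by E selects that word's row of E."""
    return one_hot(word) @ E


def next_word_probs(context):
    """Untrained forward pass: context word vectors -> softmax over the vocabulary."""
    if len(context) != CONTEXT:
        raise ValueError(f"expected {CONTEXT} context words")
    h = np.tanh(np.concatenate([word_vector(w) for w in context]))
    scores = h @ W
    exp = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return exp / exp.sum()


probs = next_word_probs(["the", "cat"])
print(dict(zip(vocab, probs.round(3))))
```

In a trained model, `E` and `W` would be learned by backpropagation so that words appearing in similar contexts end up with similar rows in `E`.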
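Before neural networks were applied to language, the standard statistical approach did not use distributed representations; it was based on counting the frequencies of word sequences of length up to N (N-grams). This requires a very large amount of training data, and it cannot generalize across related sequences of words.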
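A minimal sketch of the counting approach makes the generalization problem concrete. The two-sentence corpus and the helper names `ngram_counts` and `bigram_prob` are hypothetical, and the estimate is a plain maximum-likelihood ratio with no smoothing. Because each word is treated as an atomic symbol, a word pair that never occurred in the training text gets probability zero, even when a closely related pair was seen.

```python
from collections import Counter

# Hypothetical toy corpus: "cat sat" occurs, "dog sat" never does.
corpus = "the cat sat on the mat . the dog lay on the rug .".split()


def ngram_counts(tokens, n):
    """Count every contiguous sequence of n tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


unigrams = ngram_counts(corpus, 1)
bigrams = ngram_counts(corpus, 2)


def bigram_prob(prev, word):
    """Maximum-likelihood estimate count(prev, word) / count(prev), no smoothing."""
    denom = unigrams[(prev,)]
    return bigrams[(prev, word)] / denom if denom else 0.0


print(bigram_prob("cat", "sat"))  # pair seen in the corpus
print(bigram_prob("dog", "sat"))  # pair never seen: the count-based estimate is zero

# The table of possible N-grams grows as V**N, which is why long contexts
# need very large corpora.
V = len(unigrams)
print(V, V ** 2, len(bigrams))    # vocabulary size, possible bigrams, observed bigrams
```

A model built on word vectors can instead give "dog sat" a reasonable probability if "dog" and "cat" have similar vectors, which is the generalization that raw counts lack.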
distributed representations
21. Bengio, Y., Delalleau, O. & Le Roux, N. The curse of highly variable functions for local kernel machines. In Proc. Advances in Neural Information Processing Systems 18 107–114 (2005).
Overall structure underlying the data distribution
40. Bengio, Y., Courville, A. & Vincent, P. Representation learning: a review and new perspectives. IEEE Trans. Pattern Anal. Machine Intell. 35, 1798–1828 (2013).
Distributed representations improve generalization
68. Bengio, Y. Learning Deep Architectures for AI (Now, 2009).
69. Montufar, G. & Morton, J. When does a mixture of products contain a product of mixtures? J. Discrete Math. 29, 321–347 (2014).
Depth increases expressive power
70. Montufar, G. F., Pascanu, R., Cho, K. & Bengio, Y. On the number of linear regions of deep neural networks. In Proc. Advances in Neural Information Processing Systems 27 2924–2932 (2014).
Predicting the next output from a local input
71. Bengio, Y., Ducharme, R. & Vincent, P. A neural probabilistic language model. In Proc. Advances in Neural Information Processing Systems 13 932–938 (2001). This paper introduced neural language models, which learn to convert a word symbol into a word vector or word embedding composed of learned semantic features in order to predict the next word in a sequence.
The network learns different activation features for each word
27. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).
With a large corpus, individual rules are unreliable, so many rules are needed
71. Bengio, Y., Ducharme, R. & Vincent, P. A neural probabilistic language model. In Proc. Advances in Neural Information Processing Systems 13 932–938 (2001). This paper introduced neural language models, which learn to convert a word symbol into a word vector or word embedding composed of learned semantic features in order to predict the next word in a sequence.
Vector representations of text
14. Collobert, R. et al. Natural language processing (almost) from scratch. J. Mach. Learn. Res. 12, 2493–2537 (2011).
17. Sutskever, I., Vinyals, O. & Le, Q. V. Sequence to sequence learning with neural networks. In Proc. Advances in Neural Information Processing Systems 27 3104–3112 (2014). This paper showed state-of-the-art machine translation results with the architecture introduced in ref. 72, with a recurrent network trained to read a sentence in on
72. Cho, K. et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proc. Conference on Empirical Methods in Natural Language Processing 1724–1734 (2014).
73. Schwenk, H. Continuous space language models. Computer Speech Lang. 21, 492–518 (2007).
74. Socher, R., Lin, C. C-Y., Manning, C. & Ng, A. Y. Parsing natural scenes and natural language with recursive neural networks. In Proc. International Conference on Machine Learning 129–136 (2011).
75. Mikolov, T., Sutskever, I., Chen, K., Corrado, G. & Dean, J. Distributed representations of words and phrases and their compositionality. In Proc. Advances in Neural Information Processing Systems 26 3111–3119 (2013).
76. Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. In Proc. International Conference on Learning Representations http://arxiv.org/abs/1409.0473 (2015).
Neural network language models
71. Bengio, Y., Ducharme, R. & Vincent, P. A neural probabilistic language model. In Proc. Advances in Neural Information Processing Systems 13 932–938 (2001). This paper introduced neural language models, which learn to convert a word symbol into a word vector or word embedding composed of learned semantic features in order to predict the next word in a sequence.
2016.4.12 Nature deep learning review [2]
Original article: http://blog.csdn.net/zhaohui1995_yang/article/details/51346722