Introduction to Deep Learning – Advanced Millennium Technologies

Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on artificial neural networks. Learning can be supervised, semi-supervised or unsupervised.

Deep learning architectures such as deep neural networks, deep belief networks, recurrent neural networks and convolutional neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to and in some cases superior to human experts.

Artificial Neural Networks (ANNs) were inspired by information processing and distributed communication nodes in biological systems. ANNs differ from biological brains in several ways. Specifically, neural networks tend to be static and symbolic, while the biological brain of most living organisms is dynamic (plastic) and analog.

Deep neural networks are generally interpreted in terms of the universal approximation theorem or probabilistic inference.

The classic universal approximation theorem concerns the capacity of feedforward neural networks with a single hidden layer of finite size to approximate continuous functions. In 1989, the first proof was published by George Cybenko for sigmoid activation functions, and it was generalised to feed-forward multi-layer architectures in 1991 by Kurt Hornik. Recent work showed that universal approximation also holds for non-bounded activation functions such as the rectified linear unit (ReLU).
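As a rough illustration of the single-hidden-layer case, the sketch below builds a one-hidden-layer ReLU network by hand (no training) that interpolates a continuous function at evenly spaced points. The function names (`build_relu_interpolant`, `network`, `max_error`) and the choice of `sin` on [0, 2π] are assumptions made for this example; it covers only one-dimensional inputs and is not the construction used in the proofs mentioned above.

```python
import math


def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def build_relu_interpolant(f, a, b, n_hidden):
    """Return (bias, knots, coeffs) for a one-hidden-layer ReLU network
    that matches f at n_hidden + 1 evenly spaced points on [a, b]."""
    if n_hidden < 1:
        raise ValueError("n_hidden must be at least 1")
    h = (b - a) / n_hidden
    knots = [a + i * h for i in range(n_hidden)]          # one knot per hidden unit
    values = [f(a + i * h) for i in range(n_hidden + 1)]  # target values at grid points
    slopes = [(values[i + 1] - values[i]) / h for i in range(n_hidden)]
    # Each output weight is the change in slope at that unit's knot.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_hidden)]
    return values[0], knots, coeffs


def network(x, bias, knots, coeffs):
    """Evaluate the network: bias + sum_i coeffs[i] * relu(x - knots[i])."""
    return bias + sum(c * relu(x - k) for k, c in zip(knots, coeffs))


def max_error(f, a, b, n_hidden, n_test=1000):
    """Largest absolute error of the interpolant on a test grid over [a, b]."""
    params = build_relu_interpolant(f, a, b, n_hidden)
    xs = [a + (b - a) * i / n_test for i in range(n_test + 1)]
    return max(abs(f(x) - network(x, *params)) for x in xs)


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, max_error(math.sin, 0.0, 2 * math.pi, n))
```

Under these assumptions, the worst-case error on `sin` shrinks as hidden units are added, which is the kind of behaviour the theorem guarantees is achievable for continuous functions on a bounded interval.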

The universal approximation theorem for deep neural networks concerns the capacity of networks whose width is bounded but whose depth is allowed to grow. Lu et al. proved that if the width of a deep neural network with ReLU activation is strictly larger than the input dimension, then the network can approximate any Lebesgue integrable function; if the width is smaller than or equal to the input dimension, then a deep neural network is not a universal approximator.
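To give a feel for what depth adds when width stays fixed, here is a minimal sketch using the well-known tent-map example: each layer has only two ReLU units (width 2, input dimension 1), and stacking layers doubles the number of peaks in the output. This is not the construction from Lu et al., and the names `tent_layer`, `deep_tent` and `count_peaks` are hypothetical helpers chosen for this illustration.

```python
def relu(z):
    """Rectified linear unit."""
    return max(0.0, z)


def tent_layer(x):
    """One width-2 ReLU layer with a linear read-out.

    On [0, 1] this computes the tent map: 2x for x <= 0.5, 2 - 2x otherwise.
    """
    h1 = relu(x)        # hidden unit 1
    h2 = relu(x - 0.5)  # hidden unit 2
    return 2.0 * h1 - 4.0 * h2


def deep_tent(x, depth):
    """Apply the same width-2 layer `depth` times."""
    for _ in range(depth):
        x = tent_layer(x)
    return x


def count_peaks(depth):
    """Count grid points i / 2**depth in [0, 1] where the output equals 1."""
    n = 2 ** depth
    return sum(1 for i in range(n + 1) if deep_tent(i / n, depth) == 1.0)


if __name__ == "__main__":
    for depth in (1, 2, 3, 5):
        print(depth, count_peaks(depth))
```

The width never changes, yet the number of peaks grows as 2^(depth - 1), so a narrow network can represent increasingly detailed functions simply by getting deeper.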

The probabilistic interpretation derives from the field of machine learning. It features inference, as well as the optimization concepts of training and testing, related to fitting and generalization, respectively. More specifically, the probabilistic interpretation considers the activation nonlinearity as a cumulative distribution function. The probabilistic interpretation led to the introduction of dropout as a regularizer in neural networks. It was introduced by researchers including Hopfield, Widrow and Narendra, and popularized in surveys such as the one by Bishop.
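The two ideas in this paragraph can be sketched in a few lines. The logistic sigmoid is the cumulative distribution function of the standard logistic distribution, and dropout randomly zeroes activations during training. The version below is the commonly used "inverted" dropout, which rescales the surviving activations so their expected value is unchanged; the function names, the `p_drop` parameter and the seeded generator are assumptions made for this example, not an API from any particular library.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid, which is also the CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def dropout(activations, p_drop, training, rng):
    """Inverted dropout on a list of activations.

    During training each activation is kept with probability 1 - p_drop and
    scaled by 1 / (1 - p_drop); at test time the input is returned unchanged.
    """
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    if not training or p_drop == 0.0:
        return list(activations)
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    hidden = [sigmoid(z) for z in (-2.0, -1.0, 0.0, 1.0, 2.0)]
    print(dropout(hidden, p_drop=0.5, training=True, rng=rng))
    print(dropout(hidden, p_drop=0.5, training=False, rng=rng))
```

Because a different random subset of units is silenced on each training step, the network cannot rely on any single unit, which is the regularizing effect the text refers to.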

Deep learning revolution:

In 2012, a team led by George E. Dahl won the “Merck Molecular Activity Challenge” using multi-task deep neural networks to predict the biomolecular target of one drug. In 2014, Hochreiter’s group used deep learning to detect off-target and toxic effects of environmental chemicals in nutrients, household products and drugs and won the “Tox21 Data Challenge” of NIH, FDA and NCATS.

Significant additional impacts in image or object recognition were felt from 2011 to 2012. Although CNNs trained by backpropagation had been around for decades, and GPU implementations of NNs for years, including CNNs, fast implementations of CNNs with max-pooling on GPUs in the style of Ciresan and colleagues were needed to progress on computer vision. In 2011, this approach achieved for the first time superhuman performance in a visual pattern recognition contest. Also in 2011, it won the ICDAR Chinese handwriting contest, and in May 2012, it won the ISBI image segmentation contest. Until 2011, CNNs did not play a major role at computer vision conferences, but in June 2012, a paper by Ciresan et al. at the leading conference CVPR showed how max-pooling CNNs on GPU can dramatically improve many vision benchmark records. In October 2012, a similar system by Krizhevsky et al. won the large-scale ImageNet competition by a significant margin over shallow machine learning methods. In November 2012, Ciresan et al.’s system also won the ICPR contest on analysis of large medical images for cancer detection, and in the following year also the MICCAI Grand Challenge on the same topic. In 2013 and 2014, the error rate on the ImageNet task using deep learning was further reduced, following a similar trend in large-scale speech recognition. The Wolfram Image Identification project publicized these improvements.

Image classification was then extended to the more challenging task of generating descriptions (captions) for images, often as a combination of CNNs and LSTMs.

Some researchers assess that the October 2012 ImageNet victory anchored the start of a “deep learning revolution” that has transformed the AI industry.

In March 2019, Yoshua Bengio, Geoffrey Hinton and Yann LeCun were awarded the Turing Award for conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.

The above is a brief overview of deep learning. Watch this space for more updates on the latest trends in technology.
