
Posts

Showing posts from September, 2022

El ministeri del futur

Recently, I read The Ministry for the Future, a science-fiction novel about the likely evolution of climate change and how humanity could turn its current fatal course into a prosperous future for the coming generations. The opening surprised me, because the author proposes a turning point much like the one I also foresee: we will probably not see structural change until something truly terrible happens. And that is exactly what happens in the book. A deadly combination of heat and humidity renders sweating useless, and millions of people die as a consequence in India around 2025. After the terrible event, India takes the wheel and decides to use geoengineering, since international action had failed to materialize for too long. This imitation of the chemical release of a volcano temporarily lowers the temperature, with unknown consequences, especially for rainfall patterns. …

FastAI deeplearning Part 14.2: Resnets Deep Dive

In the previous post, I covered convolutional neural networks in detail and the importance of batchnorm for achieving a stable training process and faster learning. In the following, I will deep dive into this issue: "In some convolutional net implementations, deeper models have led to higher training and validation error, despite the theoretical possibility of being at least as good as a shallower model." To overcome this, the authors of this paper propose a solution that has been widely used in convolutional neural networks since 2015: the concept of residual nets, or resnets, and skip connections. Let's get started! Resnets Resnets appeared after the observation that some deeper models were producing worse training and test error, which is not expected. The authors of the Resnet paper show how, even if two networks have the same weights on the same layers, the addition of further layers and their updates through SGD lead to lower performance, which …
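To make the skip-connection idea concrete, here is a minimal PyTorch sketch of a residual block; the channel count and layer choices are my own illustration, not the post's code.

```python
# A minimal sketch of a residual block with a skip connection,
# roughly in the spirit of He et al. (2015); sizes are my own choices.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # Two 3x3 convolutions with batchnorm, preserving spatial size.
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # The skip connection: add the input back before the final ReLU,
        # so the block learns a residual F(x) and outputs F(x) + x.
        return F.relu(out + x)

x = torch.randn(1, 64, 28, 28)
print(ResBlock(64)(x).shape)  # torch.Size([1, 64, 28, 28])
```

Because the block outputs F(x) + x, driving the convolution weights toward zero recovers the identity mapping, which is why, in principle, stacking such blocks should never make a deeper model worse than a shallower one.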

FastAI deeplearning Part 14.1: Convolutions Deep Dive

In the following post, I summarize the key ideas from the fastAI course while doing a deep dive into Convolutions (in 14.1) and Resnets (in 14.2). To complement the content where needed, I use the Deep Learning book by Ian Goodfellow et al. Let's get started! Kernels and Convolutions In computer vision problems, before the use of deep learning, one had to encode almost explicitly which features one wanted the computer to learn, such as vertical edges. For that purpose, a key operation was the convolution, where a kernel (a matrix aimed at searching for an edge, for example) was multiplied with the original pixel values in order to create the so-called feature map. As one can see from the image, the kernel detects where significant changes in the pixel values happen, which reveals the edges. Padding and outputs of convolutions One thing to note is that the output size after applying the kernel will shrink, and by how much depends on the kernel size, the …
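As a concrete illustration of the kernel and padding discussion, here is a small sketch using PyTorch's F.conv2d; the toy image and the kernel values are my own assumptions, not taken from the post.

```python
# A minimal sketch of applying a vertical-edge kernel with F.conv2d.
import torch
import torch.nn.functional as F

# Toy image: left half dark (0), right half bright (1).
img = torch.zeros(1, 1, 8, 8)
img[:, :, :, 4:] = 1.0

# Classic vertical-edge kernel: it responds where pixel values change
# horizontally and stays at zero over flat regions.
kernel = torch.tensor([[[[-1., 0., 1.],
                         [-1., 0., 1.],
                         [-1., 0., 1.]]]])

feature_map = F.conv2d(img, kernel)        # no padding: 8x8 shrinks to 6x6
padded = F.conv2d(img, kernel, padding=1)  # padding=1 keeps the 8x8 size
print(feature_map.shape, padded.shape)
```

The two shapes printed at the end show exactly the effect described above: without padding the output is smaller than the input, while a one-pixel border of padding preserves the original size for a 3x3 kernel.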

FastAI deeplearning Part 13.2: NLP deep dive, implementing poetry model from scratch

After getting a theoretical understanding of RNNs and LSTMs in the previous post, we review the theoretical foundations of RNNs, their challenges, and the contributions of LSTMs, foundational work for the Transformer and attention models that can be used through the Hugging Face Tutorial. Although one probably could train RNNs or LSTMs from scratch, one should not (as shown in this post) if the goal is to come close to state-of-the-art results; still, understanding the foundations and the innovations introduced on the way from a shallow RNN to an LSTM with regularization and dropout makes the architecture of attention models easier to grasp. In this post we try to implement from scratch a poetry model trained on the complete works of the great Spanish poet Garcia Lorca, killed by the fascists at the outbreak of the civil war. He is, without any doubt, one of the most remarkable poets of Spanish history and the 20th century; this is my humble homage to him. First part: loading the …
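As a hedged sketch of that first loading step, assuming the collected poems live in a plain-text file called lorca.txt (the file name and the word-level tokenization are my assumptions, not the post's code), the corpus could be numericalized like this:

```python
# A minimal sketch of loading and numericalizing a text corpus for
# language modeling; lorca.txt is a hypothetical file name.
from pathlib import Path
import torch

text = Path("lorca.txt").read_text(encoding="utf-8")

# Word-level vocabulary: map each token to an integer id.
tokens = text.split()
vocab = sorted(set(tokens))
stoi = {w: i for i, w in enumerate(vocab)}
ids = torch.tensor([stoi[w] for w in tokens])

# A language model learns to predict the next token, so inputs and
# targets are the same id stream shifted by one position.
seq_len = 16
xs = ids[:-1].unfold(0, seq_len, seq_len)   # (n_seqs, seq_len)
ys = ids[1:].unfold(0, seq_len, seq_len)
print(len(vocab), xs.shape, ys.shape)
```

From here, each (xs, ys) pair can be batched and fed to whatever recurrent model the rest of the post builds.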

FastAI deeplearning Part 13.1: NLP deep dive, review of RNNs theory

Due to the complexity of RNNs and my impression that the fastai course did not cover enough of the formalities (which is normal in a single notebook), I will start with the chapter on RNNs and LSTMs from the book Deep Learning (Ian Goodfellow et al) to consolidate the concepts before we jump into the code and case study. Brief theory on RNNs (source book) Recurrent neural networks are particularly useful for sequence data, such as text or, in general, any time-series data. One of the first ideas that allows the use of deep and large recurrent neural networks is the fact that parameters are shared across time steps, allowing for generalization independent of the exact position of the observation. Our computational graph (set of computations) will include these time cycles, where the present value affects what will be expected as a realization in the future. More formally: $s^{(t)} = f(s^{(t-1)}; \theta) = f(f(s^{(t-2)}; \theta); \theta)$ (1) This can be unfolded over many periods, creating the …
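A tiny numpy sketch of equation (1) may help, extended with an external input x(t) as the book does a few equations later; the shapes and the choice of tanh as f are my own illustration. The key point is that the same parameters are reused at every time step.

```python
# A minimal sketch of the recurrence s(t) = f(s(t-1), x(t); theta):
# the shared parameters (W, U, b) are applied at every time step,
# which is what lets the network generalize across positions.
import numpy as np

rng = np.random.default_rng(0)
hidden, features, steps = 4, 3, 10
W = rng.normal(size=(hidden, hidden))    # shared state-to-state weights
U = rng.normal(size=(hidden, features))  # shared input-to-state weights
b = np.zeros(hidden)

s = np.zeros(hidden)                     # initial state s(0)
xs = rng.normal(size=(steps, features))
for x in xs:
    # One step of the unfolded graph, with f = tanh of an affine map.
    s = np.tanh(W @ s + U @ x + b)
print(s)
```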