Hello all, and welcome back to another interesting concept, one that has time and again proven to be among the best methods for solving major NLP problems, with state-of-the-art accuracy that comes close to human performance! That architecture is known as the “Transformer”. Its most important gain is parallelization, which wasn’t on offer in the previous model we saw, “Seq2Seq”. In this blog, we shall navigate through the Transformer architecture in detail and understand why it has been the breakthrough architecture of recent years.
A warm welcome to another interesting article in the NLP Tutorials series. In this article we will try to understand an architecture that forms the base for more advanced ideas and models such as Attention, Transformers, GPT, and BERT. It is widely used in machine translation tasks. The encoder encodes the input into a fixed-length internal representation, which the decoder then takes to output words in another form or language. Nowadays, we are seeing multilingual tasks being performed with a single model, e.g. text translation from English to French, Spanish, and German using one model! Since the input and output are both sequences of text, this architecture is popularly known as Seq2Seq.
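To make the encoder/decoder idea concrete, here is a minimal sketch in plain Python. It is not the model from the article: the weights are random and untrained, the vocabulary sizes, `START_ID`, `END_ID`, and all function names are assumptions made up for illustration, and a simple tanh RNN step stands in for whatever cell a real Seq2Seq model would use. The point is only to show that the encoder squeezes an input of any length into one fixed-length vector, and that the decoder starts from that vector to emit output tokens one at a time.

```python
import math
import random

# Hypothetical sizes and special tokens, chosen only for this sketch.
SRC_VOCAB, TGT_VOCAB, HIDDEN = 6, 5, 4
START_ID, END_ID = 0, 1

def make_matrix(rows, cols, rng):
    """Random (untrained) weight matrix as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]

def rnn_step(w_in, w_h, x, h):
    """One plain RNN step: new_h = tanh(w_in @ x + w_h @ h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_h, h))]

def encode(token_ids, w_in, w_h):
    """Read the input token by token; the final hidden state is the context."""
    h = [0.0] * HIDDEN
    for token in token_ids:
        h = rnn_step(w_in, w_h, one_hot(token, SRC_VOCAB), h)
    return h  # always HIDDEN numbers, whatever the input length

def decode(context, w_in, w_h, w_out, max_len=10):
    """Start from the context and greedily pick one output token per step."""
    h, token, output = context, START_ID, []
    for _ in range(max_len):
        h = rnn_step(w_in, w_h, one_hot(token, TGT_VOCAB), h)
        scores = matvec(w_out, h)
        token = max(range(TGT_VOCAB), key=lambda i: scores[i])
        if token == END_ID:
            break
        output.append(token)
    return output

rng = random.Random(0)
enc_w_in, enc_w_h = make_matrix(HIDDEN, SRC_VOCAB, rng), make_matrix(HIDDEN, HIDDEN, rng)
dec_w_in, dec_w_h = make_matrix(HIDDEN, TGT_VOCAB, rng), make_matrix(HIDDEN, HIDDEN, rng)
dec_w_out = make_matrix(TGT_VOCAB, HIDDEN, rng)

context_short = encode([2, 3], enc_w_in, enc_w_h)
context_long = encode([2, 3, 4, 5, 2, 3, 4], enc_w_in, enc_w_h)
translated = decode(context_long, dec_w_in, dec_w_h, dec_w_out)
print(len(context_short), len(context_long), translated)
```

Both contexts have the same length even though the inputs differ in length, which is exactly the "fixed-length internal representation" described above. Because nothing is trained here, the decoded token ids are meaningless; in a real model the weights would be learned from paired source and target sentences.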
Hola! Welcome back to the follow-up article on LSTMs. In this article we shall discuss two more architectures that are very similar to LSTMs: Bi-LSTMs and GRUs (Gated Recurrent Units). As we saw in our previous article, the LSTM was able to solve most of the problems of vanilla RNNs and, given good data, handle a few important NLP problems well. The Bi-LSTM and GRU can be treated as architectures that evolved from LSTMs. The core idea remains the same, with a few improvements here and there.
Hi and welcome back to yet another article in the NLP Tutorials series. So far, we have covered a few concepts, architectures, and projects that are important in the NLP domain, the latest being Recurrent Neural Networks (RNNs). Time to move a step ahead and understand a more advanced architecture that performs considerably better than RNNs. You heard it right: Long Short-Term Memory (LSTM) networks. Without wasting time, let us first go through a few disadvantages of RNNs and what did not work for them, which in turn will set the right context for understanding the LSTM and how it was able to solve the problems of RNNs.
Welcome back to another exciting article in the NLP Tutorials series. It's time to fully delve into Deep Learning for NLP! In NLP, it is very important that we remember things and retain the context well. For example, as humans we learn progressively, word by word and sentence by sentence, just as you are doing right now. We understand things by reading and thinking progressively, and we need a memory to retain things and maintain the context for that particular task.