NLP Tutorials — Part 20: Compressive Transformer

This particular architecture has a lower memory requirement than the vanilla Transformer and, like Transformer-XL, models longer sequences efficiently. Instead of discarding old activations, it compresses them into a smaller, coarser memory. We can also draw a parallel to the human brain: we have a brilliant memory because we compress and store information very intelligently. This sure seems interesting, doesn't it?
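To make the idea of memory compression concrete, here is a minimal sketch in PyTorch. The `MemoryCompressor` class, the strided convolution and the compression rate of 3 are illustrative assumptions on my part, not the exact implementation from the paper or the article.

```python
import torch
import torch.nn as nn

class MemoryCompressor(nn.Module):
    """Sketch: squeeze old memory slots down by a factor of `rate`."""
    def __init__(self, d_model: int, rate: int = 3):
        super().__init__()
        # a 1D convolution with stride == rate maps every `rate` old slots to one compressed slot
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=rate, stride=rate)

    def forward(self, old_mem: torch.Tensor) -> torch.Tensor:
        # old_mem: (batch, seq_len, d_model) -> (batch, seq_len // rate, d_model)
        x = old_mem.transpose(1, 2)      # (batch, d_model, seq_len)
        compressed = self.conv(x)        # (batch, d_model, seq_len // rate)
        return compressed.transpose(1, 2)

# usage: 12 old memory slots become 4 compressed slots
mem = torch.randn(2, 12, 64)
print(MemoryCompressor(64, rate=3)(mem).shape)  # torch.Size([2, 4, 64])
```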

NLP Tutorials — Part 19: Longformer: Long Document Transformer

In this article, we will be discussing the Longformer, which overcomes one of the famous pitfalls of transformers: the inability to process long sequences, since self-attention scales quadratically with sequence length. The Longformer is essentially a vanilla transformer with a modified attention mechanism that combines local (sliding-window) self-attention with global attention on a few selected tokens.
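As a rough illustration of that attention pattern, the sketch below builds a combined sliding-window plus global attention mask in PyTorch. The function name, the window size and the choice of global token are assumptions for illustration only, not the library's internals.

```python
import torch

def longformer_attention_mask(seq_len: int, window: int, global_idx: list) -> torch.Tensor:
    """Boolean (seq_len, seq_len) mask: True where attention is allowed."""
    i = torch.arange(seq_len)
    # local sliding-window attention: each token attends to neighbours within +/- window // 2
    mask = (i[:, None] - i[None, :]).abs() <= window // 2
    # global attention: chosen tokens attend everywhere, and every token attends to them
    mask[global_idx, :] = True
    mask[:, global_idx] = True
    return mask

# e.g. treat position 0 ([CLS]) as the only global token
mask = longformer_attention_mask(seq_len=16, window=4, global_idx=[0])
print(mask.sum().item(), "allowed pairs instead of", 16 * 16)  # far fewer than full attention
```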

NLP Tutorials — Part 13: BERT

Welcome to an article on BERT. Yes, you heard it right! What a journey we have had, starting right from the basics all the way to BERT. We finally have the proficiency required to understand one of the most capable models on a variety of NLP tasks like Text Classification, Question Answering and Named Entity Recognition, with very little task-specific training. Bidirectional Encoder Representations from Transformers, or BERT, is a semi-supervised language model: it is pre-trained on a huge corpus of data and then fine-tuned on custom data to achieve SOTA results. Without wasting much time, let's jump straight into the technicalities of BERT.
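For a flavour of the fine-tuning step, here is a hedged sketch using the Hugging Face transformers library. The bert-base-uncased checkpoint, the two-label setup, the learning rate and the toy sentences are assumptions made for illustration, not details taken from the article.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# illustrative choices: checkpoint name and number of labels are assumptions
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

batch = tokenizer(["great movie", "terrible movie"], padding=True, return_tensors="pt")
labels = torch.tensor([1, 0])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = model(**batch, labels=labels).loss  # cross-entropy over the classification head
loss.backward()
optimizer.step()
```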

NLP Tutorials — Part 11: Transformers

Hello all, and welcome back to yet another interesting concept that has time and again proven to be one of the best methods for solving major NLP problems with state-of-the-art, near-human accuracy! That architecture is known as the "Transformer". The important gain from Transformers was parallelization, which wasn't on offer in the previous model we saw, Seq2Seq. In this blog, we shall navigate through the Transformer architecture in detail and understand why it is the breakthrough architecture of recent years.
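Before diving in, here is a minimal sketch of the scaled dot-product attention at the heart of the Transformer, written in PyTorch, to show how every position in the sequence is handled in a single matrix multiplication rather than step by step as in a Seq2Seq RNN. The tensor shapes are illustrative assumptions.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, computed for all positions at once."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq_len, seq_len)
    return torch.softmax(scores, dim=-1) @ v

# the whole sequence is processed in parallel, unlike an RNN that steps token by token
x = torch.randn(2, 10, 64)                 # (batch, seq_len, d_model)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)                           # torch.Size([2, 10, 64])
```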