NLP Tutorials — Part 26: Infinite Transformer

Hello and welcome back to the NLP Tutorials series. In the last two articles, we looked at a couple of NLP applications, NER and summarization, which show up in many real-world settings. In this article, we get back to understanding one of the latest and most interesting architectures: the Infinite-former, aka the Infinite Transformer! The paper proposes an attention mechanism that can attend to an unbounded long-term contextual memory, which is notable because most architectures that tackle the complexity and memory constraints of vanilla transformers still work with a bounded memory (longer sequences, but still limited). The authors also introduce sticky memories, which let the model cover very long contexts with a fixed computational budget.
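To make the idea concrete, here is a minimal numpy sketch of the core trick: an arbitrarily long discrete memory is regressed onto a fixed number of basis functions, so the stored representation stays the same size no matter how much context has been seen. The RBF basis, its width and the ridge term below are illustrative assumptions on my part, not the paper's exact recipe.

```python
import numpy as np

def rbf_basis(t, n_basis, width=0.05):
    """Evaluate n_basis Gaussian RBFs (centres spread over [0, 1]) at positions t."""
    centres = np.linspace(0.0, 1.0, n_basis)
    return np.exp(-((t[:, None] - centres[None, :]) ** 2) / (2 * width ** 2))

def compress_memory(X, n_basis, ridge=1e-3):
    """Fit ridge-regression coefficients B so that Phi @ B approximates the memory X.

    X: (L, d) discrete memory. Returns B: (n_basis, d), a fixed-size summary
    whose size does not grow with L -- the gist of an unbounded long-term memory.
    """
    L = X.shape[0]
    t = np.linspace(0.0, 1.0, L)          # map token positions onto [0, 1]
    Phi = rbf_basis(t, n_basis)           # (L, n_basis) design matrix
    A = Phi.T @ Phi + ridge * np.eye(n_basis)
    B = np.linalg.solve(A, Phi.T @ X)     # (n_basis, d) basis coefficients
    return B

# toy example: 10,000 token embeddings compressed into 64 coefficient vectors
rng = np.random.default_rng(0)
memory = rng.normal(size=(10_000, 32))
coeffs = compress_memory(memory, n_basis=64)
print(coeffs.shape)   # (64, 32) -- constant regardless of memory length
```

Attention can then read from this continuous representation instead of the raw token memory, which is why the compute stays fixed as the context grows.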

NLP Tutorials — Part 25: Text Summarization

Welcome back to another article in the NLP Tutorials series! Continuing our quest towards mastery in NLP, we will be looking at an exciting application: Text Summarization. At a time when data is being generated at a massive scale, we often want a short overview rather than the full text. This is where text summarization plays a key role, condensing a document into a concise form. It is a challenging problem to solve because it calls for language understanding, domain knowledge and a degree of cognitive reasoning. In this article, we shall have a brief overview of the types of text summarization and attempt to implement a basic model ourselves using the NLP concepts and libraries we have come across so far.
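As a tiny preview of what "basic" can look like, the sketch below scores sentences by their average TF-IDF weight and keeps the top ones, which is one simple flavour of extractive summarization. The scoring rule and the toy sentences are my own assumptions, not the exact model built in the article.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def extractive_summary(sentences, n_keep=2):
    """Score each sentence by its average TF-IDF weight across the vocabulary
    (a crude salience proxy) and keep the top n_keep sentences in original order."""
    tfidf = TfidfVectorizer().fit_transform(sentences)    # (n_sentences, vocab)
    scores = np.asarray(tfidf.mean(axis=1)).ravel()       # one score per sentence
    keep = sorted(np.argsort(scores)[::-1][:n_keep])      # top sentences, source order
    return " ".join(sentences[i] for i in keep)

sentences = [
    "Text summarization condenses a long document into a short overview.",
    "The weather today is pleasant.",
    "Extractive methods select the most informative sentences from the source.",
    "Abstractive methods generate new sentences that paraphrase the source.",
]
print(extractive_summary(sentences, n_keep=2))
```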

NLP Tutorials — Part 23: Fastformer: Additive Attention Can Be All You Need

Hello and welcome back to an article about an architecture that drew mixed reactions. Some called it brilliant, while others said, “Nah, this ain’t no transformer!” The architecture I’m talking about is Fastformer: Additive Attention Can Be All You Need. As we all know by now, Transformers become quite inefficient as sequence length grows, and we have seen a plethora of architectures that claim to mitigate this in their own ways.
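For a flavour of what "additive attention" means, here is a drastically simplified, single-head numpy sketch of the Fastformer-style computation: the sequence is summarised into one global query and one global key, so the cost grows linearly with sequence length. Multi-head handling and the learned output transform from the paper are omitted, and the weight vectors are random placeholders.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(Q, K, V, w_q, w_k):
    """Single-head sketch of Fastformer-style additive attention.

    Instead of the O(n^2) pairwise query-key interaction, the sequence is
    summarised into one global vector at each step, keeping the cost linear in n.
    """
    d = Q.shape[1]
    alpha = softmax(Q @ w_q / np.sqrt(d))   # (n,) attention over query positions
    global_q = alpha @ Q                    # (d,) global query vector
    P = global_q * K                        # (n, d) query-aware keys (element-wise)
    beta = softmax(P @ w_k / np.sqrt(d))    # (n,) attention over key positions
    global_k = beta @ P                     # (d,) global key vector
    return global_k * V + Q                 # (n, d) key-aware values + residual

rng = np.random.default_rng(0)
n, d = 8, 16
out = additive_attention(rng.normal(size=(n, d)), rng.normal(size=(n, d)),
                         rng.normal(size=(n, d)), rng.normal(size=d), rng.normal(size=d))
print(out.shape)   # (8, 16)
```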

NLP Tutorials — Part 21: Linformer: Self-attention with Linear Complexity 

The Transformer has been a breakthrough architecture, performing excellently in both NLP and Computer Vision, and learning about the architectures built on top of it is always beneficial in the long run. I promise a hands-on exercise in our next article, wherein we will take various architectures, pick a dataset and compare their performance. For now, let’s get on with Linformer.
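As a teaser, here is a single-head numpy sketch of the Linformer idea: learned projections compress the length-n key and value sequences down to a fixed length k, so the attention map is n by k rather than n by n. The random projection matrices and sizes below are purely illustrative stand-ins for the learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def linformer_attention(Q, K, V, E, F):
    """Single-head sketch of Linformer-style attention.

    E and F project the length-n key and value sequences down to a fixed
    length k, so the attention map is (n, k) instead of (n, n) -- linear in n.
    """
    d = Q.shape[1]
    K_proj = E @ K                                  # (k, d) compressed keys
    V_proj = F @ V                                  # (k, d) compressed values
    scores = softmax(Q @ K_proj.T / np.sqrt(d))     # (n, k) attention map
    return scores @ V_proj                          # (n, d) output

rng = np.random.default_rng(0)
n, k, d = 512, 64, 32
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
E, F = (rng.normal(size=(k, n)) for _ in range(2))
print(linformer_attention(Q, K, V, E, F).shape)   # (512, 32)
```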