Welcome back to yet another interesting article in the NLP Tutorials series, in which we advance our proficiency from beginner to expert in NLP. In this blog, we will look at an architecture that took the industry by storm. That’s right, it’s GPT (Generative Pre-Training)! GPT was published by OpenAI in 2018 and improved the state of the art on 9 of the 12 NLP tasks it was evaluated on. GPT is a way of training language models and falls under the category of semi-supervised learning: the model is first pre-trained on unlabeled text and then fine-tuned on supervised (labelled) data for downstream NLP tasks. Let’s dig deep and understand GPT in detail.
Hello and welcome back to the NLP Tutorials series, where we help you work your way up through the ranks of NLP expertise, all the way to expert. If you follow all the articles in this series, you will gain valuable technical knowledge in the NLP domain. In our previous articles, we took an in-depth look at BERT and at one of its accuracy-focused improvements. In this article we address a different and growing problem: the computational requirement for training these massive language models is getting out of hand.
Hello and welcome back to yet another interesting article in the NLP Tutorials series! We are here to explore a model that improves on the massively famous NLP language model, BERT. The Robustly Optimized BERT Pretraining Approach, or RoBERTa, performs roughly 15–20% better than BERT thanks to careful hyperparameter tuning and bigger datasets. The authors argued that BERT was significantly undertrained and that, given more data and better-tuned hyperparameters, it could reach its full potential. Let’s quickly get started and understand how the authors achieved this performance bump over conventional BERT.
Welcome to an article on BERT. Yes, you heard that right! What a journey we have had, starting right from the basics and going all the way to BERT. We finally have the proficiency required to understand one of the most capable models for a variety of NLP tasks, such as Text Classification, Question Answering, and Named Entity Recognition, with very little task-specific training. Bidirectional Encoder Representations from Transformers, or BERT, is a semi-supervised language model that is pre-trained on a huge corpus of data and then fine-tuned on custom data to achieve SOTA results. Without wasting much time, let’s jump straight into the technicalities of BERT.
Welcome back to yet another interesting article in our NLP Tutorials series. In this article we will talk about Transformer-XL, which outperformed the vanilla Transformer (Attention Is All You Need) both on accuracy metrics and in handling the long-term context dependencies that we often see in real-world tasks.