Warm welcome to another interesting article in the NLP Tutorials series. In this article we will try to understand the encoder-decoder architecture, which forms the base for advanced models such as attention, Transformers, GPT, and BERT, and is widely used in machine translation tasks. The encoder encodes the input into a fixed-length internal representation, which the decoder then uses to generate words in another form or language. Nowadays we even see multilingual tasks handled by a single model, e.g., translating English text into French, Spanish, and German with one model! Since both the input and the output are text in the form of sequences, this architecture is popularly known as Seq2Seq.
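
To make the encoder-decoder idea concrete, here is a minimal sketch, assuming PyTorch and hypothetical vocabulary sizes and dimensions: a GRU encoder compresses the source sentence into a fixed-length hidden state, and a GRU decoder conditions on that state to produce target-language tokens. This is an illustrative toy, not a production translation model.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids
        embedded = self.embedding(src)
        _, hidden = self.rnn(embedded)
        # hidden: (1, batch, hidden_dim) -- the fixed-length summary of the input
        return hidden

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tgt, hidden):
        # tgt: (batch, tgt_len) previously generated target tokens
        embedded = self.embedding(tgt)
        outputs, hidden = self.rnn(embedded, hidden)
        # logits over the target vocabulary at every decoding step
        return self.out(outputs), hidden

# Toy usage with hypothetical vocab sizes (source: 1000 tokens, target: 1200 tokens)
src = torch.randint(0, 1000, (2, 5))   # batch of 2 source sentences, length 5
tgt = torch.randint(0, 1200, (2, 6))   # batch of 2 target sentences, length 6
encoder, decoder = Encoder(1000), Decoder(1200)
context = encoder(src)                 # fixed-length context vector
logits, _ = decoder(tgt, context)      # predictions for each target position
print(logits.shape)                    # torch.Size([2, 6, 1200])
```

In a real training loop the decoder would be fed the shifted target sequence (teacher forcing) and trained with cross-entropy against the next token; at inference time it would generate one token at a time, feeding each prediction back in as the next input.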