Hello and welcome back to the NLP Tutorials! In our previous article we discussed one of the most popular word embedding techniques: Word2Vec. It was a revolutionary word representation technique that changed how NLP problems are solved. Although Word2Vec works well, it has a few drawbacks that GloVe word embeddings were designed to address. GloVe stands for Global Vectors. The model is based on capturing word co-occurrence statistics across the whole corpus, rather than only within a local context window. Because it collects these statistics at a global level, the co-occurrence matrix it builds is high-dimensional and memory intensive, but the resulting embeddings give strong results on a majority of NLP tasks. Let’s quickly get into the details of GloVe embeddings.
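To make "global statistics" a little more concrete, here is a minimal sketch of counting word co-occurrences over a whole corpus, which is the kind of table GloVe starts from. The function name `cooccurrence_counts`, the toy corpus, and the window size are assumptions made up for illustration. The counts here are also unweighted, so treat this as a simplified picture and not as the actual GloVe preprocessing.

```python
from collections import Counter

def cooccurrence_counts(sentences, window=2):
    """Count how often each (word, context word) pair appears
    within `window` tokens of each other, across all sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            # Look at up to `window` tokens on either side of position i.
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[(word, tokens[j])] += 1
    return counts

# Toy corpus (hypothetical); a real corpus would have millions of tokens.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
counts = cooccurrence_counts(corpus, window=2)
print(counts[("sat", "on")])
```

Even in this tiny example the table has one entry per word pair that ever co-occurs, which hints at why the full matrix grows quickly with vocabulary size.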
Welcome back to the NLP Tutorials! I hope you all had a good time reading my previous articles and were able to make progress in your journey to NLP proficiency. In our previous post we looked at a project, Document Similarity, using two vectorizers: CountVectorizer and the TF-IDF vectorizer. I hope you tried your hand at document similarity with other techniques and datasets as well. In this article we dive into the world of text embeddings, which are more advanced and sophisticated ways of representing text in vector form. There are many word embeddings out there, but here we will take an overview of Word2Vec, one of the earliest and most famous word embeddings, developed and published by Google. Let’s get started!