Hello again, and welcome back to the NLP Tutorials series with this article on Text Classification. In our previous posts we took a detailed look at the fundamental text representations (CountVectorizer and Tf-Idf Vectorizer) and at the two most prominent Word Embeddings (Word2Vec and GloVe). In this article we will put that knowledge to the test: we will build a Text Classification model using all of these techniques and analyse the results.
Welcome back to the NLP Tutorials! In our previous posts we took a detailed look at Text Representation and Word Embeddings, which are ways to accurately convert text into vector form. A corpus in vector form is easy to store and access, and can then be used to solve the NLP problem at hand. In this article, we shall try our hand at a small NLP problem: Document Similarity/Text Similarity. Without further ado, let’s get started!
Hello and welcome back to the NLP Tutorials series! Today we will move forward on the road to becoming proficient in NLP and delve into Text Representation and Word Embeddings. To put it in simple terms, Text Representation is a way to convert text from its natural form into vector form, because machines can only work with text as numbers/vectors. This is the second step in an NLP pipeline, after Text Pre-processing. Let’s get started with a sample corpus, pre-process it, and keep it ready for Text Representation.
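To make the idea of "text to vectors" concrete, here is a minimal sketch of a bag-of-words count representation in plain Python. The three-sentence corpus and the helper names `preprocess` and `bag_of_words` are made up for illustration only, and the pre-processing is deliberately simplistic (lowercasing and keeping alphabetic tokens); a library vectorizer would do more than this.

```python
import re
from collections import Counter

# Hypothetical three-document corpus, used only for illustration.
corpus = [
    "Machines like numbers.",
    "Text must become numbers!",
    "Machines read text as vectors.",
]

def preprocess(document):
    """Lowercase the text and keep only alphabetic tokens."""
    return re.findall(r"[a-z]+", document.lower())

def bag_of_words(documents):
    """Return a sorted vocabulary and one count vector per document."""
    tokenized = [preprocess(doc) for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # One entry per vocabulary word: how often it occurs in this document.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

vocabulary, vectors = bag_of_words(corpus)
print(vocabulary)
for vector in vectors:
    print(vector)
```

Each document becomes a list of counts with one position per vocabulary word, which is the simplest form of the "numbers/vectors" representation described above.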