Word2Vec: A Comparison Between CBOW, SkipGram & SkipGramSI

In this article, we will look at how the different neural network architectures for training a Word2Vec model behave in practice. The idea is to help you make an informed decision on which architecture to use given the problem you are trying to solve. With Word2Vec, we train a neural network with a single hidden layer to predict a target word from its context (neighboring words), as in CBOW, or to predict the context words from a target word, as in SkipGram. The assumption is that the meaning of a word can be inferred from the company it keeps. The goal of training is not to use the resulting neural network itself. Instead, we extract the weights of the hidden layer, in the belief that these weights encode the meaning of the words in the vocabulary.
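
To make the two prediction directions concrete, here is a minimal pure-Python sketch of how training examples could be built from a sentence, and of why the hidden-layer weights act as word vectors. The function names (`cbow_pairs`, `skipgram_pairs`, `one_hot_lookup`), the window size, and the weight values are all illustrative assumptions, not part of any Word2Vec library, and no training takes place here.

```python
def cbow_pairs(tokens, window=2):
    """CBOW direction: (context words, target word) for each position."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        context = tokens[lo:i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens, window=2):
    """SkipGram direction: one (center word, context word) pair per neighbor."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        for ctx in tokens[lo:i] + tokens[i + 1:i + 1 + window]:
            pairs.append((center, ctx))
    return pairs


def one_hot_lookup(index, matrix):
    """Multiply a one-hot row vector by a weight matrix (list of rows)."""
    one_hot = [1.0 if r == index else 0.0 for r in range(len(matrix))]
    n_cols = len(matrix[0])
    return [sum(one_hot[r] * matrix[r][c] for r in range(len(matrix)))
            for c in range(n_cols)]


tokens = "the quick brown fox".split()

print(cbow_pairs(tokens, window=1)[1])    # (['the', 'brown'], 'quick')
print(skipgram_pairs(tokens, window=1)[:3])
# [('the', 'quick'), ('quick', 'the'), ('quick', 'brown')]

# Hypothetical, untrained input-to-hidden weights: one row per vocabulary word.
vocab = {word: i for i, word in enumerate(sorted(set(tokens)))}
W = [[0.1 * (i + 1), -0.1 * (i + 1)] for i in range(len(vocab))]

# Feeding a one-hot input through the hidden layer selects that word's row,
# which is why the rows of W can be read off as word vectors after training.
assert one_hot_lookup(vocab["fox"], W) == W[vocab["fox"]]
```

In a real model the rows of `W` would be learned by gradient descent on pairs like these; the sketch only shows the shape of the data and where the word vectors live.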

Source: kavita-ganesan