Written by Prashant Basnet
👋 Welcome to my Signature, a space between logic and curiosity.
I’m a Software Development Engineer who loves turning ideas into systems that work beautifully.
This space captures the process: the bugs, breakthroughs, and “aha” moments that keep me building.
Semantics & Sentiment Analysis:
Word2vec:
Word2vec trains each word against the words that neighbour it in the input corpus. It does so in one of two ways: continuous bag-of-words (CBOW), which predicts a word from its surrounding context, or skip-gram, which predicts the surrounding context from a word.
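To make "training against neighbours" concrete, here is a small sketch of how (target, context) pairs are generated from a corpus. The corpus and window size are invented for illustration, not from a real trained model:

```python
def context_pairs(tokens, window=2):
    """Pair every word with its neighbours within `window` positions."""
    pairs = []
    for i, target in enumerate(tokens):
        # Look at up to `window` words on each side of the target word.
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

corpus = "the king rules the land".split()
pairs = context_pairs(corpus, window=1)
# Each word ends up paired with its immediate neighbours,
# e.g. ("king", "the") and ("king", "rules").
```

In skip-gram these pairs become (input, label) training examples; CBOW instead groups all context words for one target into a single example.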
If you try to implement word2vec yourself, training takes a very long time on a large corpus, so pretrained word2vec embeddings are usually used instead.
If you do want to train your own word-to-vector model, you can theoretically choose anywhere between 100 and 1000 dimensions; 300 is a common choice.
Since each word is mapped to a vector in this 300-dimensional space, we can use cosine similarity to measure how similar word vectors are to each other.
Cosine similarity measures the angle between two vectors: the smaller the angle, the more alike the vectors.
A simple diagram shows this in 2-dimensional space, but the idea extends to N dimensions.
In our case, we'll take several 300-dimensional vectors and calculate the cosine similarity between them to see which vectors are most similar to each other; here each vector represents a word.
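A minimal NumPy sketch of that comparison, using toy 3-dimensional vectors as stand-ins for real 300-dimensional embeddings (the values are invented for illustration):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 = same direction)."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy vectors: "cat" and "dog" deliberately point in similar directions.
cat = np.array([0.9, 0.8, 0.1])
dog = np.array([0.8, 0.9, 0.2])
car = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(cat, dog))  # close to 1.0
print(cosine_similarity(cat, car))  # much smaller
```

With real embeddings the only change is the source of the vectors; the similarity computation is identical.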
This also means we can perform vector arithmetic with the word vectors: we can calculate a brand-new vector by adding or subtracting existing ones.
So I can take:

Vector(King) - Vector(Man) + Vector(Woman)

and then attempt to find the most similar existing vector to this new vector. The closest existing vector could be Queen. Essentially:

Vector(King) - Vector(Man) + Vector(Woman) = Vector(Queen)
So this is able to establish really interesting relationships between the word vectors, including the male/female relationship, or even the dimension of verb tense: walking is to walk as swimming is to swim.
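The king − man + woman analogy can be sketched end-to-end with hand-made toy vectors. These 2-dimensional values are invented so that a "royalty" axis and a "gender" axis are easy to see; a trained model learns such directions implicitly in 300 dimensions:

```python
import numpy as np

# Invented toy vectors: axis 0 ≈ royalty, axis 1 ≈ maleness.
vocab = {
    "king":  np.array([0.9, 0.9]),
    "queen": np.array([0.9, 0.1]),
    "man":   np.array([0.1, 0.9]),
    "woman": np.array([0.1, 0.1]),
    "apple": np.array([0.2, 0.5]),
}

def most_similar(target, vocab, exclude=()):
    """Return the vocab word whose vector is closest to `target` by cosine."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max((w for w in vocab if w not in exclude),
               key=lambda w: cos(vocab[w], target))

# king - man + woman lands near queen in this toy space.
new_vec = vocab["king"] - vocab["man"] + vocab["woman"]
print(most_similar(new_vec, vocab, exclude={"king", "man", "woman"}))  # queen
```

Excluding the input words mirrors what real word2vec tooling does, since the nearest neighbour of the arithmetic result is often one of the inputs themselves.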