cours/M1 LOGOS .machine learning for NLP.md


up, tags, aliases
M1 LOGOS
s/fac
s/informatique

Vocabulary

$$\underbrace{(x_1, x_2, \dots, x_{n})}_{\text{vector of length } n} \in \mathbb{R}^{n}$$

Each component $x_{i} \in \mathbb{R}$ is a scalar.

one-hot: a boolean vector whose entries are all zero except for a single 1. Useful when each dimension represents a word of the vocabulary: a word is then encoded by putting the 1 in its own dimension.
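As a minimal sketch of the one-hot idea, the snippet below encodes a single word against the six-word vocabulary used in these notes. The function name `one_hot` and the choice of plain Python lists are assumptions made for illustration only.

```python
def one_hot(word, vocabulary):
    """Return a 0/1 list with a single 1 at the position of `word`."""
    # list.index raises ValueError if the word is not in the vocabulary.
    index = vocabulary.index(word)
    return [1 if i == index else 0 for i in range(len(vocabulary))]


vocabulary = ["le", "un", "garcon", "lit", "livre", "regarde"]
print(one_hot("garcon", vocabulary))  # [0, 0, 1, 0, 0, 0]
```

The vector has one dimension per vocabulary word, so its length grows with the vocabulary.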

BOW: Bag Of Words. A sentence can be represented by counting how many times each vocabulary word occurs in it.

Let our vocabulary be V = 'le' 'un' 'garcon' 'lit' 'livre' 'regarde'.

Then "le garcon lit le livre" is written as the vector of the number of occurrences of each vocabulary word in the sentence, so 2 0 1 1 1 0 (in APL, the formula is `sentence +⌿⍤(∘.≡) vocabulary`).
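The counting described above can be sketched in Python as follows. The function name `bag_of_words` and the whitespace tokenization are assumptions for this example, not part of the original notes.

```python
def bag_of_words(sentence, vocabulary):
    """Count how many times each vocabulary word occurs in the sentence."""
    tokens = sentence.split()  # naive whitespace tokenization (assumption)
    # Tokens that are not in the vocabulary are simply ignored.
    return [tokens.count(word) for word in vocabulary]


vocabulary = ["le", "un", "garcon", "lit", "livre", "regarde"]
print(bag_of_words("le garcon lit le livre", vocabulary))  # [2, 0, 1, 1, 1, 0]
```

Equivalently, the bag-of-words vector is the sum of the one-hot vectors of the words in the sentence. Word order is lost in the process.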

cosine similarity: measures how closely two vectors $u$ and $v$ point in the same direction.

$$\cos(u, v) = \frac{u\cdot v}{\|u\| \| v\|}$$
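To make the cosine formula concrete, here is a small sketch that applies it to two bag-of-words vectors over the vocabulary from these notes. The function name `cosine` and the second sentence, "le garcon regarde le livre", are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine of the angle between u and v: (u . v) / (||u|| ||v||)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Undefined if either vector is all zeros (raises ZeroDivisionError).
    return dot / (norm_u * norm_v)


s1 = [2, 0, 1, 1, 1, 0]  # "le garcon lit le livre"
s2 = [2, 0, 1, 0, 1, 1]  # "le garcon regarde le livre" (hypothetical second sentence)
print(round(cosine(s1, s2), 3))  # 0.857
```

Here $u \cdot v = 6$ and $\|u\| = \|v\| = \sqrt{7}$, so the cosine is $6/7$. The two sentences differ by one word and are therefore close but not identical.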