`R/3_1_textSimilarity.R`

`textDistanceMatrix.Rd`

Compute semantic distance scores between all pairwise combinations of the word embeddings in `x`.

`textDistanceMatrix(x, method = "euclidean", center = FALSE, scale = FALSE)`

- x
Word embeddings (from textEmbed).

- method
Character string specifying the distance measure to compute; the default is "euclidean". Other measures from stats::dist() are also available, including "maximum", "manhattan", "canberra", "binary" and "minkowski". It is also possible to use "cosine", which computes the cosine distance (i.e., 1 - cosine(x, y)).

- center
(boolean; from base::scale) If center is TRUE, centering is done by subtracting the embedding mean (omitting NAs) of x from each of its dimensions; if center is FALSE, no centering is done.

- scale
(boolean; from base::scale) If scale is TRUE, scaling is done by dividing the (centered) embedding dimensions by the standard deviation of the embedding if center is TRUE, and by the root mean square otherwise; if scale is FALSE, no scaling is done.
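
As a rough illustration of what the `method` argument describes, the sketch below computes a Euclidean distance matrix with stats::dist() and a cosine distance matrix as 1 minus the cosine similarity, using base R only. The toy matrix `emb` and every variable name are assumptions made for this illustration; this is not the package's internal code, and real embeddings from textEmbed have many more dimensions.

```r
# Toy stand-in for three embeddings (rows) with four dimensions each.
# The numbers are made up for illustration only.
emb <- matrix(
  c(1, 0, 2, 1,
    0, 1, 1, 3,
    2, 2, 0, 1),
  nrow = 3, byrow = TRUE
)

# Euclidean distance between every pair of rows, as a full square matrix.
euclid <- as.matrix(stats::dist(emb, method = "euclidean"))

# Cosine distance: 1 minus the cosine similarity of every pair of rows.
row_norms <- sqrt(rowSums(emb^2))
cosine_sim <- (emb %*% t(emb)) / (row_norms %o% row_norms)
cosine_dist <- 1 - cosine_sim

round(euclid, 3)
round(cosine_dist, 3)
```

Both results are symmetric with zeros on the diagonal, since each embedding is at distance zero from itself.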

A matrix of semantic distance scores.

See `textDistanceNorm`.

```
distance_scores <- textDistanceMatrix(word_embeddings_4$texts$harmonytext[1:3, ])
round(distance_scores, 3)
#>       [,1]  [,2]  [,3]
#> [1,] 0.000 0.706 0.920
#> [2,] 0.706 0.000 0.568
#> [3,] 0.920 0.568 0.000
```
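
The `center` and `scale` arguments are described as coming from base::scale. The sketch below shows what base::scale() itself does on a small made-up matrix, where it operates column by column. Whether textDistanceMatrix applies it per embedding or per dimension is not confirmed by this page, so treat the orientation of `emb` and all variable names as assumptions for illustration.

```r
# Made-up 3 x 2 matrix; matrix() fills column by column,
# so column 1 is (1, 3, 5) and column 2 is (2, 4, 9).
emb <- matrix(c(1, 3, 5,
                2, 4, 9), nrow = 3)

# center = TRUE, scale = FALSE: subtract each column mean.
centered <- scale(emb, center = TRUE, scale = FALSE)

# center = TRUE, scale = TRUE: also divide by each column's standard deviation.
standardized <- scale(emb, center = TRUE, scale = TRUE)

# center = FALSE, scale = TRUE: divide by the root mean square,
# defined in ?scale as sqrt(sum(x^2) / (n - 1)).
rms_scaled <- scale(emb, center = FALSE, scale = TRUE)
rms <- sqrt(colSums(emb^2) / (nrow(emb) - 1))

colMeans(centered)                 # column means after centering
apply(standardized, 2, stats::sd)  # column standard deviations after scaling
attr(rms_scaled, "scaled:scale")   # divisors used when center = FALSE
```

The last line returns the divisors that scale() stored as an attribute, which can be compared against the hand-computed `rms` vector.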