Test whether there is a significant difference in meaning between two sets of texts (i.e., between their word embeddings).

textSimilarityTest(
  x,
  y,
  Npermutations = 10000,
  method = "paired",
  alternative = c("two_sided", "less", "greater"),
  output.permutations = TRUE,
  N_cluster_nodes = 1,
  seed = 1001
)

Arguments

x

Set of word embeddings from textEmbed.

y

Set of word embeddings from textEmbed.

Npermutations

Number of permutations (default 10000).

method

Compute a "paired" or an "unpaired" test.

alternative

Use a two-sided or a one-sided test (select one of: "two_sided", "less", "greater").

output.permutations

If TRUE, returns permuted values in output.

N_cluster_nodes

Number of cluster nodes to use (more nodes can make the computation faster; see the parallel package).

seed

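Seed for the random number generator (default 1001). Keep it fixed to make results reproducible, or set a different value to draw a different set of permutations.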
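To make the roles of method, Npermutations and seed more concrete, here is a minimal base R sketch of a paired permutation test on cosine similarity. It is not the package's implementation. It assumes that the statistic is the mean row-wise cosine between x and y, that the null distribution comes from shuffling the rows of y to break the pairing, and that the p-value uses an add-one correction. The helpers row_cosine and paired_permutation_sketch are hypothetical names used only for illustration.

```r
# Row-wise cosine similarity between two matrices of equal dimensions.
row_cosine <- function(a, b) {
  rowSums(a * b) / (sqrt(rowSums(a^2)) * sqrt(rowSums(b^2)))
}

# Hypothetical helper (not part of the package): one-sided ("greater")
# paired permutation test, assuming the null is built by shuffling rows of y.
paired_permutation_sketch <- function(x, y, n_perm = 1000, seed = 1001) {
  stopifnot(nrow(x) == nrow(y), ncol(x) == ncol(y))
  set.seed(seed)  # fixed seed gives reproducible permutations

  # Observed statistic: mean cosine across the original pairs.
  observed <- mean(row_cosine(x, y))

  # Null estimates: same statistic after breaking the pairing.
  null <- replicate(n_perm, {
    shuffled <- y[sample(nrow(y)), , drop = FALSE]
    mean(row_cosine(x, shuffled))
  })

  # Add-one p-value: share of null estimates at least as large as observed.
  p_value <- (sum(null >= observed) + 1) / (n_perm + 1)
  list(cosine_estimate = observed, null = null, p.value = p_value)
}

# Simulated "embeddings": y is a noisy copy of x, so pairs are similar.
set.seed(1)
x <- matrix(rnorm(40 * 8), nrow = 40)
y <- x + matrix(rnorm(40 * 8, sd = 0.5), nrow = 40)

res <- paired_permutation_sketch(x, y, n_perm = 200)
print(res$cosine_estimate)
print(res$p.value)
```

With real data, x and y would be the embedding matrices rather than simulated values, and a larger number of permutations gives a finer-grained p-value at the cost of run time.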

Value

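A list containing the p-value (p.value), the observed cosine similarity (cosine_estimate), a description of the test, its run time and, if output.permutations = TRUE, the permuted null estimates (random.estimates.4.null).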

Examples

x <- wordembeddings4$harmonywords
y <- wordembeddings4$satisfactionwords
textSimilarityTest(x, y,
  method = "paired",
  Npermutations = 10,
  N_cluster_nodes = 1,
  alternative = "two_sided"
)
#> $random.estimates.4.null
#> [1] 0.4983119 0.5576852 0.5302025 0.5523948 0.5192839 0.5069734 0.5426047
#> [8] 0.5364955 0.5186255 0.5659261
#>
#> $embedding_x
#> [1] "x : "
#>
#> $embedding_y
#> [1] "y : "
#>
#> $test_description
#> [1] "permutations = 10 method = paired alternative = two_sided"
#>
#> $time_date
#> [1] "Duration to run the test: 0.031413 secs; Date created: 2021-02-12 19:00:50"
#>
#> $cosine_estimate
#> [1] 0.6069308
#>
#> $p.value
#> [1] 0.09090909
#>
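The printed values can be checked by hand. The sketch below copies the ten null estimates and the cosine estimate from the output above and applies an add-one permutation p-value, (count + 1) / (Npermutations + 1), counting null estimates at least as large as the observed one. This convention is an assumption: it reproduces the printed p.value here, but the package's internal computation may differ.

```r
# Values copied from the example output above.
null_estimates <- c(
  0.4983119, 0.5576852, 0.5302025, 0.5523948, 0.5192839,
  0.5069734, 0.5426047, 0.5364955, 0.5186255, 0.5659261
)
cosine_estimate <- 0.6069308

# Number of null estimates at least as large as the observed estimate.
n_at_least <- sum(null_estimates >= cosine_estimate)

# Add-one p-value (assumed convention, see the note above).
p_value <- (n_at_least + 1) / (length(null_estimates) + 1)
print(p_value)
#> [1] 0.09090909
```

Because none of the ten null estimates reaches the observed value, 1/11 is the smallest p-value this convention can produce with Npermutations = 10, so a larger number of permutations is needed for finer resolution.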