`R/3_2_textSimilarityTest.R`

`textSimilarityTest.Rd`

Test whether there is a significant difference in meaning between two sets of texts (i.e., between their word embeddings).

textSimilarityTest(
  x,
  y,
  Npermutations = 10000,
  method = "paired",
  alternative = c("two_sided", "less", "greater"),
  output.permutations = TRUE,
  N_cluster_nodes = 1,
  seed = 1001
)

Argument | Description |
---|---|
x | Set of word embeddings from textEmbed. |
y | Set of word embeddings from textEmbed. |
Npermutations | Number of permutations (default 10000). |
method | Compute a "paired" or an "unpaired" test. |
alternative | Use a two-sided or one-sided test (select one of: "two_sided", "less", "greater"). |
output.permutations | If TRUE, returns the permuted values in the output. |
N_cluster_nodes | Number of cluster nodes to use (more nodes make the computation faster; see the parallel package). |
seed | Seed for the random number generator (default 1001); set a different value to draw a different set of permutations. |
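The following is a minimal sketch of one way to pick a value for N_cluster_nodes, assuming you want to leave one core free for the rest of the system. The variable names `cores` and `n_nodes` are illustrative only and are not part of the package; `parallel::detectCores()` is from base R's parallel package.

```r
# Pick a node count for N_cluster_nodes (illustrative; names are hypothetical).
cores <- parallel::detectCores()

# detectCores() can return NA on some platforms; fall back to a single node.
if (is.na(cores)) cores <- 1L

# Leave one core free, but never go below one node.
n_nodes <- max(1L, cores - 1L)
n_nodes
```

The resulting `n_nodes` could then be passed as `N_cluster_nodes = n_nodes`.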

A list containing the p-value (p.value), the cosine estimate (cosine_estimate) and, if output.permutations = TRUE, the permuted values.
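To make the idea of a permutation test on cosine similarity concrete, here is a minimal base R sketch. It is not the package's implementation: it assumes a "paired" design in which the observed statistic is the mean row-wise cosine between x and y, and the null distribution comes from shuffling the rows of y to break the pairing. The function names `row_cosine` and `paired_perm_sketch`, the toy data, and the one-sided "greater" p-value are all assumptions made for illustration.

```r
# Row-wise cosine similarity between two numeric matrices of equal shape.
row_cosine <- function(a, b) {
  rowSums(a * b) / sqrt(rowSums(a^2) * rowSums(b^2))
}

# Hypothetical sketch of a paired permutation test (not the text package's code).
paired_perm_sketch <- function(x, y, n_perm = 1000, seed = 1001) {
  stopifnot(is.matrix(x), is.matrix(y), identical(dim(x), dim(y)))
  set.seed(seed)

  # Observed statistic: mean cosine over the true pairs.
  observed <- mean(row_cosine(x, y))

  # Null distribution: mean cosine after randomly re-pairing the rows of y.
  random_estimates <- replicate(
    n_perm,
    mean(row_cosine(x, y[sample(nrow(y)), , drop = FALSE]))
  )

  # One-sided p-value ("greater"), with the +1 correction so it is never 0.
  p_value <- (sum(random_estimates >= observed) + 1) / (n_perm + 1)

  list(
    cosine_estimate = observed,
    random_estimates = random_estimates,
    p.value = p_value
  )
}

# Toy embeddings: y is a noisy copy of x, so true pairs should be similar.
set.seed(1)
x <- matrix(rnorm(40 * 8), nrow = 40)
y <- x + matrix(rnorm(40 * 8, sd = 0.5), nrow = 40)

res <- paired_perm_sketch(x, y, n_perm = 200)
res$cosine_estimate
res$p.value
```

In this toy setup the true pairs are far more similar than random pairs, so the p-value should be small. A two-sided or "less" variant would change only how the extreme permutations are counted.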

x <- wordembeddings4$harmonywords
y <- wordembeddings4$satisfactionwords
textSimilarityTest(x, y,
  method = "paired", Npermutations = 10,
  N_cluster_nodes = 1, alternative = "two_sided"
)
#> $random.estimates.4.null
#>  [1] 0.4983119 0.5576852 0.5302025 0.5523948 0.5192839 0.5069734 0.5426047
#>  [8] 0.5364955 0.5186255 0.5659261
#>
#> $embedding_x
#> [1] "x : "
#>
#> $embedding_y
#> [1] "y : "
#>
#> $test_description
#> [1] "permutations = 10 method = paired alternative = two_sided"
#>
#> $time_date
#> [1] "Duration to run the test: 0.031413 secs; Date created: 2021-02-12 19:00:50"
#>
#> $cosine_estimate
#> [1] 0.6069308
#>
#> $p.value
#> [1] 0.09090909
#>
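As a quick check on how to read this output, the numbers printed above can be recombined by hand. The sketch below copies the ten permuted values and the cosine estimate from the example output and applies the common (b + 1) / (N + 1) permutation p-value convention. That the package uses exactly this formula is an assumption; the sketch only shows that the printed p-value is consistent with it. The variable names are illustrative.

```r
# Values copied from the example output above.
random_estimates <- c(
  0.4983119, 0.5576852, 0.5302025, 0.5523948, 0.5192839,
  0.5069734, 0.5426047, 0.5364955, 0.5186255, 0.5659261
)
cosine_estimate <- 0.6069308

# Count permuted values at least as large as the observed estimate.
n_extreme <- sum(random_estimates >= cosine_estimate)

# Assumed convention: add 1 to numerator and denominator.
p_value <- (n_extreme + 1) / (length(random_estimates) + 1)
p_value
```

None of the ten permuted values reaches the observed estimate, so the result is 1/11, which agrees with the printed $p.value. With only 10 permutations, 1/11 is the smallest p-value this convention can produce, which is why a larger Npermutations is needed for finer resolution.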