Compute predictions based on single words, for use in plotting words. The embeddings of single words are trained to predict the mean value associated with each word. P-values do NOT work yet (experimental).
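To make "the mean value associated with that word" concrete, here is a minimal base R sketch of how such a per-word target could be computed. It is an illustration under stated assumptions and not the package's internal code: the `texts` and `scores` data are made up, and the assumption is that each word inherits the numeric value of every response it appears in.

```r
# Hypothetical data: one text response and one numeric score per participant
texts  <- c("happy calm", "happy sad", "sad tired")
scores <- c(8, 6, 2)

# Split each response into words and repeat the score once per word
word_list <- strsplit(tolower(texts), "\\s+")
long <- data.frame(
  word  = unlist(word_list),
  score = rep(scores, lengths(word_list)),
  stringsAsFactors = FALSE
)

# Mean score for each word type (assumed training target)
word_means <- aggregate(score ~ word, data = long, FUN = mean)

# Frequency of each word type, aligned to the rows of word_means
word_means$n <- as.vector(table(long$word)[word_means$word])

word_means
```

In this toy example, "happy" occurs in responses scored 8 and 6, so its mean value is 7 with a frequency of 2.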

textWordPrediction(
  words,
  word_types_embeddings = word_types_embeddings_df,
  x,
  y = NULL,
  seed = 1003,
  case_insensitive = TRUE,
  text_remove = "[()]",
  ...
)

Arguments

words

Word or text variable to be plotted.

word_types_embeddings

Word embeddings from textEmbed for individual words (i.e., decontextualized embeddings).

x

Numeric variable according to which the words should be plotted on the x-axis.

y

Numeric variable according to which the words should be plotted on the y-axis (default: y = NULL).

seed

Set a different seed (default: 1003).

case_insensitive

When TRUE, all words are converted to lower case.

text_remove

Pattern of special characters to remove (default: "[()]", i.e., parentheses).
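As a rough illustration of what case_insensitive and text_remove imply, the following base R sketch lower-cases a few made-up words and strips the characters matched by the default pattern "[()]". This is only an assumption about the effect of these options, not the package's internal code; `raw_words` is hypothetical data.

```r
# Hypothetical input; not taken from the package data
raw_words <- c("Happy", "(calm)", "HAPPY", "at(peace)")

# case_insensitive = TRUE: convert everything to lower case
lowered <- tolower(raw_words)

# text_remove = "[()]": drop every character matched by the pattern
cleaned <- gsub("[()]", "", lowered)

cleaned
```

After this cleaning, "Happy" and "HAPPY" collapse into the same word type, which matters when counting word frequencies.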

...

Training options from textTrainRegression().

Value

A data frame with variables for the individual words (e.g., trained out-of-sample predictions, frequencies, and p-values), which is used for plotting in the textProjectionPlot function.

Examples

# Data
# Pre-processing data for plotting
if (FALSE) {
df_for_plotting <- textWordPrediction(
  words = Language_based_assessment_data_8$harmonywords,
  word_types_embeddings = word_embeddings_4$word_types,
  x = Language_based_assessment_data_8$hilstotal
)
df_for_plotting
}

See also

textProjection