textTrainLists() trains word embeddings from several text variables to several numeric or categorical variables, fitting each text-outcome pair individually.
textTrainLists(
x,
y,
force_train_method = "automatic",
save_output = "all",
method_cor = "pearson",
eval_measure = "rmse",
p_adjust_method = "holm",
...
)
x: Word embeddings from textEmbed (or textEmbedLayerAggregation). It is possible to have word embeddings from one text variable and several numeric/categorical variables; or vice versa, word embeddings from several text variables and one numeric/categorical variable. It is not possible to mix numeric and categorical variables.
y: Tibble with several numeric or categorical variables to predict. Please note that you cannot mix numeric and categorical variables.
force_train_method: (character) Training method to use. Default is "automatic"; see also "regression" and "random_forest".
save_output: (character) Which output to save. Default is "all"; see also "only_results" and "only_results_predictions".
method_cor: (character) A character string describing the type of correlation (default "pearson").
eval_measure: (character) Type of evaluative measure to assess models on (default "rmse").
p_adjust_method: Method to adjust/correct p-values for multiple comparisons (default "holm"; see also "none", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr").
...: Arguments passed on to textTrainRegression or textTrainRandomForest (via the textTrain function).
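The p-value adjustment options listed above are the same method names accepted by base R's stats::p.adjust. As a minimal sketch of what the correction does, the following applies "holm" and "bonferroni" to four p-values. The p-values are made up for illustration, and whether textTrainLists calls p.adjust internally is an assumption rather than something stated on this page.

```r
# Hypothetical p-values, one per text-outcome pair (illustration only).
p_values <- c(0.01, 0.04, 0.03, 0.20)

# Holm step-down correction (the default p_adjust_method).
p_holm <- stats::p.adjust(p_values, method = "holm")

# Bonferroni for comparison; it is never less conservative than Holm.
p_bonferroni <- stats::p.adjust(p_values, method = "bonferroni")

# Show raw and adjusted p-values side by side.
print(data.frame(raw = p_values, holm = p_holm, bonferroni = p_bonferroni))
```

Passing "none" leaves the p-values unchanged, which can be useful when only a single pair is trained.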
Correlations between predicted and observed values (t-value, degrees of freedom (df), p-value, confidence interval, alternative hypothesis, correlation coefficient), stored in a data frame.
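To make the listed statistics concrete, here is a small sketch that computes the same kinds of quantities with base R's stats::cor.test on simulated data. The vectors observed and predicted, and the column names of result_row, are hypothetical and chosen for illustration; they are not the names used in the data frame returned by textTrainLists.

```r
# Simulated observed scores and model predictions (illustration only).
set.seed(1)
observed <- rnorm(40)
predicted <- 0.6 * observed + rnorm(40, sd = 0.8)

# Pearson correlation test between predicted and observed values.
ct <- stats::cor.test(predicted, observed, method = "pearson")

# Collect the statistics into a one-row data frame.
result_row <- data.frame(
  t_value     = unname(ct$statistic),  # t statistic
  df          = unname(ct$parameter),  # degrees of freedom (n - 2)
  p_value     = ct$p.value,
  conf_low    = ct$conf.int[1],        # lower bound of the 95% CI
  conf_high   = ct$conf.int[2],        # upper bound of the 95% CI
  alternative = ct$alternative,        # "two.sided" by default
  correlation = unname(ct$estimate)    # Pearson's r
)
print(result_row)
```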
# Examines how well the embeddings from Language_based_assessment_data_8 can
# predict the numerical variables in Language_based_assessment_data_8.
# The training is done combination-wise, i.e., correlations are tested pairwise,
# columns: 1-5, 1-6, 2-5, 2-6, resulting in a dataframe with four rows.
if (FALSE) { # \dontrun{
word_embeddings <- word_embeddings_4$texts[1:2]
ratings_data <- Language_based_assessment_data_8[5:6]
trained_model <- textTrainLists(
x = word_embeddings,
y = ratings_data
)
# Examine results (t-value, degrees of freedom (df), p-value,
# alternative hypothesis, confidence interval, correlation coefficient).
trained_model$results
} # }
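The pairwise combinations described in the example comments can be enumerated with base R. This is only a sketch of the bookkeeping: the names text_columns, rating_columns, and pairs are hypothetical, and the row order simply mirrors the order given in the comments rather than a documented guarantee about textTrainLists.

```r
# Column indices used in the example above: two text variables, two ratings.
text_columns <- c(1, 2)
rating_columns <- c(5, 6)

# expand.grid varies its first argument fastest, so list y first and then
# reorder the columns to get the pairs grouped by text variable.
pairs <- expand.grid(y = rating_columns, x = text_columns)[, c("x", "y")]

# One label per trained model, i.e., one row of the results data frame.
pair_labels <- paste(pairs$x, pairs$y, sep = "-")
print(pair_labels)
```

With two text variables and two outcome variables this gives four pairs, matching the four rows mentioned in the example.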