Individually trains models from the word embeddings of several text variables to several numeric or categorical variables. It is possible to use word embeddings from one text variable with several numeric/categorical variables or, vice versa, word embeddings from several text variables with one numeric/categorical variable. It is not possible to mix numeric and categorical variables.
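
The restriction on mixing outcome types can be checked before calling the function. The following is a minimal base R sketch, not part of the text package: `outcome_type` is a hypothetical helper, the toy data frames are made up, and it assumes categorical outcomes are stored as factor or character columns.

```r
# Hypothetical helper (not part of the text package): classify the outcome
# columns before training so that mixed types are caught early.
outcome_type <- function(y) {
  is_num <- vapply(y, is.numeric, logical(1))
  is_cat <- vapply(
    y,
    function(col) is.factor(col) || is.character(col),
    logical(1)
  )
  if (all(is_num)) {
    "numeric"
  } else if (all(is_cat)) {
    "categorical"
  } else {
    stop("y mixes numeric and categorical columns; train them in separate calls.")
  }
}

# Toy data, made up for illustration.
y_numeric <- data.frame(score_a = c(1.5, 2.0, 3.5), score_b = c(10, 12, 9))
y_mixed <- data.frame(
  score_a = c(1.5, 2.0, 3.5),
  group = factor(c("a", "b", "a"))
)

numeric_result <- outcome_type(y_numeric)

# The mixed data frame triggers the error, which is caught here.
mixed_result <- tryCatch(outcome_type(y_mixed), error = function(e) "mixed")
```

Splitting a mixed tibble into a numeric part and a categorical part, and calling the function once per part, is one way to stay within the restriction.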

textTrainLists(
  x,
  y,
  force_train_method = "automatic",
  save_output = "all",
  method_cor = "pearson",
  model = "regression",
  eval_measure = "rmse",
  p_adjust_method = "holm",
  ...
)

Arguments

x

Word embeddings from textEmbed (or textEmbedLayerAggreation).

y

Tibble with several numeric or categorical variables to predict. Please note that you cannot mix numeric and categorical variables.

force_train_method

Default is "automatic"; see also "regression" and "random_forest".

save_output

Option not to save all output; default is "all"; see also "only_results" and "only_results_predictions".

method_cor

Correlation method; default is "pearson".

model

Type of model to use in regression; default is "regression"; see also "logistic". (To set different random forest algorithms, see the extremely_randomised_splitrule parameter in textTrainRandomForest.)

eval_measure

Type of evaluative measure to assess models on; default is "rmse".

p_adjust_method

Method to adjust/correct p-values for multiple comparisons (default = "holm"; see also "none", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr").

...

Arguments passed on to textTrainRegression or textTrainRandomForest (see also the textTrain function).
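
The values listed for p_adjust_method appear to match the method names accepted by base R's `p.adjust`. The sketch below shows what such a correction does to a set of p-values; the p-values are made up, and it is an assumption (not confirmed by this page) that the function applies the correction in this way internally.

```r
# Raw p-values, made up for illustration (e.g. one per trained model).
p_raw <- c(model_1 = 0.01, model_2 = 0.04, model_3 = 0.03)

# Holm step-down correction, the default named in the usage section.
p_holm <- p.adjust(p_raw, method = "holm")

# Bonferroni for comparison: each p-value is multiplied by the number
# of tests and capped at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# "none" leaves the p-values unchanged.
p_none <- p.adjust(p_raw, method = "none")

# Side-by-side comparison of raw and adjusted values.
print(rbind(raw = p_raw, holm = p_holm, bonferroni = p_bonf))
```

Holm is never more conservative than Bonferroni while controlling the same family-wise error rate, which may be why it is the default here.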

Value

Correlations between predicted and observed values.
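
As an illustration of what a correlation between predicted and observed values looks like, here is a minimal base R sketch. The vectors are made up, and the code only mirrors the kind of statistic reported; it is not the package's own implementation. It also assumes that "rmse" denotes the usual root mean squared error.

```r
# Observed and predicted values, made up for illustration.
observed  <- c(2.0, 3.5, 4.0, 5.5, 7.0)
predicted <- c(2.5, 3.0, 4.5, 5.0, 6.5)

# Pearson correlation test between predictions and observations.
fit <- cor.test(predicted, observed, method = "pearson")
r <- unname(fit$estimate)  # correlation coefficient
p <- fit$p.value           # unadjusted p-value

# Root mean squared error between observations and predictions.
rmse <- sqrt(mean((observed - predicted)^2))
```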

Examples

if (FALSE) {
  wordembeddings <- wordembeddings4[1:2]
  ratings_data <- Language_based_assessment_data_8[5:6]
  results <- textTrainLists(
    wordembeddings,
    ratings_data
  )
  results
  comment(results)
}