R/2_1_textTrain.R
textTrainLists.Rd
Individually trains word embeddings from several text variables to several numeric or categorical variables. It is possible to have word embeddings from one text variable and several numeric/categorical variables; or vice versa, word embeddings from several text variables to one numeric/categorical variable. It is not possible to mix numeric and categorical variables.
textTrainLists(
  x,
  y,
  force_train_method = "automatic",
  save_output = "all",
  method_cor = "pearson",
  model = "regression",
  eval_measure = "rmse",
  p_adjust_method = "holm",
  ...
)
x | Word embeddings from textEmbed (or textEmbedLayerAggreation). |
---|---|
y | Tibble with several numeric or categorical variables to predict. Please note that you cannot mix numeric and categorical variables. |
force_train_method | Default is "automatic"; see also "regression" and "random_forest". |
save_output | Option not to save all output; default "all". See also "only_results" and "only_results_predictions". |
method_cor | Type of correlation used to compare predicted and observed values; default "pearson". |
model | Type of model to use in regression; default is "regression"; see also "logistic". (To set different random forest algorithms see extremely_randomised_splitrule parameter in textTrainRandomForest) |
eval_measure | Type of evaluative measure to assess models on; default "rmse". |
p_adjust_method | Method to adjust/correct p-values for multiple comparisons (default = "holm"; see also "none", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr"). |
... | Additional arguments passed on to textTrainRegression or textTrainRandomForest (the functions that textTrain calls). |
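
Because several models are trained in one call, p_adjust_method controls how the resulting p-values are corrected for multiple comparisons. The method names listed above match those accepted by base R's stats::p.adjust, so the following is a minimal sketch of what each correction does, assuming that function (or an equivalent) is applied. The p-values below are hypothetical and do not come from the package.

```r
# Hypothetical raw p-values, one per trained model (illustrative only).
p_raw <- c(0.001, 0.020, 0.040)

# Holm correction (the documented default) next to two alternatives.
p_holm <- p.adjust(p_raw, method = "holm")
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_none <- p.adjust(p_raw, method = "none")

# Side-by-side comparison of raw and adjusted values.
print(data.frame(p_raw, p_holm, p_bonf, p_none))
```

Holm is never more conservative than Bonferroni, which is one reason it is a common default when only a handful of models are compared.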
Correlations between predicted and observed values.
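
To illustrate what a correlation between predicted and observed values looks like, here is a minimal base R sketch using stats::cor.test. The vectors are hypothetical stand-ins for one outcome variable and its model predictions; the exact computation inside textTrainLists is not shown in this page, so treat this as an approximation of the idea rather than the package's implementation.

```r
# Hypothetical observed ratings and model predictions (illustrative only).
observed  <- c(3, 5, 2, 8, 7, 6, 4, 9)
predicted <- c(2.8, 4.6, 3.1, 7.2, 7.5, 5.1, 4.4, 8.3)

# Correlation test; "pearson" mirrors the method_cor default.
fit <- cor.test(predicted, observed, method = "pearson")
r <- unname(fit$estimate)  # correlation coefficient
p <- fit$p.value           # unadjusted p-value

# Root mean squared error, the measure named by the eval_measure default.
rmse <- sqrt(mean((predicted - observed)^2))

print(c(r = r, p = p, rmse = rmse))
```

With several outcome variables, one such correlation and p-value would be obtained per trained model, and the p-values would then be adjusted as set by p_adjust_method.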
if (FALSE) {
  wordembeddings <- wordembeddings4[1:2]
  ratings_data <- Language_based_assessment_data_8[5:6]
  results <- textTrainLists(
    wordembeddings,
    ratings_data
  )
  results
  comment(results)
}