Predicts the label and probability of a text using a pretrained classifier language model. This function is still under development.

textClassify(
  x,
  model = "distilbert-base-uncased-finetuned-sst-2-english",
  device = "cpu",
  tokenizer_parallelism = FALSE,
  logging_level = "error",
  return_incorrect_results = FALSE,
  return_all_scores = FALSE,
  function_to_apply = "none"
)

Arguments

x

(string) A character variable or a tibble/dataframe with at least one character variable.

model

(string) Specification of a pre-trained classifier language model. For a full list of options, see the pretrained classifier models at HuggingFace. For example, use "cardiffnlp/twitter-roberta-base-sentiment" or "distilbert-base-uncased-finetuned-sst-2-english".

device

(string) Device to use: 'cpu', 'gpu', or 'gpu:k' where k is a specific device number.

tokenizer_parallelism

(boolean) If TRUE, turns on tokenizer parallelism.

logging_level

(string) Set the logging level. Options (ordered from less logging to more logging): critical, error, warning, info, debug

return_incorrect_results

(boolean) Many models are not built to provide classifications. When FALSE (the default), this setting stops such models from returning incorrect results.

return_all_scores

(boolean) Whether to return all prediction scores or only the score of the predicted class.

function_to_apply

(string) The function to apply to the model outputs to retrieve the scores. There are four possible values: "default": applies the sigmoid function if the model has a single label, and the softmax function if the model has several labels. "sigmoid": applies the sigmoid function to the output. "softmax": applies the softmax function to the output. "none": does not apply any function to the output.
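
As a rough illustration of what the function_to_apply options mean numerically, the base R sketch below applies the sigmoid and softmax functions to a made-up vector of raw model outputs, then picks out the score of the predicted class. The helpers sigmoid() and softmax(), the label names, and the logit values are assumptions for illustration only; they are not part of the package, and the package's internal computation may differ.

```r
# Illustrative helpers; these are NOT functions exported by the package.
sigmoid <- function(z) 1 / (1 + exp(-z))

softmax <- function(z) {
  e <- exp(z - max(z))  # subtract the maximum for numerical stability
  e / sum(e)
}

# Hypothetical raw outputs (logits) for a two-label sentiment model.
logits <- c(NEGATIVE = -1.2, POSITIVE = 2.3)

raw_scores     <- logits           # "none": outputs are left unchanged
sigmoid_scores <- sigmoid(logits)  # "sigmoid": each score mapped to (0, 1) independently
softmax_scores <- softmax(logits)  # "softmax": scores are positive and sum to 1

# Score of the predicted class only, as opposed to all scores
# (compare the return_all_scores argument).
top_label <- names(which.max(softmax_scores))
top_score <- max(softmax_scores)
```

With "none", the returned scores are raw model outputs and should not be read as probabilities; "softmax" is the usual choice when exactly one of several labels applies.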

Value

A tibble with predicted labels and scores for each text variable. The comment of the object shows the model name and computation time.

Examples

# \donttest{
# classifications <- textClassify(x = Language_based_assessment_data_8[1:2, 1:2])
# classifications
# comment(classifications)
# }