Summarize texts (still under development)

textSum(
x,
min_length = 10L,
max_length = 20L,
model = "t5-small",
device = "cpu",
tokenizer_parallelism = FALSE,
logging_level = "warning",
return_incorrect_results = FALSE,
return_text = TRUE,
return_tensors = FALSE,
clean_up_tokenization_spaces = FALSE
)

## Arguments

x

(string) A character variable or a tibble/dataframe with at least one character variable.

min_length

(explicit integer; e.g., 10L) The minimum number of tokens in the summarized output.

max_length

(explicit integer higher than min_length; e.g., 20L) The maximum number of tokens in the summarized output.

model

(string) Specification of a pre-trained language model that has been fine-tuned on a summarization task, such as 'bart-large-cnn', 't5-small', 't5-base', 't5-large', 't5-3b', 't5-11b'.

device

(string) Device to use: 'cpu', 'gpu', or 'gpu:k' where k is a specific device number.

tokenizer_parallelism

(boolean) If TRUE this will turn on tokenizer parallelism.

logging_level

(string) Set the logging level. Options (ordered from least to most logging): critical, error, warning, info, debug.

return_incorrect_results

(boolean) Many models are not trained to provide summarization; this setting stops them from returning incorrect results.

return_text

(boolean) Whether or not the outputs should include the decoded text.

return_tensors

(boolean) Whether or not the output should include the prediction tensors (as token indices).

clean_up_tokenization_spaces

(boolean) Option to clean up the potential extra spaces in the returned text.

## Value

A tibble with the summarized text(s).

See also textClassify, textGeneration, textNER, textQA, textTranslate.
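## Examples

A minimal usage sketch. It assumes the text package (and its Python backend) is installed and that the default 't5-small' model can be downloaded on first use; the input string is illustrative only.

```r
library(text)

# Summarize a single character string with the default t5-small model.
# min_length/max_length are explicit integers bounding the token count
# of the summary.
sum_tbl <- textSum(
  x = "The quick brown fox jumps over the lazy dog while the
       farmer watches from the porch and the sun sets slowly.",
  min_length = 10L,
  max_length = 20L
)

# Returns a tibble with the summarized text(s)
sum_tbl
```

With a tibble/dataframe input, each character variable is summarized and the result is returned as one tibble of summaries.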