Translation (experimental).

Usage

textTranslate(
  x,
  source_lang = "",
  target_lang = "",
  model = "xlm-roberta-base",
  device = "cpu",
  tokenizer_parallelism = FALSE,
  logging_level = "warning",
  force_return_results = FALSE,
  return_tensors = FALSE,
  return_text = TRUE,
  clean_up_tokenization_spaces = FALSE,
  set_seed = 202208L,
  max_length = 400
)

Arguments

x

(string) The text to be translated.

source_lang

(string) The input language, given as an ISO 639-1 code such as "en", "zh", "es", "fr", "de", "it", "sv", "da", or "nn". It may be needed for multilingual models and has no effect on single-pair translation models.

target_lang

(string) The desired output language. It may be required for multilingual models and has no effect on single-pair translation models.

model

(string) Specify a pre-trained language model that has been fine-tuned on a translation task.

device

(string) Name of the device to use: 'cpu', 'gpu', or 'gpu:k', where k is a specific device number.

tokenizer_parallelism

(boolean) If TRUE, this will turn on tokenizer parallelism.

logging_level

(string) Set the logging level. Options (ordered from less logging to more logging): critical, error, warning, info, debug

force_return_results

(boolean) Stop returning some incorrectly formatted/structured results. This setting does NOT evaluate the actual results (whether or not they make sense, exist, etc.). All it does is ensure that the returned results are formatted correctly (e.g., does the question-answering dictionary contain the key "answer", or do the sentiments from textClassify contain the labels "positive" and "negative").

return_tensors

(boolean) Whether or not to include the predictions' tensors as token indices in the outputs.

return_text

(boolean) Whether or not to also output the decoded texts.

clean_up_tokenization_spaces

(boolean) Whether or not to clean the output from potential extra spaces.

set_seed

(integer) Set the random seed.

max_length

Set the maximum length of the text to be translated.

Value

A tibble with translated text.

Examples

# \donttest{
# translation_example <- text::textTranslate(
#  Language_based_assessment_data_8[1,1:2],
#  source_lang = "en",
#  target_lang = "fr",
#  model = "t5-base")
# }
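
The example above is commented out. As a minimal sketch of the same call on a single string, the code below translates one English sentence to French with the arguments documented on this page. It assumes the text package and its Python backend are already set up (for instance with text::textrpp_install() and text::textrpp_initialize()); the input sentence is a made-up illustration, not package data. The call is guarded so that it only runs in an interactive session, since it may download the model.

```r
# Hypothetical input sentence (an assumption for illustration only).
sentence <- "The weather is lovely today."

# Only run interactively and when the text package is available,
# because the call may download the model and needs the Python backend.
if (interactive() && requireNamespace("text", quietly = TRUE)) {
  translation <- text::textTranslate(
    sentence,
    source_lang = "en",  # ISO 639-1 code of the input language
    target_lang = "fr",  # desired output language
    model = "t5-base",   # same model as in the commented example above
    device = "cpu",
    max_length = 400     # the documented default
  )

  # Per the Value section, the result is a tibble with the translated text.
  print(translation)
}
```

To try a GPU, the device argument could presumably be changed to 'gpu' or 'gpu:k' as described under Arguments; whether that works depends on the local installation.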
