Installation

textrpp_install() textrpp_install_virtualenv()

Install the Python packages required by text in a conda or virtualenv environment

textrpp_uninstall()

Uninstall the textrpp conda environment

textrpp_initialize()

Initialize the Python packages required by text

Transform text to word embeddings

textEmbed()

Embed text

textDimName()

Change dimension names

textEmbedRawLayers()

Extract layers of hidden states

textEmbedLayerAggregation()

Aggregate layers

textEmbedReduce()

Pre-trained dimension reduction (experimental)

textEmbedStatic()

Apply static word embeddings

Fine-tuning models

textFineTuneTask()

Task Adapted Pre-Training (EXPERIMENTAL - under development)

textFineTuneDomain()

Domain Adapted Pre-Training (EXPERIMENTAL - under development)

Text language analysis tasks

textGeneration()

Text generation

textNER()

Named Entity Recognition (experimental)

textSum()

Summarize texts (experimental)

textQA()

Question Answering (experimental)

textTranslate()

Translation (experimental)

textZeroShot()

Zero Shot Classification (experimental)

The text-train functions

textTrain()

Trains word embeddings

textTrainLists()

Train lists of word embeddings

textTrainRegression()

Train word embeddings to a numeric variable.

textTrainRandomForest()

Train word embeddings using random forest

textTrainN()

Cross-validated accuracies across sample sizes

textTrainNPlot()

Plot cross-validated accuracies across sample sizes

The text-predict functions

textPredict() textAssess() textClassify()

textPredict, textAssess and textClassify

textPredictTest()

Significance testing of correlations. If only y1 is provided, a t-test is computed between the absolute errors from yhat1-y1 and yhat2-y1.

textPredictAll()

Predict from several models, selecting the correct input

textLBAM()

The LBAM library
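
The t-test mentioned under textPredictTest() can be illustrated in base R. This is a minimal sketch, not the package's implementation: the data values are made up for illustration, and the choice of a paired test is an assumption that the entry above does not state.

```r
# Hypothetical data: observed scores (y1) and predictions from two models.
y1    <- c(10, 12, 9, 15, 11, 14, 8, 13)
yhat1 <- c(11, 12, 10, 14, 11, 15, 9, 12)
yhat2 <- c(13, 9, 12, 11, 14, 10, 11, 16)

# Absolute prediction error of each model against the same observed values.
abs_error_1 <- abs(yhat1 - y1)
abs_error_2 <- abs(yhat2 - y1)

# Assumption: a paired test, because both error vectors refer to the same cases.
result <- t.test(abs_error_1, abs_error_2, paired = TRUE)
print(result)
```

A negative mean difference here indicates that the first set of predictions has the smaller absolute error on average.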

Semantic similarities and distances functions

textSimilarity()

Semantic Similarity

textDistance()

Semantic distance

textSimilarityMatrix()

Semantic similarity across multiple word embeddings

textDistanceMatrix()

Semantic distance across multiple word embeddings

textSimilarityNorm()

Semantic similarity between a text variable and a word norm

textDistanceNorm()

Semantic distance between a text variable and a word norm
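
As a rough illustration of what the similarity and distance entries above measure, the sketch below compares embedding vectors in base R. It assumes cosine similarity and Euclidean distance as the measures; this index does not say which methods the functions use, and the vectors and helper names are hypothetical.

```r
# Hypothetical helper: cosine similarity between two numeric vectors.
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Hypothetical helper: Euclidean distance between two numeric vectors.
euclidean_distance <- function(a, b) {
  sqrt(sum((a - b)^2))
}

# Toy "embeddings" (real word embeddings have hundreds of dimensions).
emb_a <- c(1, 2, 3)
emb_b <- c(2, 4, 6)   # same direction as emb_a
emb_c <- c(-3, 0, 1)  # orthogonal to emb_a

sim_ab  <- cosine_similarity(emb_a, emb_b)
sim_ac  <- cosine_similarity(emb_a, emb_c)
dist_ab <- euclidean_distance(emb_a, emb_b)

print(c(sim_ab = sim_ab, sim_ac = sim_ac, dist_ab = dist_ab))
```

Vectors pointing in the same direction have a cosine similarity of 1 even when they differ in length, which is why a similarity measure and a distance measure can rank the same pairs differently.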

Visualise words in the word embedding space

textProjection()

Supervised Dimension Projection

textPlot()

Plot words

textProjectionPlot()

Plot Supervised Dimension Projection

textCentrality()

Semantic similarity score between single words' embeddings and an aggregated word embedding

textCentralityPlot()

Plots words from textCentrality()

textPCA()

Principal component analysis (PCA) of word embeddings

textPCAPlot()

Plots words from textPCA()

BERTopics

textTopics()

BERTopics

textTopicsTest()

Wrapper for topicsTest function from the topics package

textTopicsWordcloud()

Plot word clouds

textTopicsReduce()

textTopicsReduce (EXPERIMENTAL)

textTopicsTree()

textTopicsTree (EXPERIMENTAL) to get the hierarchical topic tree

View or delete downloaded HuggingFace models

textModels()

Check downloaded, available models.

textModelLayers()

Number of layers

textModelsRemove()

Delete a specified model

Miscellaneous

textDescriptives()

Compute descriptive statistics of character variables.

textClean()

Remove standard personal information from text

textTokenize()

Tokenize text variables

textTokenizeAndCount()

Tokenize and count

textDomainCompare()

Compare two language domains

textFindNonASCII()

Detect non-ASCII characters

textCleanNonASCII()

Clean non-ASCII characters

Example Data

Language_based_assessment_data_8

Text and numeric data for 10 participants.

word_embeddings_4

Word embeddings for 4 text variables for 40 participants

raw_embeddings_1

Word embeddings from the textEmbedRawLayers() function

Language_based_assessment_data_3_100

Example text and numeric data.

DP_projections_HILS_SWLS_100

Data for plotting a Dot Product Projection Plot.

centrality_data_harmony

Example data for plotting a Semantic Centrality Plot.

PC_projections_satisfactionwords_40

Example data for plotting a Principal Component Projection Plot.