Installation

textrpp_install() textrpp_install_virtualenv()
Install the Python packages required by text in a conda or virtualenv environment
textrpp_uninstall()
Uninstall the textrpp conda environment
textrpp_initialize()
Initialize the Python packages required by text

Transform text to word embeddings

textEmbed()
Embed text
textDimName()
Change dimension names
textEmbedRawLayers()
Extract layers of hidden states
textEmbedLayerAggregation()
Aggregate layers
textEmbedReduce()
Pre-trained dimension reduction (experimental)
textEmbedStatic()
Apply static word embeddings

Fine-tuning models

textFineTuneTask()
Task Adapted Pre-Training (EXPERIMENTAL - under development)
textFineTuneDomain()
Domain Adapted Pre-Training (EXPERIMENTAL - under development)

Text language analysis tasks

textGeneration()
Text generation
textNER()
Named Entity Recognition (experimental)
textSum()
Summarize texts (experimental)
textQA()
Question Answering (experimental)
textTranslate()
Translation (experimental)
textZeroShot()
Zero Shot Classification (experimental)

The text-train functions

textTrain()
Trains word embeddings
textTrainLists()
Train lists of word embeddings
textTrainRegression()
Train word embeddings to a numeric variable.
textTrainRandomForest()
Trains word embeddings using random forest
textTrainN()
Cross-validated accuracies across sample sizes
textTrainNPlot()
Plot cross-validated accuracies across sample sizes

The text-predict functions

textPredict() textAssess() textClassify()
textPredict, textAssess and textClassify
textPredictTest()
Significance testing of correlations. If only y1 is provided, a t-test is computed between the absolute errors from yhat1-y1 and yhat2-y1.
textPredictAll()
Predict from several models, selecting the correct input
textLBAM()
The LBAM library

Semantic similarities and distances functions

textSimilarity()
Semantic Similarity
textDistance()
Semantic distance
textSimilarityMatrix()
Semantic similarity across multiple word embeddings
textDistanceMatrix()
Semantic distance across multiple word embeddings
textSimilarityNorm()
Semantic similarity between a text variable and a word norm
textDistanceNorm()
Semantic distance between a text variable and a word norm

Visualise words in the word embedding space

textProjection()
Supervised Dimension Projection
textPlot()
Plot words
textProjectionPlot()
Plot Supervised Dimension Projection
textCentrality()
Semantic similarity score between single words' embeddings and an aggregated word embedding
textCentralityPlot()
Plots words from textCentrality()
textPCA()
textPCA()
textPCAPlot()
textPCAPlot

BERTopics

textTopics()
BERTopics
textTopicsTest()
Wrapper for topicsTest function from the topics package
textTopicsWordcloud()
Plot word clouds
textTopicsReduce()
textTopicsReduce (EXPERIMENTAL)
textTopicsTree()
textTopicsTree (EXPERIMENTAL): get the hierarchical topic tree

View or delete downloaded HuggingFace models

textModels()
Check downloaded, available models.
textModelLayers()
Number of layers
textModelsRemove()
Delete a specified model

Miscellaneous

textDescriptives()
Compute descriptive statistics of character variables.
textClean()
Removes standard personal information from text
textTokenize()
Tokenize text variables
textTokenizeAndCount()
Tokenize and count
textDomainCompare()
Compare two language domains
textFindNonASCII()
Detect non-ASCII characters
textCleanNonASCII()
Clean non-ASCII characters

Example Data

Language_based_assessment_data_8
Text and numeric data for 10 participants.
word_embeddings_4
Word embeddings for 4 text variables for 40 participants
raw_embeddings_1
Word embeddings from the textEmbedRawLayers() function
Language_based_assessment_data_3_100
Example text and numeric data.
DP_projections_HILS_SWLS_100
Data for plotting a Dot Product Projection Plot.
centrality_data_harmony
Example data for plotting a Semantic Centrality Plot.
PC_projections_satisfactionwords_40
Example data for plotting a Principal Component Projection Plot.
