There are some bits of code that I constantly forget, despite needing them all the time. I decided to keep a running list of them all, and thought I would make it public in case they are useful to anyone else. (I can’t guarantee these are the most efficient—or even the right—ways to do any of these things!)

Most of these use the data.table package. You can install it and load it with these commands:

install.packages("data.table")

library(data.table)

Creating a Median Split

d[, MedianSplitVariable := ifelse(ContinuousVariableUsedForSplit > MedianValue, 1, 0)]
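Here's a small sketch of the same idea with made-up data, in case seeing it end to end helps. The column names RT and HighRT are just placeholders I invented, and I compute MedianValue from the column itself first.

```r
library(data.table)

# Hypothetical example data: six reaction times
d <- data.table(RT = c(310, 450, 520, 610, 700, 830))

# Compute the median once, then flag rows strictly above it
MedianValue <- median(d$RT, na.rm = TRUE)
d[, HighRT := ifelse(RT > MedianValue, 1, 0)]
```

Note that values exactly equal to the median land in the 0 group, because the comparison is a strict greater-than.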

Scale a Range of Predictors

This would scale all of the predictors in a dataset called d, in columns 2 to 34.

preds <- colnames(d[, 2:34])

d <- d[, (preds) := lapply(.SD, scale), .SDcols=preds]
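A tiny version with made-up columns (x1 and x2 are hypothetical names) looks like this. One thing I've added here that isn't in the snippet above: scale() returns a one-column matrix, so I wrap it in as.numeric() to keep plain numeric columns. That's my own preference, not a requirement.

```r
library(data.table)

# Hypothetical data: an id column plus two predictors
d <- data.table(id = 1:4, x1 = c(1, 2, 3, 4), x2 = c(10, 20, 30, 40))

# Names of the columns to scale (here columns 2 and 3)
preds <- colnames(d)[2:3]

# Center and scale each predictor in place; as.numeric() drops the matrix attributes
d[, (preds) := lapply(.SD, function(x) as.numeric(scale(x))), .SDcols = preds]
```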

vlookup in R

This does the equivalent of Excel's VLOOKUP in R: it pulls var from data2 into data by matching rows on the matchv column.

data$var <- data2[match(data$matchv, data2$matchv),]$var
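To see what the match() line is doing, here is a minimal sketch with two invented data frames. The values are made up; the point is that keys with no match come back as NA.

```r
# Hypothetical tables: `data` holds the keys, `data2` is the lookup table
data  <- data.frame(matchv = c("b", "a", "z"), stringsAsFactors = FALSE)
data2 <- data.frame(matchv = c("a", "b", "c"), var = c(10, 20, 30),
                    stringsAsFactors = FALSE)

# match() gives, for each key in data, its row position in data2 (NA if absent)
data$var <- data2[match(data$matchv, data2$matchv), ]$var
```

If a key appears more than once in data2, match() only uses the first occurrence, which is also how VLOOKUP behaves.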

Set a Reference Level for a Categorical Predictor

d <- within(d, Shape <- relevel(Shape, ref = "S"))
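A quick sketch with a made-up Shape factor shows the effect on the level order (the levels "C", "S", and "T" are invented for illustration):

```r
# Hypothetical data: a factor with three shapes
d <- data.frame(Shape = factor(c("C", "S", "T", "S")))

# By default the levels are sorted alphabetically, so "C" is the reference
levels_before <- levels(d$Shape)

# Move "S" to the front so it becomes the reference level
d <- within(d, Shape <- relevel(Shape, ref = "S"))
levels_after <- levels(d$Shape)
```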

Label Levels of a Factor

variable <- factor(variable, levels = c("l1", "l2"), labels = c("blue", "red"))
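As a worked sketch with made-up values: the labels need to be quoted strings, and listing the levels explicitly makes it obvious which label goes with which level.

```r
# Hypothetical raw codes
variable <- c("l1", "l2", "l2", "l1")

# Map level "l1" to the label "blue" and "l2" to "red"
variable <- factor(variable, levels = c("l1", "l2"), labels = c("blue", "red"))
```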

Renaming Columns

This would rename the column Participant.Private.ID to simply PPT.

setnames(d, "Participant.Private.ID", "PPT")

Reshaping Data

Long to wide.

dcast(data, identifier ~ grouping, value.var = "DV", fun.aggregate = mean)
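Here's a small sketch with invented numbers. Each identifier has more than one row for some groups, which is where fun.aggregate = mean comes in: repeated values are averaged into a single cell.

```r
library(data.table)

# Hypothetical long data: identifier 1 has two "A" rows, identifier 2 has two "B" rows
data <- data.table(
  identifier = c(1, 1, 1, 2, 2, 2),
  grouping   = c("A", "A", "B", "A", "B", "B"),
  DV         = c(2, 4, 10, 6, 1, 3)
)

# One row per identifier, one column per level of grouping, cells hold the mean DV
wide <- dcast(data, identifier ~ grouping, value.var = "DV", fun.aggregate = mean)
```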

Wide to long.

melt(data, id.vars = , measure.vars = , variable.name = , value.name = )
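And a filled-in sketch of the reverse direction, again with made-up data. The column names identifier, A, B, grouping, and DV are placeholders.

```r
library(data.table)

# Hypothetical wide data: one row per identifier, one column per group
wide <- data.table(identifier = c(1, 2), A = c(3, 6), B = c(10, 2))

# Stack columns A and B into a single DV column, tracking the source in `grouping`
long <- melt(wide,
             id.vars       = "identifier",
             measure.vars  = c("A", "B"),
             variable.name = "grouping",
             value.name    = "DV")
```

The grouping column comes back as a factor by default, so convert it with as.character() if you need plain strings.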

Plotting Interaction Effects from LMER or GLMER Models

The best solution I’ve found uses the interactions package.

For categorical predictors:

cat_plot(model, pred = , modx = )

For at least one continuous predictor:

interact_plot(model, pred = , modx = )

Effects Code a Variable

contrasts(d$Condition) <- c(-.5, .5)
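This only works as written when the factor has exactly two levels. A minimal sketch with a made-up Condition factor (the level names are invented):

```r
# Hypothetical two-level factor; levels sort alphabetically: "control", "treatment"
d <- data.frame(Condition = factor(c("control", "treatment", "control", "treatment")))

# First level is coded -0.5, second level +0.5
contrasts(d$Condition) <- c(-.5, .5)

# Inspect the coding that models will use
coding <- contrasts(d$Condition)
```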

Get Rid of Scientific Notation

Sets a penalty so that R prefers fixed notation unless the fixed representation would be more than 7 digits wider than the scientific one.

options(scipen=7)
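A quick way to check the effect is to format the same small number before and after changing the option. This is just a sketch; I reset to R's default of 0 first and restore the previous setting at the end.

```r
# Start from R's default penalty, saving whatever was set before
old <- options(scipen = 0)
before <- format(1e-07)   # printed in scientific notation

# With a penalty of 7, fixed notation wins for this value
options(scipen = 7)
after <- format(1e-07)    # printed in fixed notation

# Restore the original setting
options(old)
```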

Switch Optimizer/Increase Iterations for Mixed-Effects Models

m <- lmer(DV ~ IV + (1 + IV | Subject), data = d, control=lmerControl(optimizer="bobyqa",optCtrl=list(maxfun=100000)))

Load Copied Excel Data into R

Copy the Excel data you want to load, then run the following. This relies on the pbpaste command, so it only works on macOS.

my_data <- read.table(pipe("pbpaste"), sep="\t", header = TRUE)

Extract a Range of Characters

Uses the stringr package. The code below would extract the second to fourth characters, inclusive, from a column named stringcolumn in a dataset called d.

str_sub(d$stringcolumn, 2, 4)

The code below would extract the first character only.

str_sub(d$stringcolumn, 1, 1)
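Here are both calls on a made-up column, so you can see what comes back. The strings are invented; note that asking for characters past the end of a short string just returns whatever is there.

```r
library(stringr)

# Hypothetical data with one long and one short string
d <- data.frame(stringcolumn = c("abcdef", "xyz"), stringsAsFactors = FALSE)

# Characters 2 through 4, inclusive
middle <- str_sub(d$stringcolumn, 2, 4)

# First character only
first <- str_sub(d$stringcolumn, 1, 1)
```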

Extract Text Between Underscores

If you have a column in a data.table with values like 23_Male_Canada, you can use this code to split the text into three separate columns.

d[, c("Age", "Gender", "Country") := tstrsplit(VARIABLENAME, "_", fixed = TRUE)]
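A runnable sketch with two made-up rows is below. Info is a hypothetical column name standing in for VARIABLENAME. The new columns all come back as character, so I convert Age afterwards; that last step is my addition.

```r
library(data.table)

# Hypothetical data: one combined column
d <- data.table(Info = c("23_Male_Canada", "31_Female_Mexico"))

# Split on the literal underscore and assign the pieces to three new columns
d[, c("Age", "Gender", "Country") := tstrsplit(Info, "_", fixed = TRUE)]

# tstrsplit() returns character vectors, so convert Age to integer
d[, Age := as.integer(Age)]
```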