There are some bits of code that I constantly forget, despite needing them all the time. I decided to keep a running list of them all, and thought I would make it public in case they are useful to anyone else. (I can’t guarantee these are the most efficient—or even the right—ways to do any of these things!)
Most of these use the data.table package. You can install it and load it with these commands:
install.packages("data.table")
library(data.table)
Creating a Median Split
d[, MedianSplitVariable := ifelse(ContinuousVariableUsedForSplit > MedianValue, 1, 0)]
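The line above assumes MedianValue already exists. Here is a minimal sketch of the whole thing on made-up data; the column names PPT, RT, and HighRT are just placeholders I picked for the example.

```r
library(data.table)

# Hypothetical toy data; RT stands in for the continuous variable
d <- data.table(PPT = 1:6, RT = c(310, 450, 520, 610, 700, 980))

# Compute the median once, then flag every value above it
MedianValue <- median(d$RT)
d[, HighRT := ifelse(RT > MedianValue, 1, 0)]

d
```

Values exactly equal to the median end up in the 0 group because the comparison is a strict greater-than.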
Scale a Range of Predictors
This would scale all of the predictors in a dataset called d, in columns 2 to 34.
preds <- colnames(d[, 2:34])
d <- d[, (preds) := lapply(.SD, scale), .SDcols=preds]
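A small sketch of the same pattern on a toy dataset, so you can check the result. The columns X1 and X2 are made up, and the as.numeric() wrapper is my own addition: scale() returns a one-column matrix, and wrapping it keeps each column a plain numeric vector.

```r
library(data.table)

# Hypothetical toy data with two predictors
d <- data.table(ID = 1:4, X1 = c(1, 2, 3, 4), X2 = c(10, 20, 30, 40))

preds <- c("X1", "X2")

# Centre and scale each predictor in place
d[, (preds) := lapply(.SD, function(x) as.numeric(scale(x))), .SDcols = preds]

# Each scaled column should now have mean 0 and SD 1
d[, lapply(.SD, mean), .SDcols = preds]
d[, lapply(.SD, sd), .SDcols = preds]
```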
vlookup in R
This does the equivalent of an Excel VLOOKUP in R.
data$var <- data2[match(data$matchv, data2$matchv),]$var
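Here is a toy version with made-up tables, mainly to show what happens when a key has no match. It only uses base R.

```r
# Hypothetical tables: data needs values looked up from data2
data  <- data.frame(matchv = c("b", "a", "c", "z"), stringsAsFactors = FALSE)
data2 <- data.frame(matchv = c("a", "b", "c"),
                    var    = c(10, 20, 30),
                    stringsAsFactors = FALSE)

# match() returns the row of data2 for each key in data (NA if absent)
data$var <- data2[match(data$matchv, data2$matchv), ]$var

data
```

The key "z" is not in data2, so it gets NA. Note that match() only returns the first matching row, which is also how VLOOKUP behaves.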
Set a Reference Level for a Categorical Predictor
d <- within(d, Shape <- relevel(Shape, ref = "S"))
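A quick way to confirm it worked is to look at the level order before and after, since the first level is the reference. The data here are made up.

```r
# Hypothetical data; levels default to alphabetical order
d <- data.frame(Shape = factor(c("C", "S", "T", "S")))
levels_before <- levels(d$Shape)

# Move "S" to the front so it becomes the reference level
d <- within(d, Shape <- relevel(Shape, ref = "S"))
levels_after <- levels(d$Shape)

levels_before
levels_after
```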
Label Levels of a Factor
Variable <- factor(variable, levels = c("l1", "l2"), labels = c("blue", "red"))
Renaming Columns
This would rename column Participant.Private.ID to simply PPT.
setnames(d, "Participant.Private.ID", "PPT")
Reshaping Data
Long to wide.
dcast(data, identifier ~ grouping, value.var = "DV", fun.aggregate = mean)
Wide to long.
melt(data, id.vars = , measure.vars = , variable.name = , value.name = )
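A filled-in sketch of both directions on toy data. The names PPT, Condition, DV, and MeanDV are placeholders I made up.

```r
library(data.table)

# Hypothetical long data: two trials per participant per condition
long <- data.table(
  PPT       = rep(1:2, each = 4),
  Condition = rep(c("A", "A", "B", "B"), times = 2),
  DV        = c(1, 3, 5, 7, 2, 4, 6, 8)
)

# Long to wide: one row per participant, one column per condition,
# averaging over the repeated trials
wide <- dcast(long, PPT ~ Condition, value.var = "DV", fun.aggregate = mean)

# Wide to long: stack the condition columns back up
back <- melt(wide, id.vars = "PPT", measure.vars = c("A", "B"),
             variable.name = "Condition", value.name = "MeanDV")

wide
back
```

The round trip does not recover the original rows, because dcast has already averaged the trials.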
Plotting Interaction Effects from LMER or GLMER Models
The best solution I’ve found uses the interactions package.
For categorical predictors:
cat_plot(model, pred = , modx = )
For at least one continuous predictor:
interact_plot(model, pred = , modx = )
Effects Code a Variable
contrasts(d$Condition) <- c(-.5, .5)
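This only works as written for a factor with exactly two levels, and the values are assigned in level order. A small check on made-up data:

```r
# Hypothetical two-level factor; levels are Control, then Treatment
d <- data.frame(Condition = factor(c("Control", "Treatment",
                                     "Control", "Treatment")))

# First level gets -0.5, second level gets 0.5
contrasts(d$Condition) <- c(-.5, .5)

# Inspect the coding that models will now use
contrasts(d$Condition)
```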
Get Rid of Scientific Notation
Makes R prefer fixed notation unless the fixed form would be more than 7 characters wider than the scientific form.
options(scipen=7)
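To see the difference, compare how the same number is formatted under the default setting (scipen = 0) and with the penalty raised:

```r
# Default: scientific notation wins whenever it is narrower
options(scipen = 0)
before <- format(1e-5)   # "1e-05"

# Penalise scientific notation by 7 characters of width
options(scipen = 7)
after <- format(1e-5)    # "0.00001"

before
after
```

The setting lasts for the rest of the session; options(scipen = 0) restores the default.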
Switch Optimizer/Increase Iterations for Mixed-Effects Models
m <- lmer(DV ~ IV + (1 + IV | Subject), data = d, control=lmerControl(optimizer="bobyqa",optCtrl=list(maxfun=100000)))
Load Copied Excel Data into R
Copy the Excel data you want to load. Then run the following. (pbpaste is a macOS command, so this only works on a Mac.)
my_data <- read.table(pipe("pbpaste"), sep="\t", header = TRUE)
Extract a Range of Characters
Uses the stringr package. The code below would extract the second to fourth characters, inclusive, from a column named stringcolumn in a dataset called d.
str_sub(d$stringcolumn, 2, 4)
The code below would extract the first character only.
str_sub(d$stringcolumn, 1, 1)
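A short sketch on made-up strings, including one that is shorter than the requested range. The last line uses negative positions, which str_sub counts from the end of the string.

```r
library(stringr)

# Hypothetical data
d <- data.frame(stringcolumn = c("abcdef", "xy", "12345"),
                stringsAsFactors = FALSE)

middle <- str_sub(d$stringcolumn, 2, 4)    # "bcd" "y"  "234"
first  <- str_sub(d$stringcolumn, 1, 1)    # "a"   "x"  "1"
last2  <- str_sub(d$stringcolumn, -2, -1)  # "ef"  "xy" "45"
```

Strings that are too short are simply truncated rather than producing an error.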
Extract Text Between Underscores
If you have a column in a data.table with values like this 23_Male_Canada, you can use this code to create three separate columns for the text.
d[, c("Age", "Gender", "Country") := tstrsplit(VARIABLENAME, "_", fixed = TRUE)]
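A runnable sketch with a made-up column called Info standing in for VARIABLENAME. The new columns all come out as character, so I convert Age afterwards; that extra step is my own addition.

```r
library(data.table)

# Hypothetical data with underscore-delimited values
d <- data.table(Info = c("23_Male_Canada", "31_Female_France"))

# Split on the literal underscore and spread the pieces into columns
d[, c("Age", "Gender", "Country") := tstrsplit(Info, "_", fixed = TRUE)]

# The pieces are character strings, so convert Age to a number
d[, Age := as.integer(Age)]

d
```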