```
library(faux)
library(dplyr)
library(tidyr)
library(ggplot2)
library(cowplot) # for multi-panel plots
```

```
dat <- sim_design(within = list(vars = c("dv", "predictor")),
                  mu = list(dv = 100, predictor = 0),
                  sd = list(dv = 10, predictor = 1),
                  r = 0.5, plot = FALSE)
```

Here, pred1 is correlated r = 0.5 with the DV, pred2 is correlated r = 0.0 with the DV, and pred1 and pred2 are correlated r = -0.2 with each other.

```
dat <- sim_design(within = list(vars = c("dv", "pred1", "pred2")),
                  mu = list(dv = 100, pred1 = 0, pred2 = 0),
                  sd = list(dv = 10, pred1 = 1, pred2 = 1),
                  r = c(0.5, 0, -0.2), plot = FALSE)
```
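To see which pair of variables each value in `r` refers to, it can help to expand the vector into the full correlation matrix. The sketch below uses base R only and assumes the values are given in upper-triangle order, read row by row, as in the description above. The names `vars`, `r_vec` and `r_mat` are just for illustration.

```r
vars  <- c("dv", "pred1", "pred2")
r_vec <- c(0.5, 0, -0.2)  # dv-pred1, dv-pred2, pred1-pred2

# start from an identity matrix (each variable correlates 1 with itself)
r_mat <- diag(length(vars))
dimnames(r_mat) <- list(vars, vars)

# filling the lower triangle column by column visits the pairs in the
# same order as reading the upper triangle row by row
r_mat[lower.tri(r_mat)] <- r_vec

# mirror the lower triangle into the upper triangle
r_mat[upper.tri(r_mat)] <- t(r_mat)[upper.tri(r_mat)]

r_mat
```

Printing `r_mat` makes it easy to confirm that, for example, the -0.2 ended up between pred1 and pred2 rather than between pred2 and the DV.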

If the continuous predictors are within-subjects (e.g., dv and predictor are measured at pre- and post-test), you can set it up like below.

The correlation matrix can start getting tricky, so I usually map out the upper right triangle of the correlation matrix separately. Here, the dv and predictor are correlated 0.0 in the pre-test and 0.5 in the post-test. The dv is correlated 0.8 between pre- and post-test and the predictor is correlated 0.3 between pre- and post-test. There is no correlation between the pre-test predictor and the post-test dv, but I'm not sure what values are possible for the correlation between the post-test predictor and pre-test dv, so I can set that to NA and use the `pos_def_limits` function to determine the range of possible correlations (given the existing correlation structure). Those range from -0.08 to 0.88, so I'll set the value to the mean of those limits (0.4).

```
#      pre_pred, post_dv, post_pred
r <- c(     0.0,     0.8,        NA, # pre_dv
                     0.0,       0.3, # pre_pred
                                0.5) # post_dv

# find the range of values the NA can take, then use the midpoint
lim <- faux::pos_def_limits(r)
r[[3]] <- mean(c(lim$min, lim$max))

dat <- sim_design(within = list(time = c("pre", "post"),
                                vars = c("dv", "pred")),
                  mu = list(pre_dv = 100, pre_pred = 0,
                            post_dv = 110, post_pred = 0.1),
                  sd = list(pre_dv = 10, pre_pred = 1,
                            post_dv = 10, post_pred = 1),
                  r = r, plot = FALSE)
```
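If you want to see where limits like these come from, one way is to try a grid of candidate values and keep the ones that give a valid (positive definite) correlation matrix, meaning all of its eigenvalues are positive. This is a base R sketch of that idea, not a description of how `pos_def_limits` works internally. The helper `make_cor` and the 0.01 grid step are assumptions made for illustration.

```r
# hypothetical helper: build the 4 x 4 matrix for a candidate value x
# of the unknown correlation between pre_dv and post_pred
make_cor <- function(x) {
  vars <- c("pre_dv", "pre_pred", "post_dv", "post_pred")
  m <- diag(4)
  dimnames(m) <- list(vars, vars)
  m[lower.tri(m)] <- c(0.0, 0.8, x,  # pre_dv with pre_pred, post_dv, post_pred
                       0.0, 0.3,     # pre_pred with post_dv, post_pred
                       0.5)          # post_dv with post_pred
  m[upper.tri(m)] <- t(m)[upper.tri(m)]
  m
}

# try every value from -1 to 1 in steps of 0.01
candidates <- seq(-1, 1, by = 0.01)

# a correlation matrix is valid if its smallest eigenvalue is positive
is_valid <- sapply(candidates, function(x) {
  min(eigen(make_cor(x), symmetric = TRUE, only.values = TRUE)$values) > 0
})

range(candidates[is_valid])
```

With this grid, the smallest and largest valid candidates should line up with the limits quoted above.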

You have to make this sort of dataset in wide format and then manually convert it to long. I prefer `gather` and `spread`, but I'm trying to learn the new pivot functions, so I'll use them here.

```
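long_dat <- dat %>%
  # stack every column except id into var/value pairs
  pivot_longer(-id, names_to = "var", values_to = "value") %>%
  # split names like "pre_dv" into time ("pre") and var ("dv")
  separate(var, c("time", "var")) %>%
  # spread dv and pred back out into their own columns
  pivot_wider(names_from = var, values_from = value)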
```
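The same reshaping can also be done in a single `pivot_longer` call by using the special `".value"` name together with `names_sep`. The sketch below shows this on a tiny made-up data frame (`wide`, with invented values) so you can see exactly what happens to each column. It assumes the column names follow the `time_variable` pattern used above.

```r
library(tidyr)

# toy wide-format data (made-up values, two subjects)
wide <- data.frame(
  id        = c("S1", "S2"),
  pre_dv    = c(98, 105),
  pre_pred  = c(-0.3, 0.4),
  post_dv   = c(109, 112),
  post_pred = c(0.1, 0.6)
)

# the part of each column name before "_" goes into the "time" column;
# ".value" means the part after "_" (dv / pred) becomes its own column
long <- pivot_longer(wide, -id,
                     names_to = c("time", ".value"),
                     names_sep = "_")
long
```

The result has one row per subject per time point, with `dv` and `pred` side by side, which is the layout most modelling functions expect.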

In this design, the DV is 10 higher for group B than group A and the correlation between the predictor and DV is 0.5 for group A and 0.0 for group B.

```
dat <- sim_design(between = list(group = c("A", "B")),
                  within = list(vars = c("dv", "predictor")),
                  mu = list(A = c(dv = 100, predictor = 0),
                            B = c(dv = 110, predictor = 0)),
                  sd = list(A = c(dv = 10, predictor = 1),
                            B = c(dv = 10, predictor = 1)),
                  r = list(A = 0.5, B = 0), plot = FALSE)
```
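A quick way to confirm that the groups came out as intended is to summarise the simulated data by group. In the sketch below I've added `empirical = TRUE` (not used in the example above) so the sample statistics match the requested values exactly rather than just on average. The object names `check` and `group_stats` are just for illustration.

```r
library(faux)
library(dplyr)

# same design as above, but with empirical = TRUE so that the sample
# means, SDs and correlations equal the requested parameters
check <- sim_design(between = list(group = c("A", "B")),
                    within = list(vars = c("dv", "predictor")),
                    mu = list(A = c(dv = 100, predictor = 0),
                              B = c(dv = 110, predictor = 0)),
                    sd = list(A = c(dv = 10, predictor = 1),
                              B = c(dv = 10, predictor = 1)),
                    r = list(A = 0.5, B = 0),
                    empirical = TRUE, plot = FALSE)

# mean and SD of the DV, and the dv-predictor correlation, per group
group_stats <- check %>%
  group_by(group) %>%
  summarise(mean_dv = mean(dv),
            sd_dv   = sd(dv),
            r       = cor(dv, predictor))
group_stats
```

Without `empirical = TRUE`, the same summary will wobble around the requested values from one simulation to the next.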

If you already have a dataset and want to add a continuous predictor, you can make a new column with a specified mean, SD and correlation to one other column.

First, let's make a simple dataset with one between-subject factor.

```
dat <- sim_design(between = list(group = c("A", "B")),
                  mu = list(A = 100, B = 120), sd = 10, plot = FALSE)
```

Now we can add a continuous predictor with `rnorm_pre` by specifying the vector it should be correlated with, the mean, and the SD. By default, this produces values sampled from a population with that mean, SD and r. If you set `empirical` to TRUE, the resulting vector will have that sample mean, SD and r.

`dat$pred <- rnorm_pre(dat$y, 0, 1, 0.5)`
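To make the idea of an exact sample mean, SD and r more concrete, here is a base R sketch of one common way to build a vector with an exact sample correlation to an existing one. I'm not claiming this is how `rnorm_pre` is implemented. The helper `cor_vector` and its argument names are hypothetical.

```r
set.seed(1)  # only so this sketch is reproducible

# hypothetical helper, not part of faux
cor_vector <- function(x, mu = 0, sigma = 1, r = 0) {
  # standardise x to sample mean 0 and SD 1
  x_std <- as.vector(scale(x))

  # random noise with everything related to x regressed out, so it has
  # a sample correlation of exactly 0 with x
  noise <- resid(lm(rnorm(length(x)) ~ x))
  noise_std <- noise / sd(noise)

  # mixing the two parts with these weights gives SD 1 and correlation r
  y <- r * x_std + sqrt(1 - r^2) * noise_std

  # rescale to the requested mean and SD
  as.vector(mu + sigma * y)
}

y    <- rnorm(50, 100, 10)
pred <- cor_vector(y, mu = 0, sigma = 1, r = 0.5)

c(mean = mean(pred), sd = sd(pred), r = cor(y, pred))
```

Because the noise is made exactly uncorrelated with `y` before mixing, the sample statistics hit their targets exactly, which is the behaviour described for `empirical = TRUE`.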

If you want to set a different mean, SD or r for the between-subject groups, you can split and re-merge the dataset (or use your data wrangling skills to devise a more elegant way using purrr).

```
A <- filter(dat, group == "A") %>%
  mutate(pred = rnorm_pre(y, 0, 1, -0.5))
B <- filter(dat, group == "B") %>%
  mutate(pred = rnorm_pre(y, 0, 1, 0.5))
dat <- bind_rows(A, B)
```
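One possible way to avoid the split and re-merge is a grouped `mutate` that looks up each group's correlation from a named vector. This uses dplyr's `group_by` rather than purrr, and the lookup vector `r_by_group` is a hypothetical name. I've set `empirical = TRUE` here only so that the check at the end gives exact values.

```r
library(faux)
library(dplyr)

dat <- sim_design(between = list(group = c("A", "B")),
                  mu = list(A = 100, B = 120), sd = 10, plot = FALSE)

# hypothetical lookup: one target correlation per group
r_by_group <- c(A = -0.5, B = 0.5)

dat <- dat %>%
  group_by(group) %>%
  # within each group, pick that group's r and generate the predictor
  mutate(pred = rnorm_pre(y, 0, 1,
                          r = r_by_group[[as.character(first(group))]],
                          empirical = TRUE)) %>%
  ungroup()

# check the correlation within each group
by_group <- dat %>%
  group_by(group) %>%
  summarise(r = cor(y, pred))
by_group
```

This scales to any number of groups as long as `r_by_group` has one named entry per group level.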

You can also specify correlations to more than one vector by setting the first argument to a data frame containing only the continuous columns and r to the correlation with each column.

```
dat <- sim_design(2, r = 0.5, plot = FALSE)
dat$B <- rnorm_pre(dat[, 2:3], r = c(A1 = 0.5, A2 = 0))
cor(dat[, 2:4])
#>           A1        A2         B
#> A1 1.0000000 0.2736529 0.6455051
#> A2 0.2736529 1.0000000 0.1352178
#> B  0.6455051 0.1352178 1.0000000
```

Not all correlation patterns are possible, so you'll get a warning if the correlations you ask for are impossible.

```
dat$C <- rnorm_pre(dat[, 2:4], r = c(A1 = 0.9, A2 = 0.9, B = -0.9))
#> Warning in rnorm_pre(dat[, 2:4], r = c(A1 = 0.9, A2 = 0.9, B = -0.9)):
#> Correlations are impossible.
```
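You can see why this request is impossible by writing out the full correlation matrix it implies and checking its eigenvalues: a valid correlation matrix cannot have a negative one. The sketch below is base R only and uses the existing correlations from the output above, rounded to two decimals. The object names are just for illustration.

```r
vars <- c("A1", "A2", "B", "C")
m <- diag(4)
dimnames(m) <- list(vars, vars)

# existing correlations (rounded from the output above) plus the
# requested correlations with the new column C
m[lower.tri(m)] <- c(0.27, 0.65, 0.9,  # A1 with A2, B, C
                     0.14, 0.9,        # A2 with B, C
                     -0.9)             # B with C
m[upper.tri(m)] <- t(m)[upper.tri(m)]

eig <- eigen(m, symmetric = TRUE, only.values = TRUE)$values
eig
min(eig) > 0  # FALSE means the requested pattern cannot exist
```

Intuitively, C cannot be strongly positively related to A1 and strongly negatively related to B when A1 and B are themselves positively correlated.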