Issue with inclusion of "weights" #74

edubwalsh · 2023-09-08T15:55:04Z

Greetings, when running the program with lmer function (and formula for multilevel modeling), the output changes based on the inclusion (or exclusion) of "weights").

For example, when running the following, the mean, +1SD, -1SD will be incorrectly calculated:

Example 1:

      mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                       fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                       LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                       data = mod_pred, weights = weights, REML = TRUE) 

      # JN analysis and plots:
      jn <- probe_interaction(model=mod_jn, 
                       pred = LupTime,
                       modx = fMRI_Diff_RPut_FB_Slow_Win,
                       digits = getOption("jtools-digits", 2),
                       jnplot = TRUE,
                       confint = TRUE
                       )

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.2538121 (- 1 SD): 

   Est.   S.E.    2.5%   97.5%   t val.      p
------- ------ ------- ------- -------- ------
  -0.28   0.09   -0.46   -0.09    -2.97   0.00

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.5633464 (Mean): 

   Est.   S.E.    2.5%   97.5%   t val.      p
------- ------ ------- ------- -------- ------
  -0.45   0.06   -0.57   -0.33    -7.20   0.00

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.8728807 (+ 1 SD): 

   Est.   S.E.    2.5%   97.5%   t val.      p
------- ------ ------- ------- -------- ------
  -0.62   0.12   -0.86   -0.38    -5.14   0.00

If I remove "weights" from the code, the mean, +1SD, -1SD will be CORRECTLY calculated and the plot appears as expected. this doesn't represent the complete formula for my model. Additionally, weighting seems most appropriate.

Example 2:

      mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                       fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                       LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                       data = mod_pred, REML = TRUE) 

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = -0.36811581 (- 1 SD): 

   Est.   S.E.    2.5%   97.5%   t val.      p
------- ------ ------- ------- -------- ------
  -0.12   0.14   -0.39    0.14    -0.91   0.37

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.07231536 (Mean): 

  Est.   S.E.    2.5%   97.5%   t val.      p
------ ------ ------- ------- -------- ------
  0.05   0.07   -0.09    0.20     0.73   0.47

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.51274654 (+ 1 SD): 

  Est.   S.E.    2.5%   97.5%   t val.      p
------ ------ ------- ------- -------- ------
  0.23   0.14   -0.04    0.50     1.69   0.09

Finally, if I leave the full formula and attempt to fix the mod.x values, the lines appear flipped -- that is, where -1 SD should be, +1 SD is, and vice versa, on the interaction plot. This is in contrast to both Example 2 (above) and ggplots (accounting for weights).

Example 3:

      jn <- probe_interaction(model=mod_jn, 
                       pred = LupTime,
                       modx = fMRI_Diff_RPut_FB_Slow_Win,
                       modx.values = c(-0.37, 0.07, 0.51),
                       modx.labels = c("-1 SD", "Mean", "+1 SD"),
                       digits = getOption("jtools-digits", 2),
                       jnplot = TRUE,
                       confint = TRUE
                       )

Any insight on what might be driving this discrepancy when using "weights" in the formula? Thank you for your time!

jacob-long · 2023-09-19T17:38:23Z

In my brief attempt to replicate the problem, I wasn't able to. However, I have a suspicion that you could check for me. Does this issue keep coming up if you use sim_slopes() instead of probe_interaction()?

edubwalsh · 2023-09-19T18:13:54Z

Thank you for the response and help! I appreciate it. Yes, unfortunately, the issue replicates when using sim_slopes() and interact_plot(). Below is output using sim_slopes().

Example 1, the mean and SDs are incorrectly calculated when using "weights" in the formula:

       mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                        fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                        LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                        data = mod_pred, weights = weights, REML = TRUE) 

       jn <- sim_slopes(model=mod_jn, 
                        pred = LupTime,
                        modx = fMRI_Diff_RPut_FB_Slow_Win
                        )
        print(jn)

Output from Example 1:

Warning: 0.872880688222861 is outside the observed range of fMRI_Diff_RPut_FB_Slow_Win
JOHNSON-NEYMAN INTERVAL 

When fMRI_Diff_RPut_FB_Slow_Win is INSIDE the interval [0.16, 63.81],
the slope of LupTime is p < .05.

Note: The range of observed values of fMRI_Diff_RPut_FB_Slow_Win is
[-1.20, 0.75]

SIMPLE SLOPES ANALYSIS 

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.2538121 (- 1 SD): 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.28   0.09    -2.97   0.00

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.5633464 (Mean): 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.45   0.06    -7.20   0.00

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.8728807 (+ 1 SD): 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.62   0.12    -5.14   0.00

Example 2, when removing weights from the formula (and with or without including REML = TRUE), the mean, SDs are correctly calculated, though the JN interval could not be found:

       mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                        fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                        LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                        data = mod_pred, REML = TRUE) 
 
       jn <- sim_slopes(model=mod_jn, 
                        pred = LupTime,
                        modx = fMRI_Diff_RPut_FB_Slow_Win
                        )
       print(jn)

Output for Example 2:

JOHNSON-NEYMAN INTERVAL 

The Johnson-Neyman interval could not be found. Is the p value for your
interaction term below the specified alpha?

SIMPLE SLOPES ANALYSIS 

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = -0.36811581 (- 1 SD): 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.12   0.14    -0.91   0.37

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.07231536 (Mean): 

  Est.   S.E.   t val.      p
------ ------ -------- ------
  0.05   0.07     0.73   0.47

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.51274654 (+ 1 SD): 

  Est.   S.E.   t val.      p
------ ------ -------- ------
  0.23   0.14     1.69   0.09

Example 3 uses weights and fixes the modx values. Though, the same issue is occuring, where -1 SD should be significant, but it is showing as +1 SD.

      mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                       fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                       LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                       data = mod_pred, weights = weights, REML = TRUE)  

      jn <- sim_slopes(model=mod_jn, 
                       pred = LupTime,
                       modx = fMRI_Diff_RPut_FB_Slow_Win,
                       modx.values = c(-0.37, 0.07, 0.51),
                       )
        print(jn)

Output from Example 3:

JOHNSON-NEYMAN INTERVAL 

When fMRI_Diff_RPut_FB_Slow_Win is INSIDE the interval [0.16, 63.81],
the slope of LupTime is p < .05.

Note: The range of observed values of fMRI_Diff_RPut_FB_Slow_Win is
[-1.20, 0.75]

SIMPLE SLOPES ANALYSIS 

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = -0.37: 

  Est.   S.E.   t val.      p
------ ------ -------- ------
  0.07   0.25     0.28   0.78

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.07: 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.17   0.14    -1.27   0.21

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.51: 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.42   0.06    -6.98   0.00

edubwalsh · 2023-10-16T21:10:55Z

Checking in to see if you have any additional thoughts on this issue? It does occur with another dataset in our lab, again, when using weights in the lmer model. We tried renaming "weights" column in the event it was conflicting with a variable in the source code, however, it did not change the outcome. Thank you!

jacob-long · 2024-01-07T19:41:48Z

Focusing for the moment on the issue of correctly calculating mean and SD, I am unable to replicate it. Here's my code to generate a fake multilevel dataset with sampling weights then do some basic analyses that don't seem to include mistakes:

library(dplyr)

# Set the seed for reproducibility
set.seed(123)

# Define the number of groups and the number of individuals per group
n_groups <- 10
n_individuals_per_group <- 20

# Create a multilevel dataset
data <- expand.grid(group_id = 1:n_groups, individual_id = 1:n_individuals_per_group)

# Generate some group-level and individual-level variables
data$gvar1 <- with(data, ave(group_id, group_id, FUN = function(x) rnorm(1, mean = 100, sd = 15)))
data$gvar2 <- with(data, ave(group_id, group_id, FUN = function(x) rnorm(1, mean = 100, sd = 15)))
data$ivar1 <- rnorm(nrow(data), mean = 100, sd = 15)
data$ivar2 <- data$ivar1 + data$gvar1 + (data$ivar1 * data$gvar1)/100 + rnorm(nrow(data), 0, 15)

# Generate sampling weights (these can be random but should vary)
data$sampling_weight <- runif(nrow(data), min = 0.5, max = 2)

library(lme4)
mod <- lmer(ivar2 ~ gvar1 * ivar1 + (1 | group_id), data = data)
summary(mod)

interact_plot(mod, ivar1, gvar1)
sim_slopes(mod, ivar1, gvar1)

modw <- lmer(ivar2 ~ gvar1 * ivar1 + (1 | group_id), data = data, weights = sampling_weight)
summary(modw)

interact_plot(modw, ivar1, gvar1)
sim_slopes(modw, ivar1, gvar1)

In the case of your output, I suppose my issue is that it's not clear that it is wrong about the slopes, means, SDs because the output doesn't tell me that. I expect it to be different depending on the inclusions of weights because then we are calculating weighted means and SDs. Of course, without the data I can't be sure whether those weighted means and SDs are in fact the correct weighted means and SDs, I can only check it on my contrived example where I can verify it works right.

I'll look into the issue about fixing the moderator values separately and report back.

jacob-long · 2024-01-07T19:54:45Z

Looking into the fixed values/labels thing, I'm not replicating an issue there either --- code below which, when run, seems to show the labels are applying correctly.

In the output you shared and the one plot for the weighted model, the sim_slopes() output and plot seem to agree; a negative slope at lower values of the moderator that becomes positive as the moderator increases.

library(dplyr)

# Set the seed for reproducibility
set.seed(123)

# Define the number of groups and the number of individuals per group
n_groups <- 10
n_individuals_per_group <- 20

# Create a multilevel dataset
data <- expand.grid(group_id = 1:n_groups, individual_id = 1:n_individuals_per_group)

# Generate some group-level and individual-level variables
data$gvar1 <- with(data, ave(group_id, group_id, FUN = function(x) rnorm(1, mean = 100, sd = 15)))
data$gvar2 <- with(data, ave(group_id, group_id, FUN = function(x) rnorm(1, mean = 100, sd = 15)))
data$ivar1 <- rnorm(nrow(data), mean = 100, sd = 15)
data$ivar2 <-(data$ivar1 + data$gvar1 + (data$ivar1 * data$gvar1)/100) + rnorm(nrow(data), 0, 15)

# Generate sampling weights (these can be random but should vary)
data$sampling_weight <- runif(nrow(data), min = 0.5, max = 2)

library(lme4)
mod <- lmer(ivar2 ~ gvar1 * ivar1 + (1 | group_id), data = data)
summary(mod)

interact_plot(mod, ivar1, gvar1)
interact_plot(mod, ivar1, gvar1, modx.values = c(85, 100, 115), modx.labels = c("-1", "0", "+1"))
sim_slopes(mod, ivar1, gvar1)

modw <- lmer(ivar2 ~ gvar1 * ivar1 + (1 | group_id), data = data, weights = sampling_weight)
summary(modw)

interact_plot(modw, ivar1, gvar1)
# roughly equivalent output
interact_plot(modw, ivar1, gvar1, modx.values = c(85, 100, 115), modx.labels = c("-1", "0", "+1"))
sim_slopes(modw, ivar1, gvar1)

edubwalsh added the bug Something isn't working label Sep 8, 2023

edubwalsh changed the title ~~Issue with inclusion of "weights" and "REML"~~ Issue with inclusion of "weights" Sep 8, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issue with inclusion of "weights" #74

Issue with inclusion of "weights" #74

edubwalsh commented Sep 8, 2023 •

edited

Loading

jacob-long commented Sep 19, 2023

edubwalsh commented Sep 19, 2023 •

edited

Loading

edubwalsh commented Oct 16, 2023

jacob-long commented Jan 7, 2024 •

edited

Loading

jacob-long commented Jan 7, 2024

Issue with inclusion of "weights" #74

Issue with inclusion of "weights" #74

Comments

edubwalsh commented Sep 8, 2023 • edited Loading

jacob-long commented Sep 19, 2023

edubwalsh commented Sep 19, 2023 • edited Loading

edubwalsh commented Oct 16, 2023

jacob-long commented Jan 7, 2024 • edited Loading

jacob-long commented Jan 7, 2024

edubwalsh commented Sep 8, 2023 •

edited

Loading

edubwalsh commented Sep 19, 2023 •

edited

Loading

jacob-long commented Jan 7, 2024 •

edited

Loading