Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with inclusion of "weights" #74

Open
edubwalsh opened this issue Sep 8, 2023 · 5 comments
Open

Issue with inclusion of "weights" #74

edubwalsh opened this issue Sep 8, 2023 · 5 comments
Labels
bug Something isn't working

Comments

@edubwalsh
Copy link

edubwalsh commented Sep 8, 2023

Greetings, when running the program with lmer function (and formula for multilevel modeling), the output changes based on the inclusion (or exclusion) of "weights").

For example, when running the following, the mean, +1SD, -1SD will be incorrectly calculated:

Example 1:

      mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                       fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                       LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                       data = mod_pred, weights = weights, REML = TRUE) 

      # JN analysis and plots:
      jn <- probe_interaction(model=mod_jn, 
                       pred = LupTime,
                       modx = fMRI_Diff_RPut_FB_Slow_Win,
                       digits = getOption("jtools-digits", 2),
                       jnplot = TRUE,
                       confint = TRUE
                       )

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.2538121 (- 1 SD): 

   Est.   S.E.    2.5%   97.5%   t val.      p
------- ------ ------- ------- -------- ------
  -0.28   0.09   -0.46   -0.09    -2.97   0.00

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.5633464 (Mean): 

   Est.   S.E.    2.5%   97.5%   t val.      p
------- ------ ------- ------- -------- ------
  -0.45   0.06   -0.57   -0.33    -7.20   0.00

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.8728807 (+ 1 SD): 

   Est.   S.E.    2.5%   97.5%   t val.      p
------- ------ ------- ------- -------- ------
  -0.62   0.12   -0.86   -0.38    -5.14   0.00

If I remove "weights" from the code, the mean, +1SD, -1SD will be CORRECTLY calculated and the plot appears as expected. this doesn't represent the complete formula for my model. Additionally, weighting seems most appropriate.

Example 2:

      mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                       fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                       LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                       data = mod_pred, REML = TRUE) 

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = -0.36811581 (- 1 SD): 

   Est.   S.E.    2.5%   97.5%   t val.      p
------- ------ ------- ------- -------- ------
  -0.12   0.14   -0.39    0.14    -0.91   0.37

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.07231536 (Mean): 

  Est.   S.E.    2.5%   97.5%   t val.      p
------ ------ ------- ------- -------- ------
  0.05   0.07   -0.09    0.20     0.73   0.47

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.51274654 (+ 1 SD): 

  Est.   S.E.    2.5%   97.5%   t val.      p
------ ------ ------- ------- -------- ------
  0.23   0.14   -0.04    0.50     1.69   0.09

Finally, if I leave the full formula and attempt to fix the mod.x values, the lines appear flipped -- that is, where -1 SD should be, +1 SD is, and vice versa, on the interaction plot. This is in contrast to both Example 2 (above) and ggplots (accounting for weights).

Example 3:

      jn <- probe_interaction(model=mod_jn, 
                       pred = LupTime,
                       modx = fMRI_Diff_RPut_FB_Slow_Win,
                       modx.values = c(-0.37, 0.07, 0.51),
                       modx.labels = c("-1 SD", "Mean", "+1 SD"),
                       digits = getOption("jtools-digits", 2),
                       jnplot = TRUE,
                       confint = TRUE
                       )

Any insight on what might be driving this discrepancy when using "weights" in the formula? Thank you for your time!

example2
example3

@edubwalsh edubwalsh added the bug Something isn't working label Sep 8, 2023
@edubwalsh edubwalsh changed the title Issue with inclusion of "weights" and "REML" Issue with inclusion of "weights" Sep 8, 2023
@jacob-long
Copy link
Owner

In my brief attempt to replicate the problem, I wasn't able to. However, I have a suspicion that you could check for me. Does this issue keep coming up if you use sim_slopes() instead of probe_interaction()?

@edubwalsh
Copy link
Author

edubwalsh commented Sep 19, 2023

Thank you for the response and help! I appreciate it. Yes, unfortunately, the issue replicates when using sim_slopes() and interact_plot(). Below is output using sim_slopes().

Example 1, the mean and SDs are incorrectly calculated when using "weights" in the formula:

       mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                        fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                        LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                        data = mod_pred, weights = weights, REML = TRUE) 

       jn <- sim_slopes(model=mod_jn, 
                        pred = LupTime,
                        modx = fMRI_Diff_RPut_FB_Slow_Win
                        )
        print(jn)

Output from Example 1:

Warning: 0.872880688222861 is outside the observed range of fMRI_Diff_RPut_FB_Slow_Win
JOHNSON-NEYMAN INTERVAL 

When fMRI_Diff_RPut_FB_Slow_Win is INSIDE the interval [0.16, 63.81],
the slope of LupTime is p < .05.

Note: The range of observed values of fMRI_Diff_RPut_FB_Slow_Win is
[-1.20, 0.75]

SIMPLE SLOPES ANALYSIS 

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.2538121 (- 1 SD): 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.28   0.09    -2.97   0.00

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.5633464 (Mean): 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.45   0.06    -7.20   0.00

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = 0.8728807 (+ 1 SD): 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.62   0.12    -5.14   0.00

Example 2, when removing weights from the formula (and with or without including REML = TRUE), the mean, SDs are correctly calculated, though the JN interval could not be found:

       mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                        fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                        LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                        data = mod_pred, REML = TRUE) 
 
       jn <- sim_slopes(model=mod_jn, 
                        pred = LupTime,
                        modx = fMRI_Diff_RPut_FB_Slow_Win
                        )
       print(jn)

Output for Example 2:

JOHNSON-NEYMAN INTERVAL 

The Johnson-Neyman interval could not be found. Is the p value for your
interaction term below the specified alpha?

SIMPLE SLOPES ANALYSIS 

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = -0.36811581 (- 1 SD): 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.12   0.14    -0.91   0.37

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.07231536 (Mean): 

  Est.   S.E.   t val.      p
------ ------ -------- ------
  0.05   0.07     0.73   0.47

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.51274654 (+ 1 SD): 

  Est.   S.E.   t val.      p
------ ------ -------- ------
  0.23   0.14     1.69   0.09

Example 3 uses weights and fixes the modx values. Though, the same issue is occuring, where -1 SD should be significant, but it is showing as +1 SD.

      mod_jn <- lmer(IDAS_illtemper_SUM_final ~ LupTime + BL_Mean_BOLD_RPut_FB_Slow_Win + 
                       fMRI_Diff_RPut_FB_Slow_Win + age + BL_Mean_BOLD_RPut_FB_Slow_Win * 
                       LupTime + fMRI_Diff_RPut_FB_Slow_Win * LupTime + (1 | Subject), 
                       data = mod_pred, weights = weights, REML = TRUE)  

      jn <- sim_slopes(model=mod_jn, 
                       pred = LupTime,
                       modx = fMRI_Diff_RPut_FB_Slow_Win,
                       modx.values = c(-0.37, 0.07, 0.51),
                       )
        print(jn)
        

Output from Example 3:

JOHNSON-NEYMAN INTERVAL 

When fMRI_Diff_RPut_FB_Slow_Win is INSIDE the interval [0.16, 63.81],
the slope of LupTime is p < .05.

Note: The range of observed values of fMRI_Diff_RPut_FB_Slow_Win is
[-1.20, 0.75]

SIMPLE SLOPES ANALYSIS 

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win = -0.37: 

  Est.   S.E.   t val.      p
------ ------ -------- ------
  0.07   0.25     0.28   0.78

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.07: 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.17   0.14    -1.27   0.21

Slope of LupTime when fMRI_Diff_RPut_FB_Slow_Win =  0.51: 

   Est.   S.E.   t val.      p
------- ------ -------- ------
  -0.42   0.06    -6.98   0.00

@edubwalsh
Copy link
Author

Checking in to see if you have any additional thoughts on this issue? It does occur with another dataset in our lab, again, when using weights in the lmer model. We tried renaming "weights" column in the event it was conflicting with a variable in the source code, however, it did not change the outcome. Thank you!

@jacob-long
Copy link
Owner

jacob-long commented Jan 7, 2024

Focusing for the moment on the issue of correctly calculating mean and SD, I am unable to replicate it. Here's my code to generate a fake multilevel dataset with sampling weights then do some basic analyses that don't seem to include mistakes:

library(dplyr)

# Set the seed for reproducibility
set.seed(123)

# Define the number of groups and the number of individuals per group
n_groups <- 10
n_individuals_per_group <- 20

# Create a multilevel dataset
data <- expand.grid(group_id = 1:n_groups, individual_id = 1:n_individuals_per_group)

# Generate some group-level and individual-level variables
data$gvar1 <- with(data, ave(group_id, group_id, FUN = function(x) rnorm(1, mean = 100, sd = 15)))
data$gvar2 <- with(data, ave(group_id, group_id, FUN = function(x) rnorm(1, mean = 100, sd = 15)))
data$ivar1 <- rnorm(nrow(data), mean = 100, sd = 15)
data$ivar2 <- data$ivar1 + data$gvar1 + (data$ivar1 * data$gvar1)/100 + rnorm(nrow(data), 0, 15)

# Generate sampling weights (these can be random but should vary)
data$sampling_weight <- runif(nrow(data), min = 0.5, max = 2)

library(lme4)
mod <- lmer(ivar2 ~ gvar1 * ivar1 + (1 | group_id), data = data)
summary(mod)

interact_plot(mod, ivar1, gvar1)
sim_slopes(mod, ivar1, gvar1)

modw <- lmer(ivar2 ~ gvar1 * ivar1 + (1 | group_id), data = data, weights = sampling_weight)
summary(modw)

interact_plot(modw, ivar1, gvar1)
sim_slopes(modw, ivar1, gvar1)

In the case of your output, I suppose my issue is that it's not clear that it is wrong about the slopes, means, SDs because the output doesn't tell me that. I expect it to be different depending on the inclusions of weights because then we are calculating weighted means and SDs. Of course, without the data I can't be sure whether those weighted means and SDs are in fact the correct weighted means and SDs, I can only check it on my contrived example where I can verify it works right.

I'll look into the issue about fixing the moderator values separately and report back.

@jacob-long
Copy link
Owner

Looking into the fixed values/labels thing, I'm not replicating an issue there either --- code below which, when run, seems to show the labels are applying correctly.

In the output you shared and the one plot for the weighted model, the sim_slopes() output and plot seem to agree; a negative slope at lower values of the moderator that becomes positive as the moderator increases.

library(dplyr)

# Set the seed for reproducibility
set.seed(123)

# Define the number of groups and the number of individuals per group
n_groups <- 10
n_individuals_per_group <- 20

# Create a multilevel dataset
data <- expand.grid(group_id = 1:n_groups, individual_id = 1:n_individuals_per_group)

# Generate some group-level and individual-level variables
data$gvar1 <- with(data, ave(group_id, group_id, FUN = function(x) rnorm(1, mean = 100, sd = 15)))
data$gvar2 <- with(data, ave(group_id, group_id, FUN = function(x) rnorm(1, mean = 100, sd = 15)))
data$ivar1 <- rnorm(nrow(data), mean = 100, sd = 15)
data$ivar2 <-(data$ivar1 + data$gvar1 + (data$ivar1 * data$gvar1)/100) + rnorm(nrow(data), 0, 15)

# Generate sampling weights (these can be random but should vary)
data$sampling_weight <- runif(nrow(data), min = 0.5, max = 2)

library(lme4)
mod <- lmer(ivar2 ~ gvar1 * ivar1 + (1 | group_id), data = data)
summary(mod)

interact_plot(mod, ivar1, gvar1)
interact_plot(mod, ivar1, gvar1, modx.values = c(85, 100, 115), modx.labels = c("-1", "0", "+1"))
sim_slopes(mod, ivar1, gvar1)

modw <- lmer(ivar2 ~ gvar1 * ivar1 + (1 | group_id), data = data, weights = sampling_weight)
summary(modw)

interact_plot(modw, ivar1, gvar1)
# roughly equivalent output
interact_plot(modw, ivar1, gvar1, modx.values = c(85, 100, 115), modx.labels = c("-1", "0", "+1"))
sim_slopes(modw, ivar1, gvar1)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants