
How to get the p_value of the whole model #77

Open
SHEN-Cheng opened this issue Nov 4, 2020 · 6 comments
@SHEN-Cheng commented Nov 4, 2020

Yeah, through `my_pwlf.p_values()` I can calculate the p-value for each beta parameter: first the beta parameters (intercept + slopes), then the breakpoints.
But how do I get the whole model's p-value?

@cjekel (Owner) commented Nov 24, 2020

I just created an example that adds a test for model significance and gets a p-value for the entire model. https://github.com/cjekel/piecewise_linear_fit_py/blob/master/examples/test_for_model_significance.py

As defined in Section 2.4.1 of Myers RH, Montgomery DC, Anderson-Cook CM. Response Surface Methodology. Hoboken, New Jersey: John Wiley & Sons, Inc.; 2009.

In the linear model case we set up a hypothesis test as:

H0: β1 = β2 = ⋯ = βk = 0
H1: βj ≠ 0 for at least one j

In the non-linear model case, we'll include the breakpoints as beta parameters, since the breakpoints are unknown model parameters.

You reject H0 when the p-value is less than some alpha.

Please leave this issue open, as the object should include this method!
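The test above can be sketched in plain numpy/scipy for an ordinary linear fit. This is a sketch only, not the linked pwlf example: for a piecewise fit, k would additionally count the breakpoints among the model parameters, and the data here are made up for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic data: a clearly significant linear trend plus noise.
rng = np.random.default_rng(0)
n = 50
x = np.linspace(0.0, 10.0, n)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=n)

# Least-squares fit of y = b0 + b1*x.
A = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

k = 1                                   # number of non-intercept parameters
ssr = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
sse = np.sum((y - y_hat) ** 2)          # error (residual) sum of squares

# F-statistic and p-value for the test of model significance.
f0 = (ssr / k) / (sse / (n - k - 1))
p_value = stats.f.sf(f0, k, n - k - 1)  # upper tail of the F distribution
print(f"F0 = {f0:.2f}, p-value = {p_value:.3g}")
```

With a strong trend like this, F0 is large and the p-value is essentially zero, so H0 is rejected at any reasonable alpha.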

@SHEN-Cheng (Author)

> I just created an example that adds a test for model significance, and gets a p-value for the entire model. […]

Great! You solved my problem.

@kM-Stone commented Sep 7, 2021

> I just created an example that adds a test for model significance, and gets a p-value for the entire model. […]

Hi~ Thanks for the great work! I ran your code above, but I am confused by the result:

  • Your last comment in the code says:

    > in both these cases, the p_value is very large, so we can't reject H0

    Indeed, the results show large p-values for both cases (0.85 and 0.95), but `my_pwlf.p_values()` shows `array([1.17134878e-06, 7.30540082e-51, 1.00331376e-21])`. So why is each beta significant but the whole model is not?

  • Line 77: `f0 = (ssr / k) / (sse / (n - k - 1))`. The form of the F-statistic is consistent with your reference, i.e.
    F0 = (SSR / k) / (SSE / (n − k − 1)),
    but the `ssr` in the code seems to be the sum of squares of the error (Section 2.1, formula 10), not the sum of squares of the regression. I swapped `ssr` and `sse` in the code and got quite small p-values.

@cjekel (Owner) commented Sep 7, 2021

> So why is each beta significant but the whole model is not?

Does the following change impact these results?

> Line 77: `f0 = (ssr / k) / (sse / (n - k - 1))` […] I swapped `ssr` and `sse` in the code, and got quite small p-values.

Yup, nice catch! SSR in my code is actually SSE in that book, and vice versa. Sorry about this.


(Look how this wiki article uses ESS and RSS; the E and R in these are swapped relative to the above book: https://en.wikipedia.org/wiki/Explained_sum_of_squares )
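As a quick numeric sanity check of that decomposition (a minimal sketch with a plain least-squares fit; the variable names are mine, chosen to be unambiguous, and are not taken from pwlf):

```python
import numpy as np

# Total SS = "explained" SS + "residual" SS for an OLS fit with an intercept,
# whatever letters a given book assigns to each term.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 30)
y = 3.0 * x - 2.0 + rng.normal(scale=0.5, size=30)

A = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

ss_total = np.sum((y - y.mean()) ** 2)
ss_explained = np.sum((y_hat - y.mean()) ** 2)  # "SSR" in Myers, "ESS" on Wikipedia
ss_residual = np.sum((y - y_hat) ** 2)          # "SSE" in Myers, "RSS" on Wikipedia

print(np.isclose(ss_total, ss_explained + ss_residual))
```

Using explicit names like `ss_explained` / `ss_residual` sidesteps the SSR/SSE ambiguity entirely.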

@cjekel (Owner) commented Sep 11, 2021

> So why is each beta significant but the whole model is not?

> Does the following change impact these results?

The answer to this is yes. Fixed in 101711b. Many thanks to @kM-Stone for catching this mistake.

@cjekel (Owner) commented Sep 11, 2021

To clarify, all uses of `ssr` in PiecewiseLinFit are okay and don't need changing, including `PiecewiseLinFit.r_squared`.
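For illustration, here is the conventional coefficient-of-determination formula that is unaffected by which SSR/SSE naming convention a book uses. The comment tying the residual sum of squares to pwlf's internal `ssr` name is an assumption based on the discussion above, not taken from the pwlf source:

```python
import numpy as np

# R^2 = 1 - (residual SS) / (total SS), regardless of naming convention.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 40)
y = 4.0 * x + rng.normal(scale=0.2, size=40)

A = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
residuals = y - A @ beta

ss_res = np.sum(residuals ** 2)  # assumed to match what pwlf calls "ssr" internally
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
print(f"R^2 = {r_squared:.3f}")
```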
