Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calculation of statistical inefficiency in the method equilibrium_detection #96

Closed
wehs7661 opened this issue Jan 24, 2020 · 4 comments
Closed

Comments

@wehs7661
Copy link
Contributor

Hi alchemlyb developers,
When checking the source code of the subsampling methods, I found that the statistical inefficiency was calculated twice in the method equilibrium_detection. Here are the relevant lines of code in the method (from line 189 to line 193):

# calculate statistical inefficiency of series
statinef  = statisticalInefficiency(series)

# calculate statistical inefficiency of series, with equilibrium detection
t, statinef, Neff_max  = detectEquilibration(series.values)

Apparently, the statistical inefficiency calculated in line 190 was immediately overwritten by the one calculated in line 193. Is line 190 actually redundant? Or is there any other purpose of doing that? Thank you!

@wehs7661
Copy link
Contributor Author

Also, I thought it could be worthy to elaborate on the purposes of using different subsampling methods in the documentation. To my understanding, equilibrium detection is the most conservative method, while the method statistical_inefficiency can only be used in the situation that the equilibrium region has already been truncated from df. I'm not sure if there is any other scenario that statistical_inefficiency should be used instead of equilibrium_detection, but I thought it would be nice if the documentation could provide some suggestions on this.

@dotsdl
Copy link
Member

dotsdl commented Jan 24, 2020

Hi @wehs7661, welcome to alchemlyb! I believe you are correct, the first calculation of statinef is completely redundant. I'm planning to show the subsamplers some love shortly, and I noticed this one last night too as it happens (and git blame puts the blame on me 😁).

You are welcome to axe the call to statisticalInefficiency in a PR as a contribution to the library if you'd like to beat me to it! Thanks for spotting it!

@wehs7661
Copy link
Contributor Author

Hi @dotsdl, thank you so much for your reply! I've already sent a pull request.

dotsdl added a commit that referenced this issue Jan 25, 2020
Removed redundant lines for calculating statistical inefficiency

Addresses #96
@dotsdl
Copy link
Member

dotsdl commented Jan 25, 2020

Merged! Thank you @wehs7661!

@dotsdl dotsdl closed this as completed Jan 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants