Storage of (large) test data #277

Open
athowes opened this issue Sep 3, 2024 · 9 comments
Labels
low For a future release

Comments

@athowes
Collaborator

athowes commented Sep 3, 2024

Currently, some data sets used in tests are created with inst/generate_examples.R and then stored in inst/extdata. At the moment these are fit.rds and fit_gamma.rds. These files are above the 50.00 MB that GitHub recommends as a maximum file size.

One option is to thin down these fits to make them smaller.

Another option is to accept that this is the wrong approach and rethink where we store test data, or how we approach these tests more generally.
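To illustrate the thinning option, here is a minimal sketch in base R. It assumes the stored object can be treated as a matrix of posterior draws, which is a simplification: the real fit.rds and fit_gamma.rds objects are not shown in this thread, and `draws`, `thin_by`, and the file paths are all hypothetical.

```r
# Hypothetical stand-in for posterior draws: 4000 iterations x 50 parameters.
set.seed(1)
draws <- matrix(rnorm(4000 * 50), nrow = 4000, ncol = 50)

# Thin by keeping every 10th iteration.
thin_by <- 10
thinned <- draws[seq(1, nrow(draws), by = thin_by), , drop = FALSE]

# Save both versions to temporary files and compare their sizes on disk.
# compress = "xz" is a base R option that usually gives smaller files
# at the cost of slower writes.
full_path <- tempfile(fileext = ".rds")
thin_path <- tempfile(fileext = ".rds")
saveRDS(draws, full_path)
saveRDS(thinned, thin_path, compress = "xz")

size_full <- file.size(full_path)
size_thin <- file.size(thin_path)
```

How much a real fit shrinks would depend on what else the fitted object stores besides draws.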

@athowes athowes added the low For a future release label Sep 3, 2024
@seabbs
Contributor

seabbs commented Sep 3, 2024

> One option is to thin down these fits to make them smaller.

In the first instance, thin down. And yes, it's definitely a "wrong" approach, but I think it works for now. We should have an issue to explore alternatives. IMO I can't see a reason these would need to be this big.

@athowes
Collaborator Author

athowes commented Sep 3, 2024

I'm happy to close this issue once the fits are thinned, then open another issue for alternatives.

@seabbs
Contributor

seabbs commented Sep 3, 2024

I just looked and it seems like we could largely replace this approach by running a model fit in setup.R?

@seabbs
Contributor

seabbs commented Sep 4, 2024

Note that setup.R always runs before the test suite, so anything created in it is available to every test.
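As a rough sketch of what that could look like, assuming the file lives at tests/testthat/setup.R: the example below simulates a small data set and fits a cheap model once. `glm()` is only a placeholder for the package's real fitting function, which this thread does not show, and `sim_delays` and `test_fit` are hypothetical names.

```r
# tests/testthat/setup.R (sketch)

# Simulate a small data set so the fit stays fast.
set.seed(123)
sim_delays <- data.frame(delay = rgamma(200, shape = 2, rate = 0.5))

# Placeholder fit: an intercept-only Gamma GLM from base R stands in
# for the real model fit used by the package.
test_fit <- glm(delay ~ 1, family = Gamma(link = "log"), data = sim_delays)
```

Tests could then refer to `test_fit` directly instead of reading a stored .rds file.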

@athowes
Collaborator Author

athowes commented Sep 4, 2024

The downside is that when I work locally I run setup.R to generate the objects needed, so ideally there wouldn't be long-running things in there (like model fitting).

@seabbs
Contributor

seabbs commented Sep 4, 2024

But surely any fit we need for a test is going to take <30 seconds? There really doesn't seem to be a need for more.

@athowes
Collaborator Author

athowes commented Sep 4, 2024

The Gamma one currently takes longer than 30 seconds.

And if we intend to do "parameter recovery" integration tests, then they can't be bad fits.

@seabbs
Contributor

seabbs commented Sep 4, 2024

That doesn't seem ideal. It feels like the model should be workable within 30 seconds per core, so I find this surprising.

@athowes
Collaborator Author

athowes commented Sep 4, 2024

I might not have been setting the cores argument. I can post the actual runtimes here.
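For measuring those runtimes, a minimal sketch using base R's `system.time()` is below. `fit_model()` and its `cores` argument are hypothetical placeholders for the real fitting call, and the 30 second budget is the figure discussed above.

```r
# Hypothetical placeholder for the real model fit; swap in the actual call.
fit_model <- function(cores = 1) {
  Sys.sleep(0.1)
  invisible(NULL)
}

# Number of cores reported for this machine (may be NA on some platforms).
available_cores <- parallel::detectCores()

# Wall-clock time of a single fit, in seconds.
timing <- system.time(fit_model(cores = 2))
elapsed_seconds <- timing[["elapsed"]]

# Compare against the 30 second budget.
under_budget <- elapsed_seconds < 30
```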
