[ENH] Neat and automated transfer learning with OPTIMADE API for auto-adjusted problem-specific ML model generation on the fly #16
…n accident and gets an error
…; meant mostly for tuning to smaller datasets
…; meant mostly for tuning to smaller datasets
…ation in`OPTIMADEAdjuster`
… a parameter with documentation in `OPTIMADEAdjuster`
…e in the top `__init__`
… now collecting names, target data, and featurizing the obtained structures
…adjustment, to display if a datapoint has been shown to the model at adjustment. Displayed when plotting unadjusted and adjusted models.
…sed right now but important for future methods including crystALL
Notes:
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #16      +/-   ##
==========================================
- Coverage   94.84%   93.58%   -1.27%
==========================================
  Files          17       19       +2
  Lines        1999     2432     +433
==========================================
+ Hits         1896     2276     +380
- Misses        103      156      +53
☔ View full report in Codecov by Sentry.
…Labels` if no-validation adjustments were ran
…ts all functions except for the optional, non-default ClearML connectivity
Hi @jwsiegel2510 and @rdamaral, Everything is complete and the tests are passing. It's ready to be reviewed!
Hi @jwsiegel2510 and @rdamaral, I was hoping to pull it later today to align with the manuscript posting on arXiv.
Hi Adam, I've reviewed the documentation, tested the main functions, and they are working well. I also did not encounter any issues when installing this branch version in a new conda environment (Python 3.10). Just a couple of comments:
…mproved description of the parameter to make it clear higher number is desired if GPU is present
…ror message is more clear to the user
…y the base URL to use and point to custom provider or a specific sub-database of a provider
Hi @rdamaral ! Thanks for the insightful comments :)
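I also added new functionality that allows you to override the endpoint, as in this example:
import pysipfenn

# c is assumed to be a previously created pysipfenn Calculator instance
ma = pysipfenn.OPTIMADEAdjuster(
    c,
    model="SIPFENN_Krajewski2022_NN30",
    endpointOverride=["https://alexandria.icams.rub.de/pbesol"],
    targetPath=['attributes', '_alexandria_formation_energy_per_atom']
)
# Fetch structures matching the OPTIMADE filter and featurize them
ma.fetchAndFeturize(
    'elements HAS "Hf" AND elements HAS "Mo" AND elements HAS "Zr"',
    parallelWorkers=2
)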
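To make the two string-valued arguments in the call above more concrete, here is a minimal, library-independent sketch. It assumes that `targetPath` is read as a sequence of keys walked through each returned OPTIMADE entry, and that the filter is a plain conjunction of `elements HAS` clauses. The helper names `build_elements_filter` and `resolve_target_path`, as well as the example entry and its energy value, are hypothetical and are not part of pySIPFENN.

```python
from typing import Any, Sequence


def build_elements_filter(elements: Sequence[str]) -> str:
    """Join one `elements HAS "X"` clause per element with AND."""
    return " AND ".join(f'elements HAS "{el}"' for el in elements)


def resolve_target_path(entry: dict, target_path: Sequence[str]) -> Any:
    """Walk a nested dict key by key and return the value at the end."""
    value: Any = entry
    for key in target_path:
        value = value[key]
    return value


# Hypothetical entry shaped like one item of an OPTIMADE "data" list;
# the id and the energy value are made up for illustration only.
example_entry = {
    "id": "example-0",
    "type": "structures",
    "attributes": {
        "elements": ["Hf", "Mo", "Zr"],
        "_alexandria_formation_energy_per_atom": -0.123,
    },
}

query = build_elements_filter(["Hf", "Mo", "Zr"])
target = resolve_target_path(
    example_entry, ["attributes", "_alexandria_formation_energy_per_atom"]
)
print(query)
print(target)
```

Under these assumptions, the generated filter matches the query string used in the comment above, and the path lookup returns the provider-specific target value stored under `attributes`.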
Nice. The |
Looks good. It seems reasonable to me to assume that each task will take the same amount of time, since we are simply training the same network with different hyperparameters.
Looks good to me
Looks good to me. I guess the previous two commits were meant to be one.
As the title says, this new addition to the core pySIPFENN functionalities connects it to the OPTIMADE API to enable rapid adjustment of the models to any specific dataset described by an OPTIMADE query (or multiple queries). Most of the functions are neatly hidden behind a high-level API, and the default values should work well for datasets of between 100 and 10,000 datapoints.
You can now simply:
or, to perform a hyperparameter search, replace the `ma.adjust()` with:
All model usage works as before with the `Calculator` class. Modifying the model or exporting it for later use is done through specific classes in the `modelExporters` submodule.
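For readers unfamiliar with the idea behind the hyperparameter search mentioned above, the following is a generic sketch of a matrix (grid) search: every combination of candidate values is evaluated and the one with the lowest validation error is kept. This is not the pySIPFENN implementation; `matrix_search`, `toy_validation_error`, and the candidate values are hypothetical stand-ins chosen only to keep the example self-contained.

```python
import itertools
from typing import Callable, Dict, Optional, Sequence, Tuple


def matrix_search(
    grid: Dict[str, Sequence[float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> Tuple[Optional[Dict[str, float]], float]:
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    best_params: Optional[Dict[str, float]] = None
    best_score = float("inf")
    # itertools.product yields the Cartesian product of all candidate lists
    for combo in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(params: Dict[str, float]) -> float:
    """Made-up error surface with its minimum at learning_rate=1e-3, epochs=20."""
    return (
        abs(params["learning_rate"] - 1e-3) * 1000
        + abs(params["epochs"] - 20) / 10
    )


# Hypothetical candidate values, for illustration only
grid = {"learning_rate": [1e-2, 1e-3, 1e-4], "epochs": [10, 20, 40]}
best_params, best_score = matrix_search(grid, toy_validation_error)
print(best_params, best_score)
```

In a real adjustment run, the toy error function would be replaced by training the network with the given settings and measuring the error on held-out data, which is why each grid point can be expected to cost roughly the same amount of time.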