Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

replacing hard-wired benchmark suites with benchmark factory #4

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

joergkurtwegner
Copy link

Dear all, could we please replace the hard-wired benchmark suites with a "benchmark factory" (software engineering) pattern. Here I chose a "singleton pattern" as "factory pattern".
Any benchmark version self-registers itself with a name like "v1, "v2, "trivial", but more importantly, it enables to register other benchmarks we might be interested in without changing the guacamol package code.

Penny for your thoughts.

Also, any thought on how we could load goal-oriented benchmarks from file rather? I realize its non-trivial, but maybe at least some we could load from files, as the hard-wired coding is too inflexible for accumulating bigger benchmarks we might be interested in.

@avaucher
Copy link
Contributor

avaucher commented May 1, 2019

Dear Jörg,

Thank you for the pull request and suggestions!

Before being able to consider this PR, our legal department tells us that a CLA is going to be necessary and they are putting one in place at the moment.

Can you let us know if you are contributing on behalf of yourself or your company?
This may make a difference for the CLA.

And to answer to your question: a basic mechanism for loading benchmark definitions from files should be quite easy to add once there is a factory mechanism as you suggest in this PR. To do so, a user could register a custom factory dealing with files from outside guacamol, if I see it correctly.

@joergkurtwegner
Copy link
Author

Please send me a copy of the CLA for review /.Joerg

@@ -55,6 +55,18 @@ def score_mol(self, mol: Chem.Mol) -> float:
fp = get_fingerprint(mol, self.fp_type)
return TanimotoSimilarity(fp, self.ref_fp)

def __hash__(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where hashes of the scoring functions / score modifiers are used.

What use case do you have in mind?



# factory product
class AbstractFactoryProductGuacamol(BaseEstimator):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why inherit from BaseEstimator?

return GoalDirectedBenchmarkSuite().getBenchmark(version_name).getList()

# factory (as singleton software engineering pattern)
class GoalDirectedBenchmarkSuite(object):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully support the addition of such classes, but I don't see the need for a complex factory / singleton registration in such a simple case.

Why not something like:

registered_goal_directed_benchmarks = {}

def register_goal_directed_benchmark(name, benchmark_suite_cls):
    # [...] some checks, preprocessing, etc.
    registered_goal_directed_benchmarks[name] = benchmark_suite_cls

class MyGoalDirectedBenchmarkSuite(GoalDirectedBenchmarkSuite):
    [...]

register_goal_directed_benchmark('my_suite_v1', MyGoalDirectedBenchmarkSuite)

I am happy to update this myself in the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants