Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update: New Holodoodler dev fork for tests with new doodler-engine #38

Open
dbuscombe-usgs opened this issue Jun 17, 2022 · 7 comments
Open

Comments

@dbuscombe-usgs
Copy link

In the last day, I have started 3 new repos in order to integrate Dash- and Holo-Doodlers, as discussed on this thread

  1. https://github.com/Doodleverse/doodler_engine is a set of segmentation codes that will be shared between Dash-Doodler and Holo-Doodler.

  2. A new dev fork of Dash-Doodler https://github.com/dbuscombe-usgs/dash_doodler that already has doodler_engine integrated and is undergoing testing. If successful, it will be merged back into the main Doodleverse repo

  3. doodler_engine is a pip repository that installs inside an existing conda env with no dependencies (i.e. no problems!) - no documentation yet

  4. I started a new fork of Holo-Doodler, and modified the code to use doodler-engine functions. Initial tests indicate this works really well, and is a significant improvement in terms of speed and memory use, owing to changes I have made to the code I will document elsewhere. I added a couple of dependencies to the yml file, and modified components.py to accommodate the new doodler-engine. Codes formerly in holodoodler/doodler/segmentation are now surplus to requirements

I'll keep working/testing this.

@maximlt
Copy link
Collaborator

maximlt commented Jun 17, 2022

Wow that is a really nice start! Thanks for doing that, and for sure the performance improvement you've made to the engine will make the apps nicer to use.

I see what you've done with having the engine not declaring its depdencies yet, that's useful for development. I believe it should declare its own dependencies at some point. Publishing it to conda-forge won't however be possible until there's movement on the pydensecrf repo :/ Could you maybe shime in the issue I opened on their repo and explain what the purpose of Doodler is and why it needs to use pydensecrf? After all Doodler it's a nice project, that could motivate the maintainer.

Do you plan to upstream to dash-doodler the code that is now in the engine and deals with plotly? Of course that's not strictly necessary for HoloDoodler to work well but that seems like a good opportunity for separation of concerns, allowing dash-doodler to more easily make fixes/improvements to that part of the code without having to rely on new releases of the engine.

I haven't yet looked at the code changes you made in the fork but will do (on my phone now). I would just suggest to make the changes in a branch rather than on master, that can only help.

@dbuscombe-usgs
Copy link
Author

Thanks!

  1. I'll look into what it would take for the engine declaring its own dependencies - what advantages do you see?
  2. If I understand what you mean, yes I will continue to strip out any UI functions from the engine includign any plotly components.
  3. I dont have plans to publish doodler-engine to conda-forge, but mostly because I have no experience with that, and dont really know what advantages that would give us (I only know enough conda/pip to get by!)
  4. Yes, I'll chime in on the pydensecrf repo thread!
  5. Ok, I will work on a branch ... again, I dont always use git properly, so I appreciate the pointers - forgive my ignorance, but what advantage is there to work on a dev branch on my holodoodler fork? will it be easier to merge?

@2320sharon
Copy link

what advantage is there to work on a dev branch on my holodoodler fork? will it be easier to merge?

By making a dev branch you can have a branch available to make changes that may break the master branch without having to worry about breaking the working version of the code in the master branch. You can think of the dev branch as the place you make experimental changes that could break your working version of the code in master without having to worry if other people will unknowingly download the experimental version of your code. It lets you share experimental code with other developers before they are ready to be shared in master.

When you want to merge in a change from one of your dev branches into your master branch its a simple merge request that can be tracked in GitHub making it much easier to see when, where, and why certain changes were added to the master branch.

TLDR: Yes having a dev branch will make it easier to merge new changes into the master branch in the long run and let you share your experimental code without worrying if it will break your verified working version of the code in the master branch.

@dbuscombe-usgs
Copy link
Author

Thank you @2320sharon, I'm sold. Will work on a dev branch

@maximlt
Copy link
Collaborator

maximlt commented Jun 17, 2022

I'll look into what it would take for the engine declaring its own dependencies - what advantages do you see?

Oh many:

  • now the engine is on PyPi anyone could be interested in downloading and using it, as of now they will have to find out what dependencies it needs by looking at the code, and that may not be sufficient if there are implicit pins (e.g. numpy>1.15).
  • you may want to add some tests, even very basic, to make sure that the code does what you expect and feel safer to make changes in the future without breaking your expectations. Testing the engine on its own would require the package to declare its dependencies, to install them before executing the tests.
  • if a new dependency is added to the engine, there is no way for dash/holo-doodler to install it automatically, their dependencies would have to be updated accordingly.

If I understand what you mean, yes I will continue to strip out any UI functions from the engine includign any plotly components.

Yes it's exactly what I meant 👍

I dont have plans to publish doodler-engine to conda-forge, but mostly because I have no experience with that, and dont really know what advantages that would give us (I only know enough conda/pip to get by!)

Packages that have binary dependencies, like Numpy and many other scientific packages, have been a pain to install together. I think this got a lot better recently when using pip, yet the conda ecosystem still has an advantage as it was designed to solve exactly this issue. I also find that conda is a great tool to use in a research environment, it's multi-platform and is both an environment manager and package manager. This means that in a research lab you can easily teach your colleagues how to use it (same command whatever their OS) and show them how to manage their projects so that they are more reproducible by creating virtual environments. Of course I'm a little biased ;)

Yes, I'll chime in on the pydensecrf repo thread!

Thanks!

Ok, I will work on a branch ... again, I dont always use git properly, so I appreciate the pointers - forgive my ignorance, but what advantage is there to work on a dev branch on my holodoodler fork? will it be easier to merge?

+1 to what @2320sharon replied! 🙃

@dbuscombe-usgs
Copy link
Author

dbuscombe-usgs commented Jun 24, 2022

@maximlt
Copy link
Collaborator

maximlt commented Jun 24, 2022

Nice to see these updates! Let me know when you think your fork of holo-doodler is ready, or if you want me to take over that work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants