docs: improve freezing docs #2385
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #2385 +/- ##
==========================================
- Coverage 83.55% 73.94% -9.62%
==========================================
Files 31 32 +1
Lines 1836 1911 +75
==========================================
- Hits 1534 1413 -121
- Misses 302 498 +196
|
Link to version with the markdown rendered: I wonder if this ought to be filed under |
It also overlaps with https://fluxml.ai/Flux.jl/stable/training/training/#Freezing-and-Schedules. I do think it's a little disjointed to have docs for both layer definition tools and dedicated freezing tools on a single "freezing" page. Ideally |
The reason I put this one all the way at the bottom is because it requires reading about functors first and seeing other ways of doing things. It's more like a meta overview of how to do things in each separate case. I tried to link where possible, but I'm open to suggestions. Also, please double-check that the docs are factually correct. My only reference is the Slack discussion. |
I specifically replaced that section, see the file diff |
Yes, I'm aware :) To explain why I feel this approach is disjointed by way of analogy, this would be like putting I think it could make sense to have a page talking about the last two, but having all three in one place doesn't make a lot of sense to me. Other than that, I think a "freezing params" page might not have enough content to justify its existence as a standalone page. If the scope were broadened slightly to working with parameters and optimization rules, that could work. Maybe leave some space to add docs on the various ways to do regularization and loss penalties. |
The last three pages, "logistic regression", "linear regression" and "custom layers", each feature a somewhat disjointed set of functionality and act like "putting it all together" sections, or like extensions of other well-defined use cases. To me it is fine for these pages to be a mixed bag, because they are disconnected from the rest of the docs and marked as "tutorials" for more advanced use cases and for clarifications that are otherwise hard to write in linear, independent chunks. I still insist on having an independent category/page for freezing and everything that might link to it, including functors and optimisers. There are examples in the wild like: I understand your sentiment, it is far from perfect right now. But I still feel that users who read through the main docs need a page or a section that explains why and how each method is different. I also agree that expanding the scope to include regularization and loss penalties is worth trying. Could you give me more direction here? I could try writing this up. |
It doesn't seem too crazy to make a tutorial describing concepts that Flux thinks of as orthogonal, all together... this shouldn't be the main intro to them but perhaps it's useful for others? I quite like the examples here of why you might want each of them. One thing this could usefully expand on is post-#1932 changes... e.g. that On Edit: one more thing I found here: #2216 (comment) is that if you want to freeze a lot of the model, you may not need to compute the whole gradient. That's a bit obscure, but might be nice to explain on a tutorial-like page like this.
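For readers following along, here is a minimal sketch of the two approaches mentioned above, not taken from the PR itself. The two-layer model, the random data and the names `backbone` and `head` are assumptions made up for illustration; the calls used (`Flux.setup`, `Flux.freeze!`, `Flux.gradient`, `Flux.update!`) are the explicit-style API in recent Flux releases.

```julia
using Flux

# Hypothetical two-layer model: a "backbone" to freeze and a "head" to train.
model = Chain(Dense(4 => 8, relu), Dense(8 => 2))
x, y = rand(Float32, 4, 16), rand(Float32, 2, 16)

# Option 1: compute the full gradient, but freeze the backbone in the
# optimiser state so that update! leaves its parameters untouched.
opt_state = Flux.setup(Adam(0.01), model)
Flux.freeze!(opt_state.layers[1])

backbone_before = copy(model[1].weight)
head_before = copy(model[2].weight)

grads = Flux.gradient(m -> Flux.mse(m(x), y), model)
Flux.update!(opt_state, model, grads[1])

# Option 2: run the backbone outside the gradient call and differentiate
# only the head, so no gradient is computed for the frozen part at all.
backbone, head = model[1], model[2]
features = backbone(x)
head_state = Flux.setup(Adam(0.01), head)
head_grads = Flux.gradient(h -> Flux.mse(h(features), y), head)
Flux.update!(head_state, head, head_grads[1])
```

With option 1 the gradient for the frozen layer is still computed and then ignored; option 2 avoids that work, which is presumably the saving being hinted at when most of the model is frozen.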
|
Yes, referencing a broad range of functionality works for a tutorial because there's a singular and clear end. This is less true for a "how-to guide" like this proposed freezing page and the two examples mentioned later. I agree the custom layers page is a mess and needs to be broken up. But separating any mention of
I would not say they're equivalent. The Lux page really talks about one main piece of functionality, The Flax page is more of a tutorial. We could use a comparable tutorial in the Flux docs, but that's a separate discussion from a how-to guide on parameter freezing.
I think the core contention here is that I don't see
I would just broaden the title and preamble for now and worry about adding this extra content in a follow-up. |
I tried to incorporate your suggestions in the last commit. Please take a look at the last subsection, and do let me know if it is factually correct. |
any feedback on this @ToucheSir @mcabbott? |
Do you mind rebasing so we can see what this looks like on top of #2390? |
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 3 to 4. - [Release notes](https://github.com/codecov/codecov-action/releases) - [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md) - [Commits](codecov/codecov-action@v3...v4) --- updated-dependencies: - dependency-name: codecov/codecov-action dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [dorny/paths-filter](https://github.com/dorny/paths-filter) from 3.0.1 to 3.0.2. - [Release notes](https://github.com/dorny/paths-filter/releases) - [Changelog](https://github.com/dorny/paths-filter/blob/master/CHANGELOG.md) - [Commits](dorny/paths-filter@v3.0.1...v3.0.2) --- updated-dependencies: - dependency-name: dorny/paths-filter dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <[email protected]>
* doc changes re at-functor and at-layer * fix a doctest * more fixes * public at-layer * add a sentence comparing to freeze/thaw * Apply suggestions from code review Co-authored-by: Kyle Daruwalla <[email protected]> * two fixes re SignDecay --------- Co-authored-by: Kyle Daruwalla <[email protected]>
Unfortunately I don't think the rebase was successful. Usually you'd have to force push the remote branch after rebasing, and only the changes from your four original commits on this PR should be visible on GitHub. |
@ToucheSir as per slack discussion
(Edit: closes #2216)