Review practices #95

gdalle · 2024-11-13T22:34:57Z

Given the conversation in #94, especially this comment and following ones, I believe it would be beneficial to discuss our reviewing expectations for ADTypes. This repo has become a core component of the Julia ecosystem as a whole, and many people depend on it beyond SciML. In particular, it is tightly integrated with DifferentiationInterface, which probably uses ADTypes more thoroughly than any other package.
As the author of DI, I would appreciate being able to review meaningful changes made here, in order to make sure that they do not affect or break my differentiation code, which many downstream users rely upon. To me, "being able to review" means getting at least 24h to make comments and suggest changes (ideally 48h) after a non-trivial PR is opened.
Do you think this is a reasonable request? @ChrisRackauckas @avik-pal @Vaibhavdixit02

ChrisRackauckas · 2024-11-13T23:39:01Z

An open source project is not about specific people. People come and go. It's about systems. It's just like any production coding environment: if you aren't comfortable swapping in different talented people into the spot, then there's a problem with a code and the systems and that should be corrected.

Requiring a specific person to be involved is a very toxic environment for open source projects which we have seen in many instances causes people to simply not feel welcome to a project or leave. Claiming territory is one of the death sentences to a project, since people's lives change and so you have to have a natural flow of people in and out of a project. I have never seen anything good come out of someone demanding to be a reviewer, though I have seen plenty of cases where said demanding review ultimately leaves, some PRs go stale, and no one feels like they have the prerogative to ever do anything on the project again. For this reason, this kind of behavior is not simply not permitted, though I understand this isn't with ill-intent so there's a pass there.

That said, let's identify the root cause of the real issue. What do you actually want?

in order to make sure that they do not affect or break my differentiation code, which many downstream users rely upon

See principle number 1 in https://www.youtube.com/watch?v=fQJrEQIRPZU . "Trust your tests". It sounds like your issue is that you don't trust the tests? It sounds like you could not blindly merge a PR of all greens, knowing that no tests were removed, and still be confident that your systems are okay? If that's the case, then that's a test issue. And I agree we have a test issue here. The solution to a test issue is not for @gdalle to have to download it to his computer and run DI with it and give everyone the okay, that's not a scalable solution, nor will it do well with future hand-offs. The way forward has to be to build tests that we can all trust.

In particular, ADTypes.jl does need more extensive downstream testing. Interface packages are by far the hardest packages in terms of downstream testing, since they are likely to have the least comprehensive tests of the functionality but by far the most impact. They should be covered by downstream tests. We should run the appropriate DI, SciMLSensitivity, and Optimization tests at the minimum as part of the test suite here. Any others that we should add? Our efforts should be placed there building that out, not manually pulling everything local to double check each PR on specific downstream elements.

Ultimately every aspect of the project should be able to grow beyond you or I or anyone else. We all will find other priorities, move away, or pass on without notice. Anyone sufficiently trusted who knows the domain should be able to be given the reins and you and they should be able to be comfortable, otherwise we need to improve the system. If someone has to tread lightly every time they touch something that could break DI simply because they have no idea if they break it, that's a structural issue. Let's fix it so we all feel more comfortable prancing around on the dance floor provided by our tests.

gdalle · 2024-11-14T06:29:27Z

First of all, thank you for taking the time to answer my concerns. I agree with a lot of what you said, and just want to react to a few aspects.

An open source project is not about specific people. People come and go. It's about systems. It's just like any production coding environment: if you aren't comfortable swapping in different talented people into the spot, then there's a problem with a code and the systems and that should be corrected.

In theory, that would be lovely. In practice, many Julia projects I'm aware of have a bus factor of 1. Take away the lead maintainer, and things fall apart. So while I agree that systems can and should be improved, it may be quite idealistic to pretend that, as a community, we don't depend on individual people. As long as know-how is still concentrated among these people (which, again, is not a good thing), their feedback remains valuable.

Anyone with sufficient rights over this repo can merge a PR. This probably includes a lot of SciML admins who have never even touched it. As a result, I would feel more comfortable if the merge had to be approved by someone who has experience with this repo. That is luckily the case for #94, but in general we may need a specification of code owners to settle these debates unambiguously.

I have never seen anything good come out of someone demanding to be a reviewer, though I have seen plenty of cases where said demanding review ultimately leaves, some PRs go stale, and no one feels like they have the prerogative to ever do anything on the project again.

The key difference is that my suggestion came with a time limit. If that delay expires, then the "veto right" (or whatever you want to call it) expires with it, and the PR does not go stale if it is approved by another code owner. I have seen my fair share of stale PRs in the Graphs ecosystem, many because of me, and I don't want to make the same mistakes here.

Beyond the question of any individual having a veto, I want to stress that 24h (or even 48h) should be a very reasonable time-to-review in all of the software we're writing. Expecting a contribution to be merged within minutes or hours is simply not reasonable, and as you have mentioned over in #94, it leads to crazy hours for those who try to provide this level of reactivity.

See principle number 1 in https://www.youtube.com/watch?v=fQJrEQIRPZU. "Trust your tests". It sounds like your issue is that you don't trust the tests?

I'll take a look at that video, I've been meaning to watch it for a while now. I do trust the ADTypes tests, but I don't trust what other packages (including DI) do with ADTypes beyond our public API. This discussion had allowed me to see that, which is why I have opened #96 to explain my main fear in terms of unsanctioned use.

We should run the appropriate DI, SciMLSensitivity, and Optimization tests at the minimum as part of the test suite here. Any others that we should add?

That would be a good idea. I think Turing should also be thrown in the mix?

If someone has to tread lightly every time they touch something that could break DI simply because they have no idea if they break it, that's a structural issue. Let's fix it so we all feel more comfortable prancing around on the dance floor provided by our tests.

Agreed. If DI was only using the public API of ADTypes, I'd feel a lot better skipping review on package changes. Fixing this requires additions to the documentation and tests of ADTypes, so let's get to it.

ChrisRackauckas · 2024-11-14T07:24:06Z

Beyond the question of any individual having a veto, I want to stress that 24h (or even 48h) should be a very reasonable time-to-review in all of the software we're writing. Expecting a contribution to be merged within minutes or hours is simply not reasonable, and as you have mentioned over in #94, it leads to crazy hours for those who try to provide this level of reactivity.

No, that is still such a you you you mindset. You read that it's not a good thing to have to always be available, and interpreted it as "well then I need 24h if I'm going to look at every PR". You added a major extra assumption to this: why do you have to be involved in everything? You can choose to get involved in every PR, but forcing that you have to be involved in every PR is then the extra step that goes beyond what's normal. You're here for what you show up for, you're not for what you don't. That's okay, it's a project with multiple people. You cannot collaborate without some level of trust.

In theory, that would be lovely. In practice, many Julia projects I'm aware of have a bus factor of 1. Take away the lead maintainer, and things fall apart. So while I agree that systems can and should be improved, it may be quite idealistic to pretend that, as a community, we don't depend on individual people. As long as know-how is still concentrated among these people (which, again, is not a good thing), their feedback remains valuable.

It's not just Julia projects but open source in general. But also, note that this is the exact behavior that leads to open source projects having a bus factor of 1. If there is one person that feels like they have to be the arbiter of truth, that everything is only true or false if it goes through their head, then ultimately there is only ever one maintainer. Any extra maintainer is just an illusion of another maintainer since you still have to ask the other person for permission. That's not even BDFL, a BDFL just means someone has the say to triage disputes. This is an even stronger property that nothing is allowed to be done unless you're around. You see how that gives a bus factor of 1? It discourages anybody for learning or contributing since ultimately, no matter how much someone else gets involved, it's always going to be rate-bound by you, so why even care to make contributions? Why not let you get to it?

We've seen this time and time again. Bus factor of 1 is a policy and community choice, it's a symptom of systems being designed to filter through a single person. You could say you care about bus factors but ultimately if you are too scared to have anyone merge without talking to you first, the words are meaningless because that step means that the system has a crucial one person step.

Bus factors and community growth is about building trust. And just like in any relationship, the first step to building trust is you have to give trust. Yes, sometimes when someone is new on the scene you might find that you have an area that is under-tested and you might find that it takes someone a little bit to get their sea legs, but you have to allow someone to walk before they can run.

The key difference is that my suggestion came with a time limit. If that delay expires, then the "veto right" (or whatever you want to call it) expires with it, and the PR does not go stale if it is approved by another code owner.

We have a repo with a bus factor around 5 here, why would we purposefully change that to 1? There are trusted developers who have been working and building on this package long before you ever showed up. It's nice that you are now doing contributions and maintenance, but why does that mean there are no other maintainers? These are trusted folk who have been maintaining packages for years, you know they will ask questions and ping you if they go into territory they haven't seen before. Why can't you trust anybody?

I have seen my fair share of stale PRs in the Graphs ecosystem, many because of me, and I don't want to make the same mistakes here.

(Light)Graphs is a nice example of a package has had people wanting to become active but has failed to achieve a stable bus factor around 1 due to this kind of funneling behavior happening for nearly a decade, just passing around who had the funnel. I talked with Seth around this when I first designed the trait system in the Gitter chats, where the design was done months before it ever landed because of organizational concerns. In particular, the matter of review turnover tends to have folks then put more things into PRs to be reviewed, and this coupled with a "go through the maintainer" mentality leads to large PRs that effectively stall projects. This was pointed out fairly early on sbromberger/LightGraphs.jl#576 but always lingered as a concern. I for one had already moved onto other things by the time the PR ever got close to merging. This plus a few other things (like MATLAB.jl's fiasco) led to the policies in the first place.

Anyone with sufficient rights over this repo can merge a PR. This probably includes a lot of SciML admins who have never even touched it. As a result, I would feel more comfortable if the merge had to be approved by someone who has experience with this repo. That is luckily the case for #94, but in general we may need a specification of code owners to settle these debates unambiguously.

What admin are you calling out that would merge code without knowing the codebase? Most of the admins have been a part of ADTypes since its inception, while the others simply don't touch the package or work on other elements like devops. People would not have admin if they were just going around and merging things they don't know about and breaking things. People who have this have been around the project for on average more than 5 years and while we all make mistakes, they very clearly have a long track record of trust building throughout many repos. Why do you think they would suddenly not care about the health of the organization in this specific instance?

And again, why is the first thought to be putting limitations on maintainer growth, rather than on growing new maintainers? Code owners files are not supposed to be used in this way in an open source project. They are supposed to be used in the exact opposite way, where for a project that is having difficulty finding someone who feels close enough to the code to review, it gives a positive affirmation that they are at least the ones who know it best so it's worth a try. It's not supposed to be used as a tool to instead mean that the code always goes through them!

the "veto right"

Please give trust a try. Allow people to build and grow. Again, you are someone who we have trusted just the same as anyone else. There were years of discussions that led to this repo in the first place, many unstated contributions from people who don't show up as commits in this repo. You are still a relative newcomer to the project and we have given the same trust to you which we ask that you give to others who demonstrate the prerequisite ability.

If we run into major issues, then we immediately yank and discuss. Given how rarely that actually has to happen for all of the releases we have, averaging around 5 releases a day across the different code bases and seeing a yank maybe every few months (and almost always due to missing tests and ill-defined public API behavior), I would not trade this for a bus factor of 1 any day, no hesitation. It won't be perfect, we will have disagreements, but ultimately you or I won't need to be the arbiter of truth because our needs will be sealed in the interfaces and tests of the repository. If we cannot get the codebase to that, then it's not a collaborative project.

That would be a good idea. I think Turing should also be thrown in the mix?

Yeah makes sense.

Agreed. If DI was only using the public API of ADTypes, I'd feel a lot better skipping review on package changes. Fixing this requires additions to the documentation and tests of ADTypes, so let's get to it.

Would the DI tests not catch this?

gdalle · 2024-11-14T07:54:18Z

Would the DI tests not catch this?

The DI tests can only catch this if they run. Suppose ADTypes tags a release on Monday, and the next time DI's CI runs is on Friday (because no one submits any PR in the meantime). If ADTypes accidentally broke an internal that DI depends on, we could have undetected breakages for the whole week. I have opened #97 to keep track of downstream testing. And again, the root issue is over-reliance on internals, which should be fixed by #96.

Bus factors and community growth is about building trust. And just like in any relationship, the first step to building trust is you have to give trust.

You're right. I am scared to relinquish control of things I built or things I depend on. But review requirements would be part of the problem, not part of the solution. Consider my request cancelled.
I certainly hope that people will ping me if they need my opinion, but I don't actually have to hope. I can just trust you, and Vaibhav, and Avik, and others, to do that. After all, you are maintainers/admins for a reason.
I'll see what we can do to increase the bus factor for DI, since @adrhill and I may not always have time. Maybe we could organize an onboarding tutorial, for people who have already contributed to learn the ropes and gradually become more comfortable with the package. That's the actual way forward, and you helped me see that. Thanks.

ChrisRackauckas mentioned this issue Nov 13, 2024

feat: add ConstructionBaseExt to allow Setfield and Functors support #94

Merged

5 tasks

gdalle mentioned this issue Nov 14, 2024

Downstream tests #97

Open

gdalle closed this as completed Nov 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Review practices #95

Review practices #95

gdalle commented Nov 13, 2024 •

edited

Loading

ChrisRackauckas commented Nov 13, 2024

gdalle commented Nov 14, 2024 •

edited

Loading

ChrisRackauckas commented Nov 14, 2024

gdalle commented Nov 14, 2024 •

edited

Loading

Review practices #95

Review practices #95

Comments

gdalle commented Nov 13, 2024 • edited Loading

ChrisRackauckas commented Nov 13, 2024

gdalle commented Nov 14, 2024 • edited Loading

ChrisRackauckas commented Nov 14, 2024

gdalle commented Nov 14, 2024 • edited Loading

gdalle commented Nov 13, 2024 •

edited

Loading

gdalle commented Nov 14, 2024 •

edited

Loading

gdalle commented Nov 14, 2024 •

edited

Loading