Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a facility for unlocked director iteration #4142

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

nigoroll
Copy link
Member

@nigoroll nigoroll commented Jul 26, 2024

Pondering #4140 made me realize that our central per-vcl director list constitutes a classic case of iterating a mutable list. That besides the already known fact that running potentially expensive iterator functions while holding the vcl_mtx is a very bad idea.

So this patch is a suggestion on how to get out of this situation. It does not go all the way (it still leaves unsolved a similar problem of iterating over all backends of all vcls), but shows an attempt on how to tackle the "iterate over one VCL's backends".

We add a fittingly named 'vdire' facility which manages the director list and, in particular, director retirement. The main change is that, while iterators are active on the director list, vcl_mtx is not held and any request of a director to retire only ends up in a resignation, which manifests by this director being put on a second list.

When all iterators have completed, the resignation list is worked and the actual retirement carried out.

NOTE: I am particularly not happy about the vdire_(start|end)_iter() interface being exposed. What I would like to see is a FOREACH macro which does both the start and end. Having to use an iterator with a callback is clumsy, and having to directly call start/end with a VTAILQ_FOREACH feels like crossing interfaces. See #4142 (comment)

@nigoroll nigoroll force-pushed the vdire branch 2 times, most recently from febd306 to f79bd4c Compare July 31, 2024 12:51
@nigoroll
Copy link
Member Author

After rebasing on the newly understood 4e2c50d, I have successfully run my stress test on this patch without any issues.

@nigoroll nigoroll added the a=RunByUPLEX This PR is being run by UPLEX in production label Jul 31, 2024
@nigoroll
Copy link
Member Author

nigoroll commented Aug 1, 2024

What I would like to see is a FOREACH macro which does both the start and end

Here's a macro which does that:

/*
 * this macro might look a bit daunting, but it is really just two things:
 *
 * - the outer for loop is only to declare the local "next" variable
 *   and execute the inner for loop once
 *
 * - the inner for loop is VTAILQ_FOREACH_SAFE with calls to
 *   vdire_{start,end}_iter() added at the right places:
 *   start is called at initialization and end when var becomes NULL
 *   and the loop ends
 *
 * problem: none of break, goto or return must be used from the loop body
*/

#define _VDIRE_FOREACH(var, vdire, next)				\
	for (struct vcldir *next = (void *)&next;			\
	     next == (void *)&next;					\
	     next = NULL)						\
		for (vdire_start_iter(vdire),				\
		     (var) = VTAILQ_FIRST(&(vdire)->directors);		\
		     ((var) || (vdire_end_iter(vdire), 0)) &&		\
		       ((next) = VTAILQ_NEXT(var, directors_list), 1);	\
		     (var) = (next))

#define VDIRE_FOREACH(var, vdire)				\
	_VDIRE_FOREACH(var, vdire, VUNIQ_NAME(_vdirenext))

As stated in the comment, none of break, goto or return must be used from within the loop body. So I think I prefer the explicit vdire_{start,end}_iter()

@nigoroll nigoroll marked this pull request as ready for review September 30, 2024 17:20
@nigoroll
Copy link
Member Author

Putting this out of draft status despite not having worked on it any more, because it has received no feedback and I think it's actually important - now having seen one production case where this is at least a potential candidate.

@nigoroll nigoroll changed the title WIP/DRAFT Add a facility for unlocked director iteration Add a facility for unlocked director iteration Oct 10, 2024
@nigoroll
Copy link
Member Author

bugwash: good idea by phk: New iterators encountering retired backends should wait until they are cleared, otherwise a constant stream of iterators could prevent retirement. I do not think this is a realistic scenario at this point, but we might as well do this right from the start.

another feedback question is if we can simplify this patch with an r/w lock. I am sure that I did look at this for quite some time and came to the conclusion that it does not do much to help us, but I will revisit this question.

@nigoroll
Copy link
Member Author

New iterators encountering retired backends should wait until they are cleared

done: 41c42a0

another feedback question is if we can simplify this patch with an r/w lock.

I do not see the simplification. With an rwlock, vdire_resign() would try to wrlock and, if this fails, would need to start a thread which would then wrlock and do the remove.

@nigoroll nigoroll force-pushed the vdire branch 2 times, most recently from 6eb58c3 to 804fbee Compare November 9, 2024 19:51
@bsdphk
Copy link
Contributor

bsdphk commented Jan 15, 2025

I'm OK with this, as long as this iterator API is only for use by the CLI thread, and this should be documented in the code with an ASSERT_CLI() because allowing multiple thread to independently and concurrently iterate at the same time will require further analysis. (For instance, should we allow two different iterators to have their func called on the same director at the same time ?)

We should (probably) hand the resigned_list to a worker thread, rather than bog down the CLI thread with potentially many cleanup calls.

I think the iterator should skip directors which have already been retired, no point in the CLI processing them if they are already (potentially long since) gone.

Apart from that: OK with me.

Pondering varnishcache#4140 made me realize that our central per-vcl director list
constitutes a classic case of iterating a mutable list. That besides the already
known fact that running potentially expensive iterator functions while holding
the vcl_mtx is a very bad idea.

So this patch is a suggestion on how to get out of this situation. It does not
go all the way (it still leaves unsolved a similar problem of iterating over all
backends of _all vcls_), but shows an attempt on how to tackle the "iterate over
one VCL's backends".

We add a fittingly named 'vdire' facility which manages the director list and,
in particular, director retirement. The main change is that, while iterators are
active on the director list, vcl_mtx is _not_ held and any request of a director
to retire only ends up in a resignation, which manifests by this director being
put on a second list.

When all iterators have completed, the resignation list is worked and the actual
retirement carried out.
This is to prevent a potential pileup of resigned directors with a constant
stream of iterators coming on
@nigoroll
Copy link
Member Author

The next force-push has these changes (besides the rebase and typo fix):

skip resigned backends

we can identify these by a zero refcount. It is is already illegal to assign a director with a zero backend.

diff --git a/bin/varnishd/cache/cache_vcl.c b/bin/varnishd/cache/cache_vcl.c
index 12ed136a7..918537e50 100644
--- a/bin/varnishd/cache/cache_vcl.c
+++ b/bin/varnishd/cache/cache_vcl.c
@@ -469,6 +469,8 @@ vdire_iter(struct cli *cli, const char *pat, const struct vcl *vcl,
 
        vdire_start_iter(vdire);
        VTAILQ_FOREACH(vdir, &vdire->directors, directors_list) {
+               if (vdir->refcnt == 0)
+                       continue;
                if (pat != NULL && fnmatch(pat, vdir->cli_name, 0))
                        continue;
                found++;

this is done only for vdire_iter() and not vcl_BackendEvent(), because vcldir_retire() relies on prior temperature events being delivered.

ASSERT_CLI()

... is added. I expect this to be revisited at some point.

Deferred call to vcldir_retire():

I have actually written the code and reflected on it: I am not going to commit this change because it will open another can of racing worms. In particular, when we cool a VCL and unload a vmod, we need to be sure that all objects using code from that vmod are gone. Delaying the actual destruction adds further complications, and the goal here is to not introduce additional complexity, but stabilize the code by handling the complications already introduced with reference counting.

On the contrary, pondering this question, I have added this change out of prudence to ensure that before the final iteration over the directors list, we have finished all retirements:

diff --git a/bin/varnishd/cache/cache_vcl.c b/bin/varnishd/cache/cache_vcl.c
index aca2b5cab..2140bc664 100644
--- a/bin/varnishd/cache/cache_vcl.c
+++ b/bin/varnishd/cache/cache_vcl.c
@@ -585,6 +585,14 @@ vcl_KillBackends(const struct vcl *vcl)
        vdire = vcl->vdire;
        CHECK_OBJ_NOTNULL(vdire, VDIRE_MAGIC);
 
+       /*
+        * ensure all retirement has finished (vdire_start_iter waits for it)
+        */
+       vdire_start_iter(vdire);
+       vdire_end_iter(vdire);
+       AZ(vdire->iterating);
+       assert(VTAILQ_EMPTY(&vdire->resigning));
+
        /*
         * Unlocked and sidelining vdire because no further directors can be added, and the
         * remaining ones need to be able to remove themselves.

@nigoroll nigoroll requested a review from walid-git January 15, 2025 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
a=feedback please a=need bugwash a=RunByUPLEX This PR is being run by UPLEX in production
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants