-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
several caprevoke optimizations #2195
base: dev
Are you sure you want to change the base?
Conversation
SYSCTL_COUNTER_U64(_vm_stats_cheri_revoke, OID_AUTO, skip_obj_no_hascap, CTLFLAG_RD, | ||
&cheri_skip_obj_no_hascap, | ||
"Virtual pages skipped in VM objects with no capabilities"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reason to prefer this over the caprevoke stats infrastructure? (Though admittedly I'd not be sad to see the latter get ripped out.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly that I wanted to have a global view of the counter value, and sysctl is more convenient for that. The existing stats infrastructure collects per-process stats and I believe is limited to printing them when the process exits. The right direction might be a hybrid scheme wherein we maintain per-process (really, per-vmspace) and global counters, and use the former to update the latter after each scan.
goto fini; | ||
if ((entry->max_protection & VM_PROT_READ_CAP) == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit message should probably say "have PROT_READ_CAP clear"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but the commit log is not very clear. I meant that the existing check (max_prot & PROT_READ_CAP) == 0
does not exclude, e.g., mappings of executable ELF file segments, so without this change we end up scanning those pages unnecessarily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've given this all a light skim and it generally seems good. The second commit message makes sense now that I understand what's going on, but spelling out more from the big block comment is probably worthwhile.
sys/sys/user.h
Outdated
@@ -576,7 +576,8 @@ struct kinfo_vmentry { | |||
} kve_type_spec; | |||
uint64_t kve_vn_rdev; /* Device id if device. */ | |||
uint64_t kve_reservation; /* Map reservation */ | |||
int _kve_ispare[6]; /* Space for more stuff. */ | |||
int kve_max_protection; /* Max protection bitmask. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we should upstream kve_max_protection and swap them around in CheriBSD before the next release?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If, when scanning, we reach a VA that is unmapped and corresponds to a non-resident page, and the VM object is swap-backed and has no swap blocks assigned, we skip to the next resident page. Extend this optimization to the case where the object has swap blocks assigned: the in-memory tracking structure for a swap block includes a bitmap of all the capability tags, so when skipping we can search for the closer of - the next resident page - the next page backed by a swap block that has at least one capability tag.
This skips over vnode mappings that have PROT_READ_CAP set in max_prot and thus avoids a lot of needless scanning. Add some counters so that we can see the relative effectiveness of different checks in reducing the number of scanned virtual pages.
This avoids some needless work in cases where a short-lived process triggers async revocation.
f4b1b61
to
f0dce4c
Compare
I dropped the procstat changes since those are coming from upstream and aren't critical to the rest of the PR. |
@@ -192,15 +199,14 @@ vm_cheri_revoke_kproc(void *arg __unused) | |||
/* | |||
* Do the actual revocation pass. | |||
*/ | |||
error = vm_cheri_revoke_pass_locked(&arc->cookie); | |||
error = vm_cheri_revoke_pass_locked(arc->vm, &arc->cookie); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks a bit weird to use this after it was seemingly free'd above. It might be clearer to the reader with a comment and a local vm
temporary like so:
vmspace_switch_aio(&arc->vm);
vm = arc->vm;
/*
* Drop arc's reference on the vmspace. The scanner kproc holds a reference via
* its p_vmspace pointer while the scanning the address space. Dropping the
* reference here permits the scan to return early if target process exits early
* leaving the scanner kproc's reference as the only reference.
*/
vmspace_free(arc->vm);
The procstat-related patches should perhaps be upstreamed as well, though
procstat vm
doesn't have a verbose mode in vanilla FreeBSD.