[Perf] Use heuristics to avoid allocations in Sanitizer::str_till_eol #2563

vicsn · 2024-10-22T10:32:59Z

Motivation

Replaces #2517

Original PR message:

(transferred from ProvableHQ#6, and asking @d0cd for a review as suggested there)

The Sanitizer is used very prominently in our parsing functions, and it is also a source of many allocations, most of which are temporary and avoidable.

The potential perf improvements are quite large, and I've measured them both with a 15-minute run of a --dev node and using hyperfine on a small binary that parsed all the valid .aleo programs currently present in the snarkVM codebase.

dev node:

all allocs are down ~15%, of which almost all are temporary
in Program::from_str specifically, allocs are reduced by ~64%, of which temp allocs ~88%

parsing all .aleo programs using Program::from_str:

allocs are reduced by ~41%
growth reallocs are down ~70%
runtime is reduced by ~31%

Test Plan

CI run link

Signed-off-by: ljedrz <[email protected]>

d0cd

Overall, I'm in favor of this PR. It would be helpful to add test cases for this specific parser, especially for when the heuristic is and isn't taken.

d0cd · 2024-10-24T00:48:31Z

console/network/environment/src/helpers/sanitizer.rs

+                if !contains_unsafe_chars {
+                    Ok((after, before))
+                } else {
+                    recognize(Self::till(value((), Sanitizer::parse_safe_char), Self::eol))(before)


Does this need to use a till operator until Self::eol, since this is known to be a single-line string?

good point; I'm not sure, but since I'm not super savvy with nom, I preferred to leave it intact

d0cd · 2024-10-24T00:48:59Z

console/network/environment/src/helpers/sanitizer.rs

-            },
-        )(string)
+        // A heuristic approach is applied here in order to avoid
+        // costly parsing operations in the most common scenarios.


Nit. Can you explain the heuristic in the comment?

added to my TODO (as with the Program-related comment, I'd suggest adding it separately to avoid merge-related issues)

d0cd · 2024-10-24T00:51:12Z

console/network/environment/src/helpers/sanitizer.rs

+                let contains_unsafe_chars = !before.chars().all(is_char_supported);
+
+                if !contains_unsafe_chars {
+                    Ok((after, before))


Is the goal to avoid allocations in this method?

IIRC the allocations that are avoided are primarily:

the ones from verify
the ones from recognize
the ones from value
but most importantly, the alt-related ones from here and here
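
For readers less familiar with nom, the fast path discussed in this thread can be sketched with the standard library alone. This is a minimal sketch, not the code in sanitizer.rs: `str_till_eol_fast` is a hypothetical name, the `is_char_supported` shown here is a stand-in whose rejected character set is an assumption, and any line-continuation handling is left out. Where the real method falls back to the nom-based parser, the sketch simply returns `None`.

```rust
/// Hypothetical stand-in for the `is_char_supported` predicate used in the diff.
/// Assumption: only Unicode bidirectional control characters are rejected;
/// the real predicate may differ.
fn is_char_supported(c: char) -> bool {
    !matches!(c, '\u{202A}'..='\u{202E}' | '\u{2066}'..='\u{2069}')
}

/// Fast-path sketch: returns `Some((after, before))` when the first line can be
/// handed back as a borrowed slice, or `None` when the caller should fall back
/// to the full combinator-based parser.
fn str_till_eol_fast(input: &str) -> Option<(&str, &str)> {
    // Split at the first newline; both halves borrow from `input`, nothing is copied.
    let (before, after) = input.split_once('\n')?;

    // A single pass over the line decides whether the cheap path is safe.
    if before.chars().all(is_char_supported) {
        Some((after, before))
    } else {
        None
    }
}
```

In the common case (a plain comment line followed by a newline) the function only slices the input, which is the kind of work the combinators listed above would otherwise repeat per character.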

zkxuerb

LGTM

ljedrz · 2024-11-05T11:04:19Z

@vicsn I updated the original branch with 2 commits addressing the notes on extra docs and the redundant use of eol. Feel free to cherry-pick those, they should apply cleanly.

@d0cd as for extra tests, note that str_till_eol is only ever used in parse_comment, and there are already several applicable test cases for it, including ones that would utilize the faster approach. Do you feel like those are missing any variants?

d0cd · 2024-11-05T17:24:09Z

@ljedrz on initial scan, I don't see test cases for a multi-line comment, multiple single-line comments, multiple multi-line comments, and interleavings of the two. I'll leave it to your discretion, but I would at least recommend a multi-line comment and multiple single-line comments.

I also saw that you use eoi in the update. Looks good to me, but I would also recommend explaining in a comment that this is valid because before is a single line string.

The reason I am being so insistent on comments is that these parsers are quite sensitive and not many people have a deep expertise in nom.
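
The point about eoi can be illustrated with a small standard-library sketch. It assumes lines are terminated by '\n' only, and both helper names are hypothetical (they are not part of Sanitizer). Because `before` is everything up to the first newline, it cannot contain a newline itself, so "parse until end of line" and "parse until end of input" cover the same span when applied to `before`.

```rust
/// Splits `input` at the first '\n', returning `(before, after)`.
/// Assumption: lines are terminated by '\n' only.
fn split_first_line(input: &str) -> (&str, &str) {
    input.split_once('\n').unwrap_or((input, ""))
}

/// Hypothetical "take until end of line" helper:
/// stops at the first '\n', or at the end of input if there is none.
fn till_eol(s: &str) -> &str {
    s.find('\n').map_or(s, |idx| &s[..idx])
}
```

Applied to `before`, `till_eol` always returns the whole slice, which is the invariant a code comment next to the eoi call would need to state.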

ljedrz · 2024-11-06T11:20:06Z

@vicsn I addressed the comments in 2 new commits:

ljedrz · 2024-11-12T18:15:30Z

While not a big deal (extra test cases, docs, and one optimization), this PR was still missing the 4 extra commits mentioned at the end. I can include them in a follow-up shortly.

vicsn · 2024-11-13T13:27:09Z

@ljedrz apologies, I overlooked that I had to update my branch

ljedrz and others added 3 commits July 10, 2024 10:39

perf: allow Sanitizer::str_till_eol to omit allocations

3bbbd25

Signed-off-by: ljedrz <[email protected]>

Merge branch 'mainnet-staging' into perf/parsing_sanitizer2

831cdf6

Merge remote-tracking branch 'aleonet/staging' into perf/parsing_sani…

c9c83f9

…tizer2

vicsn mentioned this pull request Oct 22, 2024

[Perf] Use heuristics to avoid allocations in Sanitizer::str_till_eol #2517

Closed

d0cd reviewed Oct 24, 2024

zosorock requested review from evanmarshall, lukenewman, asharma13524, fulltimemike, zkxuerb, zklimaleo and a team October 26, 2024 02:35

zosorock added the enhancement New feature or request label Oct 26, 2024

gluax approved these changes Oct 29, 2024

Meshiest approved these changes Oct 29, 2024

zkxuerb approved these changes Nov 1, 2024

zosorock added the v1.1.4 Canary release v1.1.4 label Nov 12, 2024

zosorock merged commit 60a0aa5 into AleoNet:staging Nov 12, 2024
84 checks passed

ljedrz mentioned this pull request Nov 13, 2024

A few missing bits for str_to_eol #2571

Open

zosorock mentioned this pull request Nov 13, 2024

Canary release week 24.46 - v1.1.4 AleoNet/snarkOS#3436

Merged
