Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/graph: Escape labels to support double quotes #689

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

evanj
Copy link
Contributor

@evanj evanj commented Mar 15, 2022

Fixes syntax errors when trying to render pprof profiles that have
double quotes in tags. These can be created with Go's pprof labels
feature, for example with:

Fixes syntax errors when trying to render pprof profiles that have
double quotes in tags. These can be created with Go's pprof labels
feature, for example with:

pprof.Labels("key", "label "double quote"\nline two")

Trying to display a graph generated with this lable will fail:

Error: : syntax error in line 5 near 'quote'

The double quote (") was never escaped in the label strings. Add
a new escaping function that replaces newlines with centered lines
(\n) because the existing one replaces newline with left-justified
lines (\l).

@evanj
Copy link
Contributor Author

evanj commented Mar 15, 2022

This is a second attempt at #683. I started this as a new pull request because it takes a slightly different approach, so I was personally finding the commit history confusing instead of helpful.

// newlines with Graphviz's center escape (\n).
// See https://graphviz.org/docs/attr-types/escString/ for more info.
func escapeForDotCentered(str string) string {
return strings.ReplaceAll(strings.ReplaceAll(strings.ReplaceAll(str, `\`, `\\`), `"`, `\"`), "\n", `\n`)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could pull the common parts of these two functions out into a single shared function, if that seems better.

Fixes syntax errors when trying to render pprof profiles that have
double quotes in tags. These can be created with Go's pprof labels
feature, for example with:

Fixes syntax errors when trying to render pprof profiles that have
double quotes in tags. These can be created with Go's pprof labels
feature, for example with:

pprof.Labels("key", "label \"double quote\"\nline two")

Trying to display a graph generated with this lable will fail:

Error: <stdin>: syntax error in line 5 near 'quote'

The double quote (") was never escaped in the label strings. Add
a new escaping function that replaces newlines with centered lines
(\n) because the existing one replaces newline with left-justified
lines (\l).
@evanj evanj force-pushed the evan.jones/label-escaping branch from 317aa1e to 48de501 Compare March 15, 2022 15:37
@codecov-commenter
Copy link

codecov-commenter commented Mar 15, 2022

Codecov Report

Merging #689 (34b8741) into main (c488b8f) will increase coverage by 0.07%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main     #689      +/-   ##
==========================================
+ Coverage   63.70%   63.78%   +0.07%     
==========================================
  Files          41       41              
  Lines        6489     6492       +3     
==========================================
+ Hits         4134     4141       +7     
+ Misses       1913     1908       -5     
- Partials      442      443       +1     
Impacted Files Coverage Δ
...github.com/google/pprof/internal/graph/dotgraph.go 90.32% <0.00%> (+0.04%) ⬆️
...rc/github.com/google/pprof/internal/graph/graph.go 29.10% <0.00%> (+1.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c488b8f...34b8741. Read the comment docs.

// newlines with Graphviz's center escape (\n).
// See https://graphviz.org/docs/attr-types/escString/ for more info.
func escapeForDotCentered(str string) string {
return strings.ReplaceAll(strings.ReplaceAll(strings.ReplaceAll(str, `\`, `\\`), `"`, `\"`), "\n", `\n`)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing \n with \n seems a no-op.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original escapeForDot function used the backticks version for \l, so I just modified it. The quote characters are different: "\n" is an interpreted string, so is a single byte newline. The backticks version is a raw string, so it the two byte sequence '\\' + 'n'.

// escapeForDotCentered escapes double quotes and backslashes, and replaces
// newlines with Graphviz's center escape (\n).
// See https://graphviz.org/docs/attr-types/escString/ for more info.
func escapeForDotCentered(str string) string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something but quoting myself from #683:

I'd also expect any non-printable characters like newlines to be rendered as their C escaped version, so that the tag value is always one line when rendered. Same for function names. The escaping was initially added in #557 for graph legend labels that come from the profile comments field and for that we did support newlines to render newlines from comments as newlines since some comments intentionally use that kind of formatting. I think for function names and tag values it should be different and we shouldn't allow any formatting sequences in those strings.

So for function names and tag values I'd expect the newline to be escaped into a printable character, not rendered as either (left or center) actual newline character...

Copy link
Contributor Author

@evanj evanj Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The newline handling is confusing, and there could be a better place to put it. However, I believe this version produces the desired output. Here is an example with my Go test output that currently fails to render with pprof, which shows the newlines in the key/value pairs being displayed as \n:

Screen Shot 2022-03-22 at 9 45 01 AM

I tried to change the existing logic as little as possible. My logic was the following:

The joinLabels function combines multiple key/value pairs, and returns a string as it should be displayed to a human, without any dot/graphviz details. It replaces newline characters with a \n sequence, since we want a key/value pair to be on a single line. However, it then combines the multiple key/value pairs with newlines, since we want to display separate key/value pairs on separate lines. As a result, the new lines that remain should be "visible." The TestJoinLabels attempts to demonstrate this.

Next, escapeForDotCentered must convert this "human string" into a version that can be embedded into Graphviz's "escString". To do that, it must escape backslashes, escape quotes, and replace newline characters with \n. It turns out: I can remove the newline escaping from this function, since "centered text" in graphviz's default. However, this will make this function different from escapeForDot. Is that desirable?

I totally forgot about other non-printable characters though, so thanks! I will add that to joinLabels. My logic is if we were to us a different graph rendering tool, we would still want this escaping, so it should not be "dot specific".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest we have a single escapeForDot() function with a bool argument like escapeNewlines and set that argument to true when escaping comments and to false when escaping label values.

The joinLabels function should only join lines with newlines (actual ones), it should not do any escaping to the joined values itself, it will assume the values have been escaped as necessary already.

@evanj
Copy link
Contributor Author

evanj commented Jul 7, 2022

I believe this PR should fix the bug that was just reported to the Go project: golang/go#53703 . The outstanding issue that I need to get back to is escaping non-printable characters, and adding a test for that case. I may not have time to get to that immediately. I think this PR doesn't make the handling of non-printable characters worse than it is today, but it would be nice to make this a "complete" fix. I will attempt to return to this change in the next couple of weeks.

node [style=filled fillcolor="#f8f8f8"]
subgraph cluster_L { "label1" [shape=box fontsize=16 label="label1\llabel2\llabel3: \"foo\"\l" tooltip="testtitle"] }
N1 [label="src\n10 (10.00%)\nof 25 (25.00%)" id="node1" fontsize=22 shape=box tooltip="src (25)" color="#b23c00" fillcolor="#edddd5"]
N1_0 [label = "label\"quote\"\nline2" id="N1_0" fontsize=8 shape=box3d tooltip="10"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in another comment

So for function names and tag values I'd expect the newline to be escaped into a printable character, not rendered as either (left or center) actual newline character

and as far as I can tell, we render the newline as actual newline here:

image

Unlike we do for comments, for label values we should escape the newlines here to render them as "\n".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants