Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

internal/graph: Escape labels to support double quotes #689

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 11 additions & 4 deletions internal/graph/dotgraph.go
Original file line number Diff line number Diff line change
Expand Up @@ -247,7 +247,7 @@ func (b *builder) addNodelets(node *Node, nodeID int) bool {
continue
}
weight := b.config.FormatValue(w)
nodelets += fmt.Sprintf(`N%d_%d [label = "%s" id="N%d_%d" fontsize=8 shape=box3d tooltip="%s"]`+"\n", nodeID, i, t.Name, nodeID, i, weight)
nodelets += fmt.Sprintf(`N%d_%d [label = "%s" id="N%d_%d" fontsize=8 shape=box3d tooltip="%s"]`+"\n", nodeID, i, escapeForDotCentered(t.Name), nodeID, i, weight)
nodelets += fmt.Sprintf(`N%d -> N%d_%d [label=" %s" weight=100 tooltip="%s" labeltooltip="%s"]`+"\n", nodeID, nodeID, i, weight, weight, weight)
if nts := lnts[t.Name]; nts != nil {
nodelets += b.numericNodelets(nts, maxNodelets, flatTags, fmt.Sprintf(`N%d_%d`, nodeID, i))
Expand All @@ -274,7 +274,7 @@ func (b *builder) numericNodelets(nts []*Tag, maxNumNodelets int, flatTags bool,
}
if w != 0 {
weight := b.config.FormatValue(w)
nodelets += fmt.Sprintf(`N%s_%d [label = "%s" id="N%s_%d" fontsize=8 shape=box3d tooltip="%s"]`+"\n", source, j, t.Name, source, j, weight)
nodelets += fmt.Sprintf(`N%s_%d [label = "%s" id="N%s_%d" fontsize=8 shape=box3d tooltip="%s"]`+"\n", source, j, escapeForDotCentered(t.Name), source, j, weight)
nodelets += fmt.Sprintf(`%s -> N%s_%d [label=" %s" weight=100 tooltip="%s" labeltooltip="%s"%s]`+"\n", source, source, j, weight, weight, weight, attr)
}
}
Expand Down Expand Up @@ -483,9 +483,16 @@ func escapeAllForDot(in []string) []string {
return out
}

// escapeForDot escapes double quotes and backslashes, and replaces Graphviz's
// "center" character (\n) with a left-justified character.
// escapeForDot escapes double quotes and backslashes, and replaces newlines
// with a left-justified escape (\l).
// See https://graphviz.org/docs/attr-types/escString/ for more info.
func escapeForDot(str string) string {
return strings.ReplaceAll(strings.ReplaceAll(strings.ReplaceAll(str, `\`, `\\`), `"`, `\"`), "\n", `\l`)
}

// escapeForDotCentered escapes double quotes and backslashes, and replaces
// newlines with Graphviz's center escape (\n).
// See https://graphviz.org/docs/attr-types/escString/ for more info.
func escapeForDotCentered(str string) string {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be missing something but quoting myself from #683:

I'd also expect any non-printable characters like newlines to be rendered as their C escaped version, so that the tag value is always one line when rendered. Same for function names. The escaping was initially added in #557 for graph legend labels that come from the profile comments field and for that we did support newlines to render newlines from comments as newlines since some comments intentionally use that kind of formatting. I think for function names and tag values it should be different and we shouldn't allow any formatting sequences in those strings.

So for function names and tag values I'd expect the newline to be escaped into a printable character, not rendered as either (left or center) actual newline character...

Copy link
Contributor Author

@evanj evanj Mar 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The newline handling is confusing, and there could be a better place to put it. However, I believe this version produces the desired output. Here is an example with my Go test output that currently fails to render with pprof, which shows the newlines in the key/value pairs being displayed as \n:

Screen Shot 2022-03-22 at 9 45 01 AM

I tried to change the existing logic as little as possible. My logic was the following:

The joinLabels function combines multiple key/value pairs, and returns a string as it should be displayed to a human, without any dot/graphviz details. It replaces newline characters with a \n sequence, since we want a key/value pair to be on a single line. However, it then combines the multiple key/value pairs with newlines, since we want to display separate key/value pairs on separate lines. As a result, the new lines that remain should be "visible." The TestJoinLabels attempts to demonstrate this.

Next, escapeForDotCentered must convert this "human string" into a version that can be embedded into Graphviz's "escString". To do that, it must escape backslashes, escape quotes, and replace newline characters with \n. It turns out: I can remove the newline escaping from this function, since "centered text" in graphviz's default. However, this will make this function different from escapeForDot. Is that desirable?

I totally forgot about other non-printable characters though, so thanks! I will add that to joinLabels. My logic is if we were to us a different graph rendering tool, we would still want this escaping, so it should not be "dot specific".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest we have a single escapeForDot() function with a bool argument like escapeNewlines and set that argument to true when escaping comments and to false when escaping label values.

The joinLabels function should only join lines with newlines (actual ones), it should not do any escaping to the joined values itself, it will assume the values have been escaped as necessary already.

return strings.ReplaceAll(strings.ReplaceAll(strings.ReplaceAll(str, `\`, `\\`), `"`, `\"`), "\n", `\n`)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could pull the common parts of these two functions out into a single shared function, if that seems better.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing \n with \n seems a no-op.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original escapeForDot function used the backticks version for \l, so I just modified it. The quote characters are different: "\n" is an interpreted string, so is a single byte newline. The backticks version is a raw string, so it the two byte sequence '\\' + 'n'.

}
23 changes: 23 additions & 0 deletions internal/graph/dotgraph_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -150,6 +150,29 @@ func TestComposeWithNamesThatNeedEscaping(t *testing.T) {
compareGraphs(t, buf.Bytes(), "compose7.dot")
}

func TestComposeWithTagsThatNeedEscaping(t *testing.T) {
g := baseGraph()
a, c := baseAttrsAndConfig()
g.Nodes[0].LabelTags["a"] = &Tag{
Name: `label"quote"` + "\nline2",
Cum: 10,
Flat: 10,
}
g.Nodes[0].NumericTags[""] = TagMap{
"b": &Tag{
Name: `numeric"quote"` + "\nline2",
Cum: 20,
Flat: 20,
Unit: "ms",
},
}

var buf bytes.Buffer
ComposeDot(&buf, g, a, c)

compareGraphs(t, buf.Bytes(), "compose8.dot")
}

func TestComposeWithCommentsWithNewlines(t *testing.T) {
g := baseGraph()
a, c := baseAttrsAndConfig()
Expand Down
8 changes: 6 additions & 2 deletions internal/graph/graph.go
Original file line number Diff line number Diff line change
Expand Up @@ -519,6 +519,8 @@ func (g *Graph) TrimTree(kept NodePtrSet) {
g.RemoveRedundantEdges()
}

// joinLabels returns the labels as a string. Newlines in the labels are
// replaced with "\n". Separate labels are joined with newlines.
func joinLabels(s *profile.Sample) string {
if len(s.Label) == 0 {
return ""
Expand All @@ -527,11 +529,13 @@ func joinLabels(s *profile.Sample) string {
var labels []string
for key, vals := range s.Label {
for _, v := range vals {
labels = append(labels, key+":"+v)
joined := key + ":" + v
escaped := strings.ReplaceAll(joined, "\n", `\n`)
labels = append(labels, escaped)
}
}
sort.Strings(labels)
return strings.Join(labels, `\n`)
return strings.Join(labels, "\n")
}

// isNegative returns true if the node is considered as "negative" for the
Expand Down
15 changes: 15 additions & 0 deletions internal/graph/graph_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -531,3 +531,18 @@ func TestShortenFunctionName(t *testing.T) {
}
}
}

func TestJoinLabels(t *testing.T) {
input := &profile.Sample{
Label: map[string][]string{
"key1": {"v1", "v2"},
// value with an embedded newline: is escaped to \n
"key2": {"value line1\nline2"},
},
}
const expected = "key1:v1\nkey1:v2\nkey2:value line1\\nline2"
output := joinLabels(input)
if output != expected {
t.Errorf("output=%#v != expected=%#v", output, expected)
}
}
11 changes: 11 additions & 0 deletions internal/graph/testdata/compose8.dot
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
digraph "testtitle" {
node [style=filled fillcolor="#f8f8f8"]
subgraph cluster_L { "label1" [shape=box fontsize=16 label="label1\llabel2\llabel3: \"foo\"\l" tooltip="testtitle"] }
N1 [label="src\n10 (10.00%)\nof 25 (25.00%)" id="node1" fontsize=22 shape=box tooltip="src (25)" color="#b23c00" fillcolor="#edddd5"]
N1_0 [label = "label\"quote\"\nline2" id="N1_0" fontsize=8 shape=box3d tooltip="10"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned in another comment

So for function names and tag values I'd expect the newline to be escaped into a printable character, not rendered as either (left or center) actual newline character

and as far as I can tell, we render the newline as actual newline here:

image

Unlike we do for comments, for label values we should escape the newlines here to render them as "\n".

N1 -> N1_0 [label=" 10" weight=100 tooltip="10" labeltooltip="10"]
NN1_0 [label = "numeric\"quote\"\nline2" id="NN1_0" fontsize=8 shape=box3d tooltip="20"]
N1 -> NN1_0 [label=" 20" weight=100 tooltip="20" labeltooltip="20"]
N2 [label="dest\n15 (15.00%)\nof 25 (25.00%)" id="node2" fontsize=24 shape=box tooltip="dest (25)" color="#b23c00" fillcolor="#edddd5"]
N1 -> N2 [label=" 10" weight=11 color="#b28559" tooltip="src -> dest (10)" labeltooltip="src -> dest (10)" minlen=2]
}