-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathperformance.html
457 lines (421 loc) · 17.9 KB
/
performance.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
<html>
<!--Need to find a way to insert the current date-->
<head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<title>Performance analysis work log</title>
<style type="text/css">
p {
margin-top: 0.6em;
margin-bottom: 0.6em;
}
p.bibl {
margin-top: 0.6em;
margin-bottom: 0.6em;
margin-left: 2em;
text-indent: -2em;
}
.Real-P {
margin-top: 0.6em;
margin-bottom: 0.6em;
}
div.sp {
margin-top: 0.6em;
margin-bottom: 0.6em;
}
div.address {
margin-top: 0.6em;
margin-bottom: 0.6em;
margin-left: 2em;
}
div.note {
margin-top: 1em;
margin-left: 2em;
}
pre {
font-family: monospace;
margin-left: 2em
}
a:hover {
background: #CCF
}
td.no {
background: #CCF
}
.author {
font-size: x-large
}
.bio {
font-size: small;
font-style: italic
}
span.decision {
}
span.result {
font-weight: bold
}
span.aye {
display: block;
margin-left: 2em;
}
span.nay {
display: block;
margin-left: 2em;
}
span.abstain {
display: block;
margin-left: 2em;
}
span.result {
display: block;
margin-left: 2em;
font-weight: bold
}
span.speaker {
display: inline;
}
span.typename {
display: inline;
font-family: monospace;
}
a.selflink {
text-decoration: none;
color: initial;
}
div.scrap {
margin-top: 0.5em;
background-color: #CFEFCF;
background-color: #E7F7E7;
padding: 0.6em;
margin-bottom: 0.5em;
}
pre.scrapbody {
margin-left: 0.5em;
margin-bottom: 0.5em;
}
span.scrapcontinuations {
font-size:smaller;
}
span.scrapinbound {
font-size:smaller;
}
span.scrapref {
display: inline-block;
text-indent: -0.3em;
font-family: New Times Roman, serif, Lucida Sans Unicode;
}
em.scrapptr {
font-family: New Times Roman, serif, Lucida Sans Unicode;
}
dl.desclist {
list-style-type: none;
}
dl.desclist > dt {
display: run-in;
padding-right: 0.5em;
/*
*/
}
dl.desclist > dd {
text-indent: -1em;
margin-left: 1em;
}
ul.desclist {
list-style-type: none;
}
ul.desclist > li {
margin-left: 2em;
text-indent: -2em;
}
div.epigraph .Real-P {margin-top: 0em; margin-bottom: 0em;}
</style><link xmlns="http://www.w3.org/1999/xhtml" rel="stylesheet" href="../lib/local.css"></link></head><body><div class="doc">
<h1>Performance analysis work log</h1>
<h1>for Aparecium</h1>
<h3>C. M. Sperberg-McQueen</h3>
<h3>20 August 2022, last revised 20 August 2022</h3>
<hr><a name="toc"></a>
<ul><!-- and a 1! -->
<li>1. <a href="#log">Work log</a><ul>
<!-- and a 2! -->
<li>1.1. <a href="#log-20220709">9 July 2022</a></li>
<li>1.2. <a href="#log-20220820">20 August 2022</a></li>
</ul></li><li>2. <a href="#test-plan">Test suite and test plans</a></li><li>3. <a href="#static-analysis">Static analysis</a></li><li>4. <a href="#misc-conjectures">Miscellaneous queries and conjectures</a></li></ul>
<hr>
<div class="Real-P">This document records work in performance improvements for
Aparecium, an Invisible-XML processor written in XQuery.</div>
<div class="Real-P">Part of it is a work log organized chronologically; parts of it
are non-chronological descriptions of ideas to try.</div>
<div class="div">
<div class="quicknav"><span class="arrow"> </span><span class="arrow"><a href="#toc">⏶</a></span><span class="arrow"><a href="#test-plan">⏵</a></span></div><h2><a class="selflink" name="log" id="log" href="#log">1. </a>Work log</h2><ul><li>1.1. <a href="#log-20220709">9 July 2022</a></li><li>1.2. <a href="#log-20220820">20 August 2022</a></li></ul>
<div class="div">
<div class="quicknav"><span class="arrow"> </span><span class="arrow"><a href="#log">⏶</a></span><span class="arrow"><a href="#log-20220820">⏵</a></span></div><h3><a class="selflink" name="log-20220709" id="log-20220709" href="#log-20220709">1.1. </a>9 July 2022</h3>
<div class="Real-P">Inspired perhaps by the discussion of big-O analysis in
Meyer's <em>Touch of class</em>, tried to see if I could
make any headway on a static analysis of Aparecium. It is
doubtless an amateurish job, since I've never done it before,
and it seems to hover uneasily between an attempt to
characterize performance in some detail, at least potentially,
and the ruthlessly simple “linear, quadratic, or what?”
focus Meyer exhibits.</div>
<div class="Real-P">But I found it interesting enough to want to continue.</div>
</div>
<div class="div">
<div class="quicknav"><span class="arrow"><a href="#log-20220709">⏴</a></span><span class="arrow"><a href="#log">⏶</a></span><span class="arrow"> </span></div><h3><a class="selflink" name="log-20220820" id="log-20220820" href="#log-20220820">1.2. </a>20 August 2022</h3>
<div class="Real-P">After Balisage, I decided to try to make improvements to
performance in Aparecium my rest-of-summer project, to see how
far I could get. At the moment, it looks very grim -- not much
visible progress at all, and not much invisible progress,
either.</div>
<div class="Real-P">I have four tasks at the moment, three of which can be worked
on in parallel (or in rotation):
<ul><li>
<div class="Real-P"><b>static analysis</b> of the code to try to
understand where the hot spots are likely to be</div>
<div class="Real-P">See <a href="#static-analysis">below</a>.</div>
</li>
<li>
<div class="Real-P">support for <b>additional XQuery engines</b>, to
make it easier to avoid confusing optimizations that work
across the board (like doing less work) from those that are
implementation-dependent</div>
<div class="Real-P">My current targets are eXist-db and Saxon HE.</div>
<div class="Real-P">There are good reasons to want to support them in any
case, but the reason to work <em>now</em> on supporting
them is to make the performance work better. I don't want
to delay the actual work on performance indefinitely,
though, so I plan to cap my time on them to 20 and 10 hours,
respectively.</div>
<div class="Real-P">So far, I think I have spent 8h billable to eXist-db
(including filing two bug reports, which slows things down a
good deal), and 2h on HE (mostly an inquiry on the Saxon
forum about how to guard processor-specific code, which
elicited the news that Saxon 10 and 11 make HE support
higher-order functions).</div>
<div class="Real-P">As I proceed, I should try to follow through on the idea
of (a) wrapping each bit of processor-dependence in a
function and (b) putting those functions in a separate
utilities module.</div>
</li>
<li>
<div class="Real-P">development and measurement of a set of
<b>performance tests</b> (see <a href="#test-plan">below</a>).</div>
</li>
<li>
<div class="Real-P">figuring out <b>how to measure</b> performance
for engines other than BaseX</div>
<div class="Real-P">When I can run from bash, I can use the Bash
<i>time</i> command, but I cannot run Saxon PE and
HE from the shell, only from Oxygen. So it may have to be
file creation times (but that won't work with Saxon HE).</div>
</li>
</ul>
</div>
<div class="Real-P">Today I am trying to spend some time on each of the first
three (analysis, new engines, test suite); my work on analysis today
has consisted so far in creating this document, beginning to
fill it in, and inserting the work done in July into the
section on <a href="#static-analysis">Static analysis</a>.</div>
</div>
</div>
<div class="div">
<div class="quicknav"><span class="arrow"><a href="#log">⏴</a></span><span class="arrow"><a href="#toc">⏶</a></span><span class="arrow"><a href="#static-analysis">⏵</a></span></div><h2><a class="selflink" name="test-plan" id="test-plan" href="#test-plan">2. </a>Test suite and test plans</h2>
<div class="Real-P">The test suite has three branches:
<ul><li>
<div class="Real-P">The tests in <tt>ixml/tests/performance/*</tt>, which
are artificially simple tests (to try to keep them easy to
understand and analyse) with mechanically constructed inputs,
each input twice the size of its predecessor.</div>
<div class="Real-P">From the results, it seems clear that there is a quadratic
term in Aparecium performance even on deterministic
grammars.</div>
</li>
<li>
<div class="Real-P">A selection of tests from the existing ixml test suite,
some fast and some slow. More specifically, a selection of
five to ten tests at each of various speeds (tens of
milliseconds, hundreds, thousands, ...), where possible with
some tests spending more of their time in the grammar and some
more of their time in the instance.</div>
</li>
<li>
<div class="Real-P">A seletion of ‘realistic’ tests, with
grammars and inputs from projects unrelated to invisible XML
and Aparecium.
<ul><li>XPath</li>
<li>Oberon</li>
<li>vCards</li>
<li>Ariadne</li>
<li>first-order predicate calculus</li>
</ul></div>
</li>
</ul>
</div>
<div class="Real-P">For each grammar and test case, measure:
<ul><li>size in characters</li>
<li>size in AST nodes (count nodes, elements, attributes; I
think |nodes| = |elements| + |attributes| + |comment
elements|)</li>
<li>size in raw parse-tree nodes (to measure this, change all
marks in the governing grammar to ^)</li>
<li>size of PFG (elements, nodes, rules)</li>
<li>number of Earley items generated</li>
</ul>
</div>
</div>
<div class="div">
<div class="quicknav"><span class="arrow"><a href="#test-plan">⏴</a></span><span class="arrow"><a href="#toc">⏶</a></span><span class="arrow"><a href="#misc-conjectures">⏵</a></span></div><h2><a class="selflink" name="static-analysis" id="static-analysis" href="#static-analysis">3. </a>Static analysis</h2>
<div class="Real-P">As a first attempt, let's just try to walk through one possible
execution. To execute <tt>ap:parse-resource($I, $G)</tt>,
what is the cost <i>k</i>?</div>
<div class="Real-P">[Note, 9 July. This seems to be a plausible way to start. It
illustrates just how many layers there are in the onion, and it
may start getting difficult just about where I have left off for
now. But I think this is a start.]</div>
<ol><li><a name="parse-resource" id="parse-resource"></a>
<div class="Real-P">k( parse-resource($I + $G) ) =
<br>
<br>k(unparsed-text($I))
<br>+ k(unparsed-text($G))
<br>+ k(parse-string($I2, $G2)) // <a href="#parse-string"><!--* have no n *-->2</a></div>
<div class="Real-P"><b>Conjecture:</b> unparsed-text() is O(n) in
bytecount of result. (Test by ad-hoc testing.) And in any
case I can't do anything about its speed.</div>
<div class="Real-P"><b>Conjecture:</b> unparsed-text() is negligible
compared to parse-string, in all strata. (Test in BaseX by
timing calls, selectively.)</div>
</li>
<li><a name="parse-string" id="parse-string"></a>
<div class="Real-P">k(parse-string($I, $G)) =
<br>
<br>k(compile-grammar-from-string($G)) // <a href="#compile-grammar-from-string"><!--* have no n *-->3</a>
<br>+ k(parse-string-with-compiled-grammar($I, $G2)) // <a href="#parse-string-with-compiled-grammar"><!--* have no n *-->4</a>
</div>
<div class="Real-P"><b>Query:</b> What is relative cost of these?</div>
</li>
<li><a name="compile-grammar-from-string" id="compile-grammar-from-string"></a>
<div class="Real-P">k(compile-grammar-from-string($G)) =
<br>
<br>k(parse-grammar-from-string($G)) // <a href="#parse-grammar-from-string"><!--* have no n *-->5</a>
<br>+ k(gluschkov($G2)) // <a href="#gluschkov"><!--* have no n *-->6</a>
</div>
<div class="Real-P"><b>Query:</b> What is relative cost of these?</div>
<div class="Real-P"><b>Conjecture:</b> parsing is more expensive than compiling.</div>
</li>
<li><a name="parse-string-with-compiled-grammar" id="parse-string-with-compiled-grammar"></a>
<div class="Real-P">k(parse-string-with-compiled-grammar($I, $CG))=
<br>
<br>k(grammar-ok($CG)) // <a href="#grammar-ok"><!--* have no n *-->7</a>
<br>+ k(earley:parse($I, $CG)) // <a href="#earley-parse"><!--* have no n *-->8</a>
</div>
<div class="Real-P"><b>Conjecture:</b> grammar-ok() is O(n) in nodes of
grammar, and negligible compared to earley:parse().</div>
<div class="Real-P"><b>N.B.:</b> earley-parse() should be O(n) in size
of grammar (in nodes) plus size of input (in characters), when
the grammar is deterministic, and worst-case cubic with a
maximally non-deterministic grammar. It's currently quadratic
when the grammar is deterministic. Why?</div>
<div class="Real-P"><b>To do:</b> find a maximally non-deterministic
grammar in Grune and Jacobs, and check existing a-star and
even/odds grammars for determinism.</div>
</li>
<li><a name="parse-grammar-from-string" id="parse-grammar-from-string"></a>
<div class="Real-P">k(parse-grammar-from-string($G))
<br>
<br>k(doc($ap:ixml.gl.xml)) // not further analysed
<br>+ k(parse-string-with-compiled-grammar($G, $ixml.gl)) // <a href="#parse-string-with-compiled-grammar"><!--* have no n *-->4</a>
</div>
<div class="Real-P"><b>Conjecture</b> that doc() is linear in bytes of
the XML document being loaded. Or alternatively, linear in
nodes of the document. But it is in any case out of my hands.
Worth a few minutes to try to test the conjecture, but not
more.</div>
</li>
<li><a name="gluschkov" id="gluschkov"></a>
<div class="Real-P">k(gluschkov($G2)) =
<br>
</div>
<div class="Real-P"><b>Conjecture:</b> this is a depth-first traversal
and the cost is O(n) for n = number of symbols in G, ≅ number
of XML elements in $G//* except $G//comment.</div>
<div class="Real-P">
</div>
</li>
<li><a name="grammar-ok" id="grammar-ok"></a>
<div class="Real-P">k(grammar-ok($CG))
</div>
<div class="Real-P"><b>Conjecture:</b> this involves ten searches
through the entire grammar for elements of particular kinds.
It will be O(n) for n = number of symbols in G, ≅ number of
XML elements in $G//* excluding comments and
namespace-qualified elements).</div>
</li>
<li><a name="earley-parse" id="earley-parse"></a>
<div class="Real-P">k(earley:parse($I, $CG)) = ???
<br>k(er:recognizeX($I, $CG) <a href="#er.recognizeX"><span class="red">???</span></a>
<br>+ k(epi:parse-forest-grammar(completions, closure, input)) // <a href="#epi.parse-forest-grammar"><!--* have no n *-->10</a>
<br>+ k(checking PFG insertions) // linear in |PFG| + |insertions|
<br>+ k(epi:tree-from-pfg(pfg)) // <a href="#epi.tree-from-pfg"><!--* have no n *-->11</a>
<br>+ k(checking AST for non-XML names) // linear in |AST|
<br>+ k(version check) // O(1)
</div>
<div class="Real-P"><b>Conjecture:</b> recognition takes over half the
time (50-70%), construction of PFG and extraction of tree the
rest of the time, other items are negligible.</div>
<div class="Real-P"><b>Conjecture:</b> Construction of PFG takes more
than twice as long as extraction of AST. (Rationale: Element
construction dominates the time [<i>check this how?</i>].
For every element and attribute node in the AST, two elements
[rule and reference to it] are constructed in the PFG.)</div>
<div class="Real-P"><b>Conjecture:</b> Making the PFG a map will speed
up construction of PFG (by eliminating element
construction).</div>
</li>
<li><a name="er.recognize" id="er.recognize"></a>
<div class="Real-P">k(er:recognizeX($I, $CG) =
<br>
<br>k(initialization)
<br>+ k(ixi:earley-closure) // <a href="#earley-closure"><!--* have no n *-->12</a>
</div>
<div class="Real-P"><b>Conjecture:</b> earley-closure() takes all the
time.</div>
</li>
<li><a name="epi.parse-forest-grammar" id="epi.parse-forest-grammar"></a>
<div class="Real-P">k(epi:parse-forest-grammar(completions, closure, input)) =
<br>k(make-pfg-rules) // <a href="#make-pfg-rules"><!--* have no n *-->13</a></div>
<div class="Real-P"><b>Conjecture:</b> Time is dominated by (a) searches in
Earley set and (b) construction of elements.</div>
<div class="Real-P"><b>Query:</b> what is the relative proportion of
those two?</div>
<div class="Real-P"><b>Conjecture:</b> Earley searches will be improved
by making lookup faster. (Additional indexing?)</div>
<div class="Real-P"><b>Conjecture:</b> Construction will be improved
by making PFG be a map instead of an XML document.</div>
</li>
<li><a name="epi.tree-from-pfg" id="epi.tree-from-pfg"></a>
<div class="Real-P">k(epi:tree-from-pfg(pfg)) =
<br>O(n) in number of elements in pfg,
proportional to number of elements in AST
</div>
<div class="Real-P"><b>Conjecture:</b> Construction will be faster if
the PFG becomes a map, because search for relevant rules will
be O(1) instead of O(n).</div>
</li>
<li><a name="earley-closure" id="earley-closure"></a>
<div class="Real-P">k(ixi:earley-closure($EI, $I, $G)) =
<br>???
</div>
</li>
<li><a name="make-pfg-rules" id="make-pfg-rules"></a>
<div class="Real-P">k(epi:make-pfg-rules(work-queue, set, input, accumulator)) =
<br>???
</div>
</li>
</ol>
</div>
<div class="div">
<div class="quicknav"><span class="arrow"><a href="#static-analysis">⏴</a></span><span class="arrow"><a href="#toc">⏶</a></span><span class="arrow"> </span></div><h2><a class="selflink" name="misc-conjectures" id="misc-conjectures" href="#misc-conjectures">4. </a>Miscellaneous queries and conjectures</h2>
<div class="Real-P"><b>Conjecture:</b> current time for
parse-grammar-from-string() or the equivalent is roughly equal to
parsing input-grammar string against the vxml spec grammar.</div>
<div class="Real-P">If so, that means I can continue to compare the run time for
parsing grammars with Earley vs parsing with bespoke parser.</div>
</div>
</div></body></html>