Merge pull request #717 from michaelhkay/Issue211-capturing-accumulators
211: add capturing accumulators to XSLT
ndw authored Sep 29, 2023
2 parents f61b05c + 6df49c1 commit 35a08e0
Showing 3 changed files with 122 additions and 5 deletions.
3 changes: 3 additions & 0 deletions specifications/xslt-40/src/element-catalog.xml
@@ -1654,6 +1654,9 @@
<e:attribute name="select" required="no">
<e:data-type name="expression"/>
</e:attribute>
<e:attribute name="capture" required="no" default="'no'">
<e:data-type name="boolean" default="'no'"/>
</e:attribute>
<e:model name="sequence-constructor"/>
<e:allowed-parents>
<e:parent name="accumulator"/>
2 changes: 2 additions & 0 deletions specifications/xslt-40/src/schema-for-xslt40.xsd
@@ -405,8 +405,10 @@ of problems processing the schema using various tools
</xs:restriction>
</xs:simpleType>
</xs:attribute>
<xs:attribute name="capture" type="xsl:yes-or-no" default="no"/>
<xs:attribute name="_match" type="xs:string"/>
<xs:attribute name="_phase" type="xs:string"/>
<xs:attribute name="_capture" type="xs:string"/>
<xs:assert test="exists(@match | @_match)"/>
</xs:extension>
</xs:complexContent>
122 changes: 117 additions & 5 deletions specifications/xslt-40/src/xslt.xml
@@ -26039,7 +26039,8 @@ the same group, and the-->
which is expanded as described in <specref ref="qname"/>.</p>
<p>An <elcode>xsl:accumulator</elcode> declaration can only appear as a <termref def="dt-top-level"/> element in a stylesheet module.</p>


<p diff="add" at="issue211">The <code>capture</code> attribute is allowed only on an <elcode>xsl:accumulator-rule</elcode> element
that specifies <code>phase="end"</code>. Its effect is described in <specref ref="capturing-accumulators"/>.</p>



@@ -26128,7 +26129,9 @@ the same group, and the-->
<termref def="dt-type-error">type error</termref> occurs if conversion is not
possible. The <code>as</code> attribute defaults to <code>item()*</code>.</p>

<p>The effect of the <code>streamable</code> attribute is defined in <specref ref="streamability-of-accumulators"/>.</p>
<p>The effects of the <code>streamable</code>
<phrase diff="add" at="issue211">and <code>capture</code></phrase> attributes
are defined in <specref ref="streamability-of-accumulators"/>.</p>

</div3>
<div3 id="applicability-of-accumulators">
@@ -26358,6 +26361,11 @@ the same group, and the-->
available for use within the <code>select</code> expression or contained sequence
constructor.</p>
</note>
<note diff="add" at="issue211">
<p>There is a slight variation here for an accumulator rule specifying
<code>phase="end"</code> and <code>capture="yes"</code>: the rule is evaluated with a
snapshot copy of the matched node as the context item. For details,
see <specref ref="capturing-accumulators"/>.</p>
</note>
</item>
</olist>
</item>
@@ -26450,9 +26458,87 @@ the same group, and the-->



</div3>
<div3 id="capturing-accumulators">
<head>Capturing Accumulators</head>

<p>The <code>capture</code> attribute is intended primarily for use with streamable accumulators, but
in the interests of consistency, it has the same effect for both streamable and non-streamable
accumulators. If an accumulator rule with <code>phase="end"</code> specifies <code>capture="yes"</code>,
then the rule is evaluated not with the matched node as the context item, but rather with a snapshot
copy of the matched node. The snapshot copy is made following the rules of the <function>snapshot</function>
function, with one exception: no accumulator values are copied into the snapshot tree (which would otherwise
happen: see <specref ref="copying-accumulators"/>).</p>

<note><p>The principal effect of specifying <code>capture="yes"</code> is to relax
the rules for streamability. With this option, the <code>phase="end"</code> accumulator rule
has access to the full subtree rooted at the node being visited. In a typical implementation,
a streaming processor encountering an element that matches a capturing accumulator rule
will make an on-the-fly in-memory copy of that element, allowing the <code>phase="end"</code>
accumulator rule full access to the subtree, and also to attributes of ancestors.</p>

<p>This means that an accumulator that needs access to the typed value or string value of an element
can get this directly with a rule that matches the element, avoiding the need
to write rules that match the element’s text node children.
</p>
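
<p>A minimal sketch of this, assuming a hypothetical input in which each <code>price</code>
element holds a decimal amount (the element name and the accumulator name are illustrative only
and are not taken from this specification):</p>

<eg><![CDATA[<xsl:accumulator name="total-price" as="xs:decimal"
                 initial-value="0" streamable="yes">
  <!-- capture="yes": the rule is evaluated against a snapshot of the
       price element, so its value can be read directly at the end tag -->
  <xsl:accumulator-rule match="price" phase="end" capture="yes"
                        select="$value + xs:decimal(.)"/>
</xsl:accumulator>]]></eg>

<p>Without <code>capture="yes"</code>, a streamable version of this accumulator would typically
have to match <code>price/text()</code> instead.</p>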

<p>For example, to capture a copy of the most recent <code>h2</code> element in a document,
the following accumulator might be declared:</p>

<eg><![CDATA[<xsl:accumulator name="most-recent-h2" initial-value="()" streamable="yes">
  <xsl:accumulator-rule match="h2" phase="end" capture="yes" select="."/>
</xsl:accumulator>]]></eg>

<p>and subsequent processing wishing to copy the most recent <code>h2</code> element into the result
tree can simply use <code>&lt;xsl:copy-of select="accumulator-before('most-recent-h2')"/></code>.</p>

<p>Without the <code>capture="yes"</code> attribute, this accumulator would be rejected
as non-streamable, because the <code>select</code> expression on the accumulator rule
is consuming.</p>

</note>

<example id="use-accumulator-to-create-glossary">
<head>Using a capturing accumulator to construct a glossary</head>
<p>Suppose a document contains definitions of technical terms with markup such as:</p>
<eg><![CDATA[<define term="oxidation">In <topic>chemistry</topic>,
<term>oxidation</term> is a chemical process in which atoms lose electrons.</define>]]></eg>
<p>and the requirement is to generate a glossary that lists all the defined terms in the document, as an appendix.</p>
<p>This can be achieved by capturing all the defined terms in a map:</p>
<eg><![CDATA[<xsl:accumulator name="glossary-terms"
                 as="map(xs:string, element(define))"
                 initial-value="map{}"
                 streamable="yes">
  <xsl:accumulator-rule match="define[@term]"
                        phase="end"
                        capture="yes"
                        select="map:put($value, string(@term), .)"/>
</xsl:accumulator>]]></eg>
<p>Suppose that the input XML document contains an element <code>&lt;glossary/></code> marking
the point where the glossary is to be inserted. The glossary can then be generated
using a template rule such as:</p>
<eg><![CDATA[<xsl:template match="glossary" expand-text="yes">
  <h2>Glossary</h2>
  <dl>
    <xsl:for-each select="map:pairs(accumulator-before('glossary-terms'))">
      <xsl:sort select="?key"/>
      <dt>{?key}</dt>
      <dd><xsl:apply-templates select="?value"/></dd>
    </xsl:for-each>
  </dl>
</xsl:template>]]></eg>


</example>




</div3>
<div3 id="streamability-of-accumulators">
<head>Streamability of Accumulators</head>



<p>An accumulator is <termref def="dt-guaranteed-streamable"/> if
it satisfies all the following
@@ -26473,10 +26559,31 @@ the same group, and the-->
<termref def="dt-motionless"/>.</p>
</item>
<item>
<p>The <termref def="dt-expression">expression</termref> in the <code>select</code> attribute or contained
sequence constructor is <termref def="dt-grounded"/> and
<termref def="dt-motionless"/>.</p>
<p><phrase diff="add" at="issue211">In an <elcode>xsl:accumulator-rule</elcode> with
<code>phase="start"</code> (the default value),</phrase>
the <termref def="dt-type-adjusted-posture-and-sweep"/> of
the <termref def="dt-expression">expression</termref> in the <code>select</code> attribute or the contained
<termref def="dt-sequence-constructor"/>, with respect to the declared type of the accumulator,
is <termref def="dt-grounded"/> and <termref def="dt-motionless"/>.</p>
</item>
<item diff="add" at="issue211">
<p>In an <elcode>xsl:accumulator-rule</elcode> with
<code>phase="end"</code>, one of the
following conditions holds:</p>
<olist>
<item><p>The rule has <code>capture="no"</code> (the default value),
and the <termref def="dt-type-adjusted-posture-and-sweep"/> of
the <termref def="dt-expression">expression</termref> in the <code>select</code> attribute or the contained
<termref def="dt-sequence-constructor"/>, with respect to the declared type of the accumulator,
is <termref def="dt-grounded"/> and <termref def="dt-motionless"/>.</p></item>
<item><p>The rule has <code>capture="yes"</code> and the <termref def="dt-sweep"/> of
the <termref def="dt-expression">expression</termref> in the <code>select</code> attribute or the contained
<termref def="dt-sequence-constructor"/>
is <termref def="dt-consuming"/> or <termref def="dt-motionless"/>.</p></item>
</olist>
</item>



</olist>
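
<p>As a rough illustration of the conditions for <code>phase="end"</code> rules (a sketch only:
the accumulator names and the <code>section</code> element are hypothetical), the following two
declarations differ only in the <code>capture</code> attribute:</p>

<eg><![CDATA[<!-- capture defaults to "no": the select expression must be
     grounded and motionless, but count(*) reads the children -->
<xsl:accumulator name="child-total-plain" as="xs:integer"
                 initial-value="0" streamable="yes">
  <xsl:accumulator-rule match="section" phase="end"
                        select="$value + count(*)"/>
</xsl:accumulator>

<!-- capture="yes": a consuming select expression is permitted -->
<xsl:accumulator name="child-total-captured" as="xs:integer"
                 initial-value="0" streamable="yes">
  <xsl:accumulator-rule match="section" phase="end" capture="yes"
                        select="$value + count(*)"/>
</xsl:accumulator>]]></eg>

<p>Under the rules above, the first declaration would not be guaranteed-streamable, because
<code>count(*)</code> is consuming; the second would satisfy the condition for capturing rules,
which allows a consuming sweep.</p>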

@@ -39233,10 +39340,15 @@ See <loc href="http://www.w3.org/TR/xhtml11/"/>
<item><p>Simplified stylesheets no longer require an <code>xsl:version</code> attribute
(which means they might not need a declaration of the XSLT namespace). Unless otherwise
specified, a 4.0 simplified stylesheet defaults <code>expand-text</code> to <code>true</code>.</p></item>

<item><p>A new set of built-in template rules is introduced, invoked using
<code>&lt;xsl:mode on-no-match="shallow-copy-all"></code>. This is designed to allow rule-based recursive
transformation of JSON data structures (trees of maps and arrays) to work in the same way as with
XML-derived data structures.</p></item>

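<item><p>The streamability rules for accumulators have been relaxed: an <elcode>xsl:accumulator-rule</elcode>
with <code>phase="end"</code> that specifies <code>capture="yes"</code> has access to the full subtree
of the matched node. See <specref ref="capturing-accumulators"/>.</p></item>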

</olist>
</div3>
</div2>