Acting after on_end_tag #108

mitsuhiko · 2021-11-28T20:37:13Z

I'm not sure whether this is a feature request, but I have tried using on_end_tag to do something after a tag has been closed. Unfortunately, the handler is invoked before the tag is written to the sink. This is clearly intentional, since it lets the handler modify things like the tag name or append content after the tag, but it also means that you cannot easily communicate with the sink.

My idea was to instruct the sink to output or suppress content outside an element of interest (e.g. to "select" a certain element exclusively). I am therefore flipping a flag on enter/leave. The result is that my closing tag is no longer emitted, because the flag is already flipped off by the time the end tag reaches the sink.

I believe there are use cases where one wants code to run after the tag has been closed and emitted to the sink, and I'm not sure whether this is possible at all at the moment.
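The ordering problem described above can be reproduced without the library. The following is a minimal simulation under stated assumptions: `Token`, `HookTiming`, and `select` are hypothetical names invented for this sketch, not lol_html APIs, and attributes are ignored. `BeforeWrite` models the current on_end_tag ordering, while `AfterWrite` models a hook that would run once the end tag has been written.

```rust
// Simulation only: none of these types come from lol_html.
#[derive(Clone, Copy, PartialEq)]
enum HookTiming {
    BeforeWrite, // end-tag hook runs before the tag is written (current behavior)
    AfterWrite,  // hypothetical hook that runs after the tag is written
}

enum Token<'a> {
    Start(&'a str), // tag name only, attributes omitted
    End(&'a str),
    Text(&'a str),
}

/// Emits only the selected element by gating the output with a flag.
fn select(tokens: &[Token<'_>], selected: &str, timing: HookTiming) -> String {
    let mut out = String::new();
    let mut can_write = false;
    for token in tokens {
        match token {
            Token::Start(name) => {
                // The start hook opens the gate before the start tag is written.
                if *name == selected {
                    can_write = true;
                }
                if can_write {
                    out.push_str(&format!("<{}>", name));
                }
            }
            Token::Text(text) => {
                if can_write {
                    out.push_str(text);
                }
            }
            Token::End(name) => {
                let is_selected = *name == selected;
                if is_selected && timing == HookTiming::BeforeWrite {
                    can_write = false; // gate closes too early: end tag is lost
                }
                if can_write {
                    out.push_str(&format!("</{}>", name));
                }
                if is_selected && timing == HookTiming::AfterWrite {
                    can_write = false; // gate closes after the end tag is out
                }
            }
        }
    }
    out
}

fn main() {
    let tokens = [
        Token::Text("This "),
        Token::Start("a"),
        Token::Text("link"),
        Token::End("a"),
        Token::Text(" is an example."),
    ];
    println!("{}", select(&tokens, "a", HookTiming::BeforeWrite)); // prints "<a>link"
    println!("{}", select(&tokens, "a", HookTiming::AfterWrite)); // prints "<a>link</a>"
}
```

With the hook running before the write, the closing tag is dropped; with the hook running after it, the element is emitted intact.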


jongiddy · 2021-12-16T21:16:43Z

I see you've proposed #109 to add this capability.

Curious as to why this didn't seem to work with the existing on_end_tag, I had a go at getting it to work. The code below displays only the <a> elements in a document, including their start and end tags. I assume that you have something similar to the can_write flag in your code. Adding the extra string was the only additional step needed. It might be considered slightly hacky that I construct the end tag string manually, but end tags are pretty simple.

I don't object to your proposed change. I just wanted to understand why it was a problem. Have I understood it correctly, or have I missed an aspect of the problem you're describing?

use lol_html::{element, HtmlRewriter, Settings};
use std::{cell::RefCell, error::Error, rc::Rc};

const PAGE: &str = "
<html>
This <a href=\"http://example.com\">link</a> is an example.
</html>
";

struct OutputHandler {
    can_write: bool,
    extra: String,
}
impl OutputHandler {
    fn on(&mut self) {
        self.can_write = true;
    }
    fn off(&mut self) {
        self.can_write = false;
    }
    fn push(&mut self, extra: &str) {
        self.extra.push_str(extra)
    }
}

fn main() -> Result<(), Box<dyn Error>> {
    let output = Rc::new(RefCell::new(OutputHandler {
        can_write: false,
        extra: String::new(),
    }));
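    let element_content_handlers = vec![element!("a", |a| {
        // Start tag seen: open the gate so the sink prints the element.
        output.borrow_mut().on();
        let output = output.clone();
        a.on_end_tag(move |tag| {
            // This handler runs before the end tag reaches the sink, so queue
            // the end tag text manually and then close the gate.
            let mut handler = output.borrow_mut();
            handler.push(&format!("</{}>", tag.name()));
            handler.off();
            Ok(())
        })?;
        Ok(())
    })];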

    let output = output.clone();
    let mut rewriter = HtmlRewriter::new(
        Settings {
            element_content_handlers,
            ..Settings::default()
        },
        |chunk: &[u8]| {
            let mut handler = output.borrow_mut();
            // Flush any manually queued end tag first.
            if !handler.extra.is_empty() {
                print!("{}", handler.extra);
                handler.extra.clear();
            }
            // Print the chunk only while the gate is open.
            if handler.can_write {
                print!("{}", String::from_utf8_lossy(chunk))
            }
        },
    );
    rewriter.write(PAGE.as_ref())?;
    rewriter.end()?;
    Ok(())
}

mitsuhiko · 2021-12-27T18:35:23Z

You're right that it can be somewhat emulated, but it's quite inconvenient. This solution also always inserts a closing tag, even if none existed in the original document. For me the biggest issue was actually that I was trying to maintain a reasonably accurate tag stack to make more meaningful decisions, and having on_end_tag fire "within" the stack level creates a lot of complexity.

However, right now all of this is entirely blocked on #110 anyway. A solution to that might change the situation somewhat.
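To make the tag stack complication concrete, here is a small standalone sketch. It assumes a user-maintained stack of tag names; `StackEvent` and `observe` are hypothetical names for this illustration only and are not part of lol_html. For each closing event it records the path seen by a hook that runs before the element is popped and the path seen by one that runs after.

```rust
// Hypothetical event type for illustration; not a lol_html API.
enum StackEvent<'a> {
    Open(&'a str),
    Close,
}

/// For every Close event, returns (path seen before the pop, path seen after it).
fn observe(events: &[StackEvent<'_>]) -> Vec<(String, String)> {
    let mut stack: Vec<&str> = Vec::new();
    let mut seen = Vec::new();
    for event in events {
        match event {
            StackEvent::Open(name) => stack.push(*name),
            StackEvent::Close => {
                // A hook firing here still sees the closing element on the stack.
                let before = stack.join("/");
                stack.pop();
                // A hook firing here sees the parent level only.
                let after = stack.join("/");
                seen.push((before, after));
            }
        }
    }
    seen
}

fn main() {
    let events = [
        StackEvent::Open("html"),
        StackEvent::Open("a"),
        StackEvent::Close,
        StackEvent::Close,
    ];
    for (before, after) in observe(&events) {
        println!("before pop: {:?}, after pop: {:?}", before, after);
    }
}
```

Code that makes decisions based on the current level has to account for which of the two views it is getting, which is where the extra bookkeeping comes from.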

jongiddy · 2021-12-28T06:51:17Z

Would this be easier if, rather than having the on_end_tag method on the Element, there was a separate end tag handler like there is a separate element text handler? That was my original idea, but the on_end_tag callback was easier to implement.

mitsuhiko · 2021-12-29T19:44:59Z

Potentially. The nice aspect of the current on_end_tag approach is that you can pass state from the start tag to the end tag, but given the need to maintain a stack anyway, that might not be necessary.
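As a rough sketch of that trade-off: if a stack is maintained anyway, per-element state can live in the stack frame, and a separate end handler can read it back when the frame is popped instead of capturing it in a closure. `Frame`, `TagEvent`, and `closed_selected` below are hypothetical names for this illustration and do not correspond to lol_html types.

```rust
// Hypothetical types for illustration; none of these are lol_html APIs.
struct Frame<'a> {
    name: &'a str,
    selected: bool, // state recorded when the start tag was seen
}

enum TagEvent<'a> {
    Start(&'a str),
    End,
}

/// Returns the names of closed elements whose start tag matched `selector`.
fn closed_selected<'a>(events: &[TagEvent<'a>], selector: &str) -> Vec<&'a str> {
    let mut stack: Vec<Frame<'a>> = Vec::new();
    let mut closed = Vec::new();
    for event in events {
        match event {
            // "Start handler": store the decision in the frame.
            TagEvent::Start(name) => stack.push(Frame {
                name: *name,
                selected: *name == selector,
            }),
            // "End handler": recover the state from the popped frame.
            TagEvent::End => {
                if let Some(frame) = stack.pop() {
                    if frame.selected {
                        closed.push(frame.name);
                    }
                }
            }
        }
    }
    closed
}

fn main() {
    let events = [
        TagEvent::Start("p"),
        TagEvent::Start("a"),
        TagEvent::End,
        TagEvent::Start("b"),
        TagEvent::End,
        TagEvent::End,
    ];
    println!("{:?}", closed_selected(&events, "a")); // prints ["a"]
}
```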

mitsuhiko · 2024-12-22T12:18:58Z

I came across this today and decided to close down #109 because too much has changed in the meantime. The problem still exists and #110 is still open.

This was referenced Nov 28, 2021

Emit events for closing elements #85

Closed

Added on_after_end_tag handler #109

Closed
