The `html_text2` documentation says: “Roughly speaking, it converts `<br />` to `"\n"`”. However, it appears to replace line break elements with newlines only within block-level elements. Line break elements that are children of inline elements (text-level semantics) such as `span` or `em` are not replaced by newlines in the output.
The document in the following example is valid HTML markup according to the W3C validation service, but the `html_text2` output does not reproduce how the text looks in a browser.
html <- '<!DOCTYPE html><html lang = "en"><head><meta charset="utf-8"><title>test</title></head><body><span>line 1<br>line 2</span></body></html>'
testthat::test_that("br to newline within inline elements", {
  testthat::expect_equal(rvest::html_text2(rvest::read_html(html)),
                         "line 1\nline 2")
})
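To make the block-level versus inline contrast easier to check, here is a minimal sketch that runs the same text through `html_text2` twice, once with the `<br>` inside a `<p>` and once inside a `<span>`. The variable names (`block_html`, `inline_html`, `block_text`, `inline_text`) and the `<p>` variant are illustrative assumptions and not part of the original report. No output is asserted here, since the result may depend on the installed rvest version.

```r
library(rvest)

# Hypothetical test documents: identical except for the element wrapping the <br>.
block_html <- paste0(
  '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">',
  '<title>test</title></head><body><p>line 1<br>line 2</p></body></html>'
)
inline_html <- paste0(
  '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">',
  '<title>test</title></head><body><span>line 1<br>line 2</span></body></html>'
)

block_text  <- html_text2(read_html(block_html))
inline_text <- html_text2(read_html(inline_html))

# print() shows escape sequences, so any "\n" in the result is visible.
print(block_text)
print(inline_text)

# TRUE only if both variants produce the same text.
print(identical(block_text, inline_text))
```

If the two printed strings differ, the difference isolates the handling of `<br>` inside inline elements described in this report.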