"<" character in regexp body breaks parsing of XML #414
Comments
Ouch! That is an unforeseen effect of the quick fixes to tackle invalid-xml detection. Thanks for the report. |
I think you need to encode XML entities here.
|
I think this is not a bug. |
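For illustration, the entity encoding suggested above can be produced with Python's standard library. This is only a sketch of the idea: `as_xml_attribute` is a hypothetical helper name, and the regular expression is the one from the original report.

```python
from xml.sax.saxutils import quoteattr


def as_xml_attribute(name, value):
    """Render name="value" with &, < and > in the value replaced by XML entities.

    Hypothetical helper for illustration; quoteattr() also adds the quotes.
    """
    return f"{name}={quoteattr(value)}"


# The regular expression from the report, encoded for use in a scenario file.
print(as_xml_attribute("regexp", " *<(sip:.*)>"))
# prints: regexp=" *&lt;(sip:.*)&gt;"
```

The XML parser decodes the entities again when it reads the attribute, so the regular expression itself does not change.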
Hm. Thanks for clarifying @sergey-safarov. You are right.
So indeed. If I amend some docs/FAQ then that should resolve this issue. |
Yeah, another case of "but it has been working that way ever since the stone age!". Although I know that I have to escape them in HTML, I didn't give a thought to this being potentially related, given that it's between quotes and given that in SIP context the "<" is used quite often. So I think a fat red WARNING in the part of the manual which explains the use of
Off topic, when talking about the manual: I think the remark that an IPv6 extension of cygwin from win6.jp is required for successful compilation is obsolete, given that it happily compiles without it (and that the extension in question went missing), most likely because the extension has become an intrinsic part of cygwin in the meantime. |
As a suggestion, we could call an XML validator before parsing XML files.
If the provided XML file is broken, the user will then get a reference to the error. As an option, we could use the same XML validation library that xmllint uses. |
That sounds great, as it will pinpoint the actual error rather than giving a confusing output. |
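As a rough sketch of what such a pre-check could look like from the user's side, the snippet below tests whether a scenario is well-formed and reports where the first error is. It uses Python's bundled expat-based parser rather than the library behind xmllint, and `check_well_formed` is a hypothetical name, so treat it as an illustration of the idea and not as what SIPp does.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text is well-formed XML, else (line, column, message).

    Hypothetical helper: it checks well-formedness only, not validity
    against a DTD.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based, column is 0-based
        return line, column, str(err)
    return None


broken = '<action>\n<ereg regexp=" *<(sip:.*)>" search_in="hdr"/>\n</action>'
fixed = '<action>\n<ereg regexp=" *&lt;(sip:.*)&gt;" search_in="hdr"/>\n</action>'

print(check_well_formed(broken))  # a tuple pointing at line 2
print(check_well_formed(fixed))   # None
```

Running a check like this over a directory of scenario files before upgrading would list the files that need their attribute values escaped.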
Good day, we also encountered this problem recently and it was a little bit unexpected for us. My personal point of view is that, whether this is a bug or not, it would be very nice of you to mention it in the release notes with the other breaking changes, as this worked in a previous version. |
A reasonable request 👍 |
@wdoekes can you please point me to the file/line for this change in 3.6.0 (or the PR for this fix)? I would like to revert that part back to 3.5. We have over 200 broken xmls which were working fine before this fix. I know that wasn't the right way to write the xmls, but I really don't want to spend time changing each xml one by one. Though I would start creating new ones with this in mind. |
I don't expect you to rewrite 200 files manually, no. But perhaps a one-liner would help.
$ cat bad.xml
<!-- blah -->
<this is bad="regex <.*>"/><this is not/><but this="<is>"></but>
<blah/>
<and this="<is>"></and>
And:
$ python3 -c 'import html,re,sys;r=re.compile('\''"([^"]*)"'\'');print(r.sub((lambda x:html.escape(x.group(0),quote=False)),sys.stdin.read()),end="")' < bad.xml
<!-- blah -->
<this is bad="regex &lt;.*&gt;"/><this is not/><but this="&lt;is&gt;"></but>
<blah/>
<and this="&lt;is&gt;"></and> |
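The one-liner above can be unpacked into a short script that is easier to adapt. The sketch below makes a few assumptions of its own: it uses `html.escape` with `quote=False`, it skips comments and CDATA sections (which in SIPp scenarios usually hold the raw SIP message), and it assumes the values are not escaped yet, so running it twice would turn `&lt;` into `&amp;lt;`. The function name `escape_quoted_values` is hypothetical.

```python
import html
import re
import sys

# Comments and CDATA sections are matched first so they pass through
# untouched; only the third alternative, a double-quoted string, is rewritten.
_TOKEN = re.compile(r'<!--.*?-->|<!\[CDATA\[.*?\]\]>|"[^"]*"', re.DOTALL)


def escape_quoted_values(text):
    """Escape &, < and > inside double-quoted strings of an XML-like text."""

    def fix(match):
        token = match.group(0)
        if token.startswith('"'):
            # quote=False keeps the surrounding double quotes intact.
            return html.escape(token, quote=False)
        return token

    return _TOKEN.sub(fix, text)


if __name__ == "__main__":
    # Usage: python3 fix_scenario.py < bad.xml > good.xml
    sys.stdout.write(escape_quoted_values(sys.stdin.read()))
```

Like the one-liner, this does not handle single-quoted attributes and would also rewrite double-quoted text that appears between tags, so it is worth diffing the result before replacing the original files.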
I think the commits got this covered. Leaving it pinned for now. |
Hello,
when trying to isolate the URI alone (i.e. no display name), I used to use
regexp="<(.*)>"
on the To:, From:, Contact: etc. headers. With 3.6 stable, the opening < is treated as a new tag, so when the closing "/>" of the "<ereg" tag (or the ">" if trying that way) is reached, I get an error like "</(.*)> was expected".
Examples:
<action>
<ereg regexp=" *<(sip:.*)>" search_in="hdr" header="Contact:" assign_to="blackhole,call_contact" />
</action>
yields
Unexpected </action> (expected </ereg>)
<action>
<ereg regexp=" *<(sip:.*)>" search_in="hdr" header="Contact:" assign_to="blackhole,call_contact" ></ereg>
</action>
yields
Unexpected </ereg> (expected </(sip:.*)>)