xml filter xpath silent failure #10

Open
jordansissel opened this issue May 16, 2015 · 2 comments

@jordansissel

(originally posted in elastic/logstash#1688 by @SleeperSmith)

The offending line is at line 93:
begin
  doc = Nokogiri::XML(value)
rescue => e
  event.tag("_xmlparsefailure")
  @logger.warn("Trouble parsing xml", :source => @source, :value => value,
               :exception => e, :backtrace => e.backtrace)
  return
end

When Nokogiri fails to parse, it does not raise; instead it records the errors in the document's "errors" property. So the rescue block above never runs, and you need to check doc.errors to detect parse failures.

This is especially problematic in situations where Nokogiri fails to parse but XmlSimple succeeds. Logstash then emits the event with the structure expanded, but none of the xpath expressions match. (This is the exact problem I encountered.) The offending character in my case was etc with error message "#".

P.S. I don't do Ruby, so I can't really do a bug fix and pull request.

@rafaltrojniak

I believe I bumped into this issue (it was quite frustrating).
While working on https://discuss.elastic.co/t/xml-filter-help-required/1387 I tried this:
Event:

{"format":"xml_xpath","message":"<stats><stats xmlns='jcs:stats:jsm'><current-online-user-count>1730</current-online-user-count><login-rate>0</login-rate><successful_logins>93645</successful_logins><failed_logins>84583</failed_logins><uptime>1900999</uptime></stats>\n<stats xmlns='jcs:stats:delivery'><total-message-packets>5428196</total-message-packets><total-presence-packets>288328380</total-presence-packets><total-iq-packets>4977074</total-iq-packets><messages-in-last-time-slice>0</messages-in-last-time-slice><average-message-size>0</average-message-size></stats></stats>"}

Filters:

filter{
  if [format] == "xml_xpath" {
     xml {
          source => "message"
          target => "message_parsed"
          add_tag => ["xml_parsed"]
          xpath => [
            "/stats/stats/failed_logins/text()", "x_failed_logins"
            ]
     }
  }
}

Result: no error, and no x_failed_logins entry.
When I removed the xmlns=.... params, x_failed_logins appeared.

I was able to test it here:
https://github.com/rafaltrojniak/logstash_rules/tree/xml_xpath

Please see the example inputs/outputs here. The first one (without xmlns) works; the second one (with xmlns) does not:
https://github.com/rafaltrojniak/logstash_rules/blob/xml_xpath/doc.md#example-sources

@wiibaa

wiibaa commented May 25, 2016

@rafaltrojniak your issue is different. Because the inner elements are in a namespace, you need to configure the filter to either

  1. remove all namespaces prior to executing the xpath =>
xml {
  source => "xmldata"
  target => "data"
  xpath => [ "/stats/stats/failed_logins/text()", "x_failed_logins" ]
  remove_namespaces => true
}
  2. register your namespace and use it in your xpath expression =>
xml {
  source => "xmldata"
  target => "data"
  namespaces => { "a" => "jcs:stats:jsm"}
  xpath => [ "/stats/a:stats/a:failed_logins/text()", "x_failed_logins" ]
}
