Groovy filters

This repository contains reusable Groovy filters that can be added to a Funnelback collection to extend its document filtering.

Crawl filters

Crawl filters can be added to the main filter chain (filter.classes) and operate on whole documents as they are filtered during the gather phase.

See: Developing custom filters

Included crawl filters

  • CA extra filters: Additional content filters for use with the content auditor.

Jsoup filters

Jsoup filters can be added to the Jsoup filter chain (filter.jsoup.classes).

Jsoup filters are used to transform HTML documents by operating on a Jsoup object representing the HTML structure.

See: Jsoup filters

Included Jsoup filters

  • Metadata delimiters: Replace delimiters in specified metadata fields.
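
To illustrate the kind of transformation a Jsoup filter performs, here is a minimal standalone sketch of delimiter replacement using only the Jsoup library. It is not the code of the Metadata delimiters filter in this repository: the helper name `replaceDelimiters`, its parameters, and the sample HTML are hypothetical, and the Funnelback filter interface that a real filter would implement is deliberately left out.

```groovy
@Grab('org.jsoup:jsoup:1.15.3')
import org.jsoup.Jsoup
import org.jsoup.nodes.Document

// Hypothetical helper (not part of this repository): rewrites the delimiter
// used in the content attribute of the named <meta> fields.
Document replaceDelimiters(Document doc, List<String> fields, String from, String to) {
    fields.each { field ->
        // Select every <meta name="field"> element in the document
        doc.select('meta[name=' + field + ']').each { meta ->
            // Literal (non-regex) replacement of the old delimiter
            meta.attr('content', meta.attr('content').replace(from, to))
        }
    }
    return doc
}

// Sample input, assumed for illustration only
def html = '<html><head><meta name="keywords" content="search, filters, groovy"></head><body></body></html>'
def doc = replaceDelimiters(Jsoup.parse(html), ['keywords'], ', ', '|')

println doc.select('meta[name=keywords]').attr('content')
```

In an actual Jsoup filter the `Document` would be supplied by Funnelback's Jsoup filter chain rather than parsed from a string, and the field names and delimiters would presumably come from collection configuration.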