boilerpipe

Boilerplate Removal and Fulltext Extraction from HTML pages

NOTE: This is a work-in-progress import from Google Code. The latest stable version of boilerpipe is available at https://code.google.com/p/boilerpipe.