1.1.2
-----
* Fix issue with logging while forcing OCR on PDF documents
1.1.1
-----
* Update to Tika 1.23
* Add Docker Hub image and document its use: https://hub.docker.com/r/gradiant/faro
* Fix #32: logging duplicates
* Fix #37: metadata handling when a list is extracted in some fields (dates and pages)
1.1.0
-----
* Add OCR capabilities
* Add option to disable OCR for performance reasons
* Let Tika handle the supported file formats
* Allow basic document classification by adding metadata to the output: document type, author, creation date, file size, etc.
* Rewrite metadata handling
* Move logging and OCR configuration to environment variables to integrate better with Docker
1.0.1
-----
* Add Docker support
* Fix issue with paths containing spaces
* Fix sensitive information patterns and redesign the two-phase approach
* Add more contextual validations
1.0.0
-----
* Initial release.