Releases: vmenger/deduce

v3.0.3

16 Jul 14:18
1dc4b30

3.0.3 (2024-07-16)

Added

  • A cache_path option, to define the path for saving/loading the lookup structure cache. You should use this if your install directory is not writable.
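
As a minimal sketch of how a writable location for cache_path could be chosen: the helper name pick_cache_dir and the fallback directory name are hypothetical, and only the cache_path option itself comes from these notes.

```python
import os
import tempfile
from pathlib import Path


def pick_cache_dir(preferred: Path) -> Path:
    """Return `preferred` if it is a writable directory, else a temp-dir fallback."""
    if preferred.is_dir() and os.access(preferred, os.W_OK):
        return preferred
    # Hypothetical fallback location; any user-writable directory would do.
    fallback = Path(tempfile.gettempdir()) / "deduce_cache"
    fallback.mkdir(parents=True, exist_ok=True)
    return fallback


# A path that does not exist stands in for a read-only install directory.
cache_dir = pick_cache_dir(Path("/nonexistent/install/dir"))
# The result could then be handed to the cache_path option (assumed call shape):
# Deduce(cache_path=cache_dir)
```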

Removed

  • the config_file keyword, now replaced by config which accepts both filenames and dicts
  • old lookup list names, e.g. prefixes now replaced by prefix
  • annotator types custom, regexp, token_pattern, dd_token_pattern and annotation_context, all replaced by setting the class directly as annotator_type
  • everything in deduce.pattern, patient patterns now replaced by PatientNameAnnotator

v3.0.2

15 Feb 12:46
8d09277

3.0.2 (2024-02-15)

Changed

  • recognize 4+ spaces as a token, blocking annotations
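
To illustrate the idea, the sketch below treats a run of four or more spaces as a token of its own, so that two words separated by such a gap are no longer direct neighbours and a multi-word pattern cannot span the gap. The regular expression and the function name are illustrative assumptions, not the library's actual tokenizer.

```python
import re

# A run of 4+ spaces is one token; otherwise match words and single
# punctuation marks. Shorter whitespace runs are skipped entirely.
TOKEN_RE = re.compile(r" {4,}|\w+|[^\w\s]")


def tokenize(text: str) -> list:
    """Split text into tokens, keeping long space runs as separate tokens."""
    return TOKEN_RE.findall(text)


wide_gap = tokenize("Jan" + " " * 4 + "Jansen")  # gap token sits between the words
single_space = tokenize("Jan Jansen")            # words remain adjacent
```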

v3.0.1

20 Dec 10:51
199bedd

3.0.1 (2023-12-20)

Fixed

  • a bug with packaging base_config.json

v3.0.0

20 Dec 10:37
8e0c7fa

3.0.0 (2023-12-20)

Added

  • speed optimizations, ~250%
  • pseudo-annotating eponymous diseases (e.g. Creutzfeldt-Jakob)
  • PatientNameAnnotator, which replaces deduce.pattern
  • a structured way for loading and building lookup structures (lists and tries), including caching
  • pre_match_words for some regexp annotators, speeding up the annotating
  • option to present a user config as dict (using config keyword)
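
The "filename or dict" behaviour of the config keyword follows a common pattern, sketched below. The function name resolve_config and the key some_option are hypothetical, and this is not the library's own loading code.

```python
import json
import tempfile
from pathlib import Path
from typing import Union


def resolve_config(config: Union[str, Path, dict]) -> dict:
    """Return a config dict from either a dict or a path to a JSON file."""
    if isinstance(config, dict):
        return dict(config)  # shallow copy so the caller's dict is not mutated
    with open(config, encoding="utf-8") as f:
        return json.load(f)


# Usage: both call styles yield the same dict.
from_dict = resolve_config({"some_option": 1})  # hypothetical key
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "config.json"
    path.write_text(json.dumps({"some_option": 1}), encoding="utf-8")
    from_file = resolve_config(path)
```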

Changed

  • speedup for TokenPatternAnnotator
  • some internals of ContextPatternAnnotator
  • initials now detected by lookup list, rather than pattern
  • redactor open and close chars from < > to [ ], as the previous chars caused issues in HTML (so deidentified text now shows [PATIENT], [LOCATIE], etc.)
  • names of lookup structures to singular (prefix, rather than prefixes)
  • INSTELLING tag to ZIEKENHUIS and ZORGINSTELLING
  • refactored and simplified annotator loading, specifically the annotator_type config keyword now accepts references to classes (e.g. deduce.annotator.TokenPatternAnnotator)
  • renamed interfix_with_capital annotator to interfix_with_name
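
Text that was deidentified with an older version still carries the < > delimiters. A minimal sketch for rewriting such text to the new [ ] style follows; it assumes tags consist of uppercase letters, optionally followed by a dash and a number, and the function name convert_tags is hypothetical.

```python
import re

# Assumption: a tag is an uppercase word, optionally followed by "-<number>".
OLD_TAG_RE = re.compile(r"<([A-Z_]+(?:-\d+)?)>")


def convert_tags(text: str) -> str:
    """Rewrite <TAG> placeholders to [TAG], leaving other text untouched."""
    return OLD_TAG_RE.sub(r"[\1]", text)


converted = convert_tags("<PATIENT> woont in <LOCATIE-1>")
```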

Deprecated

  • the config_file keyword, now replaced by config which accepts both filenames and dicts
  • old lookup list names, e.g. prefixes now replaced by prefix
  • annotator types 'custom', 'regexp', 'token_pattern', 'dd_token_pattern' and 'annotation_context', all replaced by setting the class directly as annotator_type
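
A user config that still uses a deprecated annotator type could be updated along these lines. This is a sketch under stated assumptions: the config layout (a dict of annotator names to specs with an annotator_type key) and the helper name are hypothetical, and the only mapping entry shown is inferred from the class reference mentioned in these notes. Other deprecated types would need the class that fits your setup.

```python
# Assumed correspondence, inferred from the class reference in the notes;
# extend this mapping yourself for the other deprecated type names.
DEPRECATED_TYPES = {
    "token_pattern": "deduce.annotator.TokenPatternAnnotator",
}


def migrate_annotator_types(annotators: dict, mapping: dict = DEPRECATED_TYPES) -> dict:
    """Return a copy of `annotators` with deprecated annotator_type values replaced."""
    migrated = {}
    for name, spec in annotators.items():
        new_spec = dict(spec)  # copy so the input is left unchanged
        old_type = new_spec.get("annotator_type")
        if old_type in mapping:
            new_spec["annotator_type"] = mapping[old_type]
        migrated[name] = new_spec
    return migrated


# Hypothetical annotator entry in the old style.
old = {"my_annotator": {"annotator_type": "token_pattern", "args": {}}}
new = migrate_annotator_types(old)
```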

Removed

  • automated coverage reporting on coveralls.io
  • options lowercase_lookup, lowercase_neg_lookup for token patterns
  • everything in deduce.pattern, patient patterns now replaced by PatientNameAnnotator
  • utils.any_in_text

Fixed

  • some small additions/removals for specific lookup lists
  • smaller bugs related to overlapping matches

v2.5.0

28 Nov 13:02
e43b062

2.5.0 (2023-11-28)

Added

  • the RegexpPseudoAnnotator component for filtering regexp matches based on preceding/following words
  • a prefix_with_interfix pattern for names, detecting e.g. Dr. van Loon
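
The filtering idea behind RegexpPseudoAnnotator can be sketched in plain Python: keep a regexp match only if the word before it and the word after it are not in a set of "pseudo" words. The function name, parameter names and example words are hypothetical, and this is not the component's actual implementation.

```python
import re


def find_filtered(pattern, text, pre_pseudo=frozenset(), post_pseudo=frozenset()):
    """Return regexp matches whose neighbouring words are not in the pseudo sets."""
    results = []
    for m in re.finditer(pattern, text):
        before = re.findall(r"\w+", text[: m.start()])
        after = re.findall(r"\w+", text[m.end():])
        prev_word = before[-1].lower() if before else None
        next_word = after[0].lower() if after else None
        # Skip matches that are preceded or followed by a pseudo word.
        if prev_word in pre_pseudo or next_word in post_pseudo:
            continue
        results.append(m.group())
    return results


# "3 jaar" is preceded by "sinds" (a duration, not an age), so it is dropped.
ages = find_filtered(
    r"\d+ jaar",
    "patient is 64 jaar, sinds 3 jaar klachten",
    pre_pseudo={"sinds"},
)
```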

Fixed

  • a bug with BsnAnnotator with non-digit characters in regexp
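
For context, a BSN is conventionally validated with the "elfproef" (eleven test). The sketch below shows that standard check together with one possible way of handling non-digit separators. These notes do not say how the bug was fixed, so the separator handling and the function name are assumptions rather than the library's code.

```python
import re


def is_valid_bsn(candidate: str) -> bool:
    """Check a 9-digit BSN with the elfproef, ignoring non-digit separators."""
    digits = re.sub(r"\D", "", candidate)  # drop dots, spaces, dashes, etc.
    if len(digits) != 9:
        return False
    # Weights 9..2 for the first eight digits, -1 for the last one.
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    total = sum(int(d) * w for d, w in zip(digits, weights))
    return total % 11 == 0


plain = is_valid_bsn("111222333")
with_dots = is_valid_bsn("1112.22.333")
```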

Changed

  • the age detection component, with improved logic and pseudo patterns
  • annotations are no longer counted adjacent when separated by a comma
  • streets are prioritized over names when merging overlapping annotations
  • removed some false positives for postal codes ending in gr or ie
  • extended the postbus pattern for xx.xxx format (old notation)
  • some smaller optimizations and exceptions for institution, hospital, placename, residence, medical term, first name, and last name lookup lists

v2.4.4

22 Nov 10:07
29d6b0b

2.4.4 (2023-11-22)

Changed

  • multi-token lookup for first and last names, so multi-token names are now detected
  • some small lookup list additions

v2.4.3

22 Nov 09:03
0a12310

2.4.3 (2023-11-22)

Changed

  • extended list of medical terms

v2.4.2

21 Nov 11:37
a94904f

2.4.2 (2023-11-21)

Changed

  • name lookup list contents, extending names and adding more exceptions

v2.4.1

15 Nov 14:46
2b28022

2.4.1 (2023-11-15)

Added

  • detection of initials Ch., Chr., Ph. and Th.

v2.4.0

15 Nov 13:44
4f74303

2.4.0 (2023-11-15)

Added

  • logic for detecting hospitals, with added whitelist and separate annotator

Changed

  • logic for detecting (non-hospital) institutions, with extended lookup list

Removed

  • the separate Altrecht annotator, now included in the lookup list