Refactoring the iText2KG code

Latest
@lairgiyassir lairgiyassir released this 09 Oct 13:45
· 1 commit to main since this release

The entire iText2KG codebase has been refactored around data models that describe an Entity, a Relationship, and a Knowledge Graph.

  • Each entity is embedded using both its name and label to avoid merging concepts with similar names but different labels, such as Python: Language and Python: Snake.
  • The weights for the entity name embedding and the entity label embedding are configurable, with defaults of 0.6 for the entity name and 0.4 for the entity label.
  • A max_tries parameter has been added to the iText2KG.build_graph method for entity and relation extraction, to mitigate hallucinations when structuring the output. A max_tries_isolated_entities parameter has been added to the same method to mitigate hallucinations when processing isolated entities.
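
To illustrate why weighting the name and label embeddings helps keep Python: Language and Python: Snake apart, here is a minimal sketch. It is not the iText2KG implementation: the helper names (`l2_normalize`, `combine_embeddings`, `cosine_similarity`), the normalization step, the weighted-sum combination, and the toy 3-dimensional vectors are all assumptions made for illustration. Only the default weights (0.6 for the name, 0.4 for the label) come from the release notes.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def combine_embeddings(name_vec, label_vec, name_weight=0.6, label_weight=0.4):
    """Hypothetical helper: weighted sum of unit-normalized name and label vectors."""
    if len(name_vec) != len(label_vec):
        raise ValueError("name and label embeddings must have the same dimension")
    name_unit = l2_normalize(name_vec)
    label_unit = l2_normalize(label_vec)
    return [name_weight * n + label_weight * l for n, l in zip(name_unit, label_unit)]


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumed, not model output).
python_name = [1.0, 0.0, 0.0]      # embedding of the name "Python"
language_label = [0.0, 1.0, 0.0]   # embedding of the label "Language"
snake_label = [0.0, 0.0, 1.0]      # embedding of the label "Snake"

python_language = combine_embeddings(python_name, language_label)
python_snake = combine_embeddings(python_name, snake_label)

# Comparing names alone, the two entities are indistinguishable.
name_only_similarity = cosine_similarity(python_name, python_name)

# With the label mixed in, the similarity drops below 1.
combined_similarity = cosine_similarity(python_language, python_snake)

print(f"name only: {name_only_similarity:.3f}")
print(f"name + label: {combined_similarity:.3f}")
```

In this toy setup the two entities share an identical name vector, so a name-only comparison cannot separate them, while the combined vectors are less similar. Raising the label weight would push them further apart, at the cost of making same-label entities with different names look more alike.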