Releases: urchade/GLiNER
v0.2.3
What's Changed
- Distributed Data Parallel mode support by @Ingvarstep in #107
- add conda install instructions by @moritzwilksch in #112
- Add label smoothing by @Ingvarstep in #114
- Refactoring: more features support, GPU-friendly training and inference by @Ingvarstep in #119
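These notes do not say how the label smoothing from #114 is implemented. As a general illustration only, the sketch below shows one common form of label smoothing for binary targets; the function name `smooth_binary_labels` and the `epsilon` parameter are hypothetical and not taken from the GLiNER code base.

```python
def smooth_binary_labels(targets, epsilon=0.1):
    """Pull hard 0/1 targets toward 0.5 by a factor of epsilon.

    Hypothetical helper for illustration; not GLiNER's actual implementation.
    """
    return [t * (1.0 - epsilon) + 0.5 * epsilon for t in targets]


# Hard targets 1.0 and 0.0 become roughly 0.95 and 0.05 with epsilon=0.1.
smoothed = smooth_binary_labels([1.0, 0.0], epsilon=0.1)
print(smoothed)
```

Softened targets like these are typically fed to a binary cross-entropy style loss so that the model is penalized less for not being fully confident.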
New Contributors
- @moritzwilksch made their first contribution in #112
Full Changelog: v0.2.2...v0.2.3
v0.2.2
v0.2.1
What's Changed
- Add total limits on the number of checkpoints to save by @Ingvarstep in #98
- fix out of bounds by @urchade in #99
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
New architecture: Token-level GLiNER
- Computes scores for the start, end, and inside positions of potential entity spans:
    scores_start, scores_end, scores_inside = self.compute_score_eval(x)
- Converts these scores into probabilities and determines the start and end positions where the probabilities exceed a specified threshold:
    start_probs = torch.sigmoid(scores_start)
    end_probs = torch.sigmoid(scores_end)
    inside_probs = torch.sigmoid(scores_inside)
    start_indices = [torch.where(start_probs[i] > threshold) for i in range(len(x["tokens"]))]
    end_indices = [torch.where(end_probs[i] > threshold) for i in range(len(x["tokens"]))]
- Matches start and end indices to create valid spans, ensuring class label consistency and filtering out low-confidence spans based on the inside scores:
    valid_spans = []
    for i, (start, end, inside) in enumerate(zip(start_indices, end_indices, inside_probs)):
        spans = []
        for st, cls_st in zip(*start):
            for ed, cls_ed in zip(*end):
                if ed >= st and cls_st == cls_ed:
                    ins_confidence = inside[st:ed + 1, cls_st]
                    if (ins_confidence < threshold).any():
                        continue
                    spans.append((st, ed, x["id_to_classes"][cls_st + 1], ins_confidence.mean().item()))
        valid_spans.append(spans)
- Uses a greedy search algorithm to finalize the list of entity spans (as in the original GLiNER):
    final_spans = [greedy_search(spans, flat_ner, multi_label=multi_label) for spans in valid_spans]
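The thresholding and span-matching steps above can be tried on toy data without a model. The following is a minimal sketch in plain Python for a single sentence, assuming per-token, per-class logits are already available; `decode_token_level`, the toy logits, and the class names are hypothetical, and the final greedy search step is left out.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_token_level(scores_start, scores_end, scores_inside, id_to_classes, threshold=0.5):
    """Decode candidate spans for one sentence (illustrative, not the library code).

    Each scores_* argument is a [num_tokens][num_classes] list of raw logits.
    Returns (start, end, label, confidence) tuples with inclusive token indices.
    """
    start_probs = [[sigmoid(s) for s in row] for row in scores_start]
    end_probs = [[sigmoid(s) for s in row] for row in scores_end]
    inside_probs = [[sigmoid(s) for s in row] for row in scores_inside]

    # (token index, class index) pairs whose probability exceeds the threshold
    starts = [(t, c) for t, row in enumerate(start_probs) for c, p in enumerate(row) if p > threshold]
    ends = [(t, c) for t, row in enumerate(end_probs) for c, p in enumerate(row) if p > threshold]

    spans = []
    for st, cls_st in starts:
        for ed, cls_ed in ends:
            if ed >= st and cls_st == cls_ed:
                inside = [inside_probs[t][cls_st] for t in range(st, ed + 1)]
                # Reject the span if any token inside it is below the threshold
                if any(p < threshold for p in inside):
                    continue
                # id_to_classes is 1-indexed, as in the snippet above
                spans.append((st, ed, id_to_classes[cls_st + 1], sum(inside) / len(inside)))
    return spans


# Toy input: tokens "Ada Lovelace visited London", classes person / location.
HI, LO = 4.0, -4.0
id_to_classes = {1: "person", 2: "location"}
scores_start = [[HI, LO], [LO, LO], [LO, LO], [LO, HI]]
scores_end = [[LO, LO], [HI, LO], [LO, LO], [LO, HI]]
scores_inside = [[HI, LO], [HI, LO], [LO, LO], [LO, HI]]

spans = decode_token_level(scores_start, scores_end, scores_inside, id_to_classes)
print(spans)
```

With these toy logits the sketch yields one `person` span over tokens 0 to 1 and one `location` span over token 3, each with a confidence equal to the mean inside probability.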
Full Changelog: v0.1.14...v0.2.0
v0.1.14
What's Changed
- fix for SimpleNamespace is not JSON serializable by @mkmohangb in #92
- docs: add EmergentMethods/gliner_medium_news-v2.1 to the README by @robcaulk in #93
- Generalize training by @Ingvarstep in #94
New Contributors
- @mkmohangb made their first contribution in #92
- @Ingvarstep made their first contribution in #94
Full Changelog: v0.1.13...v0.1.14
v0.1.13
What's Changed
- NuNerZero GLiNER based models added by @Serega6678 in #83
- Implementation of trainer class (fp16 training & gradient accumulation) by @MarcusLoppe in #85
- add synthetic_data_generation by @urchade in #86
New Contributors
- @Serega6678 made their first contribution in #83
- @MarcusLoppe made their first contribution in #85
- @urchade made their first contribution in #86
Full Changelog: 0.1.12...v0.1.13
0.1.12
v0.1.9
- Fix batch predict error when text is empty by @tom-ph
Full Changelog: v0.1.8...v0.1.9
v0.1.8
v0.1.7
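- Fixed a subtle bug in evaluator.py that caused a large decrease in performance. The sort call changed from
    span_prob = sorted(spans, key=lambda x: x[-1])
  to
    span_prob = sorted(spans, key=lambda x: -x[-1])
  so that spans are sorted in descending rather than ascending order of their last element.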
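To see why the sort direction matters, consider a greedy pass that keeps a span only if it does not overlap one already kept. The sketch below is a simplified stand-in, not the code in evaluator.py; `greedy_select`, `has_overlap`, and the example spans are hypothetical, and the last tuple element is assumed to be the span score.

```python
def has_overlap(a, b):
    """True if two spans with inclusive (start, end, ...) indices share a token."""
    return not (a[1] < b[0] or b[1] < a[0])


def greedy_select(spans, descending=True):
    """Keep non-overlapping spans, visiting them in score order.

    Each span is (start, end, label, score). Illustrative helper only.
    """
    key = (lambda x: -x[-1]) if descending else (lambda x: x[-1])
    kept = []
    for span in sorted(spans, key=key):
        if all(not has_overlap(span, other) for other in kept):
            kept.append(span)
    return sorted(kept, key=lambda x: x[0])


spans = [
    (0, 1, "person", 0.95),    # strong candidate
    (1, 1, "location", 0.55),  # weak candidate overlapping the first one
    (3, 3, "location", 0.90),
]

fixed = greedy_select(spans, descending=True)   # highest score first
buggy = greedy_select(spans, descending=False)  # lowest score first

print(fixed)
print(buggy)
```

With ascending order the weak `location` span at token 1 is visited first and blocks the stronger `person` span that overlaps it; with descending order the strong span wins.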
Full Changelog: v0.1.6...v0.1.7