
add changelog, improve readmes, code to push model and gitignore
davebulaval committed Nov 22, 2023
1 parent 0d0cc39 commit 39c002d
Showing 5 changed files with 54 additions and 5 deletions.
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
.idea/*

__pycache__/

meaningbert
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,5 @@
## Beta (0.1)

- Initial release of the MeaningBERT weights usable with HuggingFace Model Card and Metrics.

## dev
41 changes: 37 additions & 4 deletions README.md
@@ -1,9 +1,20 @@
---
title: MeaningBERT
emoji: 🦀
colorFrom: purple
colorTo: indigo
sdk: gradio
sdk_version: 4.2.0
app_file: app.py
pinned: false
---

<div align="center">

[![pr welcome](https://img.shields.io/badge/PR-Welcome-%23FF8300.svg?)](https://img.shields.io/badge/PR-Welcome-%23FF8300.svg?)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

[![Dataset Download](https://img.shields.io/badge/Download%20Dataset-blue?style=for-the-badge&logo=download)](https://github.com/GRAAL-Research/csmd)

</div>

@@ -12,7 +23,15 @@
MeaningBERT is an automatic and trainable metric for assessing meaning preservation between sentences. MeaningBERT was
proposed in our
article [MeaningBERT: assessing meaning preservation between sentences](https://www.frontiersin.org/articles/10.3389/frai.2023.1223924/full).
Its goal is to assess meaning preservation between two sentences, with ratings that correlate highly with human
judgments and pass our sanity checks. For more details, refer to our publicly available article.

> This public version of our model uses the best model we trained (our article reports the average performance of
> 10 models), trained for a more extended period (1,000 epochs instead of 250). We later observed that the model can
> further reduce its dev loss and increase performance with longer training.
- [HuggingFace Model Card](https://huggingface.co/davebulaval/MeaningBERT)
- [HuggingFace Metric Card]()

## Sanity Check

@@ -38,13 +57,27 @@ for computer floating-point inaccuracy, we round the ratings to the nearest integer

Our second test evaluates meaning preservation between a source sentence and an unrelated sentence generated by a large
language model. The idea is to verify that the metric finds a meaning preservation rating of 0 when given a completely
irrelevant sentence mainly composed of irrelevant words (also known as word soup). Since this test's expected rating is
0, we check that the metric rating is lower than or equal to a threshold value X ∈ [1, 5]. Again, to account for
computer floating-point inaccuracy, we round the ratings to the nearest integer and do not use a threshold value of 0%.
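
Expressed as code, this check is a threshold assertion on the rounded rating. The sketch below is illustrative only:
`rate_meaning` is a hypothetical stand-in for whichever MeaningBERT interface (model or metric, see the usage section
below) produces the 0-100 rating.

```python
# Illustrative word-soup sanity check. rate_meaning() is a hypothetical
# helper standing in for a MeaningBERT call returning a 0-100 rating.
def passes_word_soup_check(rate_meaning, source, word_soup, threshold=5):
    # The expected rating for a completely unrelated sentence is 0. We round
    # to the nearest integer and compare against a small threshold X in
    # [1, 5] instead of exactly 0 to absorb floating-point inaccuracy.
    rating = round(rate_meaning(source, word_soup))
    return rating <= threshold
```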

## Use MeaningBERT

You can use MeaningBERT as a [model](https://huggingface.co/davebulaval/MeaningBERT) that you can retrain or use for
inference with HuggingFace:

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("davebulaval/MeaningBERT")
model = AutoModelForSequenceClassification.from_pretrained("davebulaval/MeaningBERT")
```
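
Once loaded, scoring a sentence pair is a standard sequence-pair forward pass. This is a minimal sketch, assuming the
checkpoint exposes a single regression logit interpreted as a meaning preservation score on a 0-100 scale; the sentence
pair is only an example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("davebulaval/MeaningBERT")
model = AutoModelForSequenceClassification.from_pretrained("davebulaval/MeaningBERT")
model.eval()

source = "He wanted to make them pay."
simplification = "He wanted to make him pay."

# Encode the two sentences as a single sequence pair.
inputs = tokenizer(source, simplification, return_tensors="pt", truncation=True)
with torch.no_grad():
    # Assumption: a single regression logit on a 0-100 scale.
    score = model(**inputs).logits.squeeze().item()
print(f"Meaning preservation score: {score:.1f}")
```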

Or you can use MeaningBERT as a metric for evaluation (no retraining) with HuggingFace:
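
This is a sketch under two assumptions: that the metric is published on the HuggingFace Hub under a
`davebulaval/meaningbert` identifier loadable with the `evaluate` library, and that `compute()` takes `documents` and
`simplifications` arguments; check the metric card for the exact identifier and signature.

```python
import evaluate

# Assumption: the metric is hosted on the Hub as "davebulaval/meaningbert".
meaning_bert = evaluate.load("davebulaval/meaningbert", module_type="metric")

documents = ["He wanted to make them pay."]
simplifications = ["He wanted to make him pay."]

# Assumption: the compute() arguments are named documents/simplifications.
score = meaning_bert.compute(documents=documents, simplifications=simplifications)
print(score)
```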




------------------
8 changes: 8 additions & 0 deletions push_model_to_hub.py
@@ -0,0 +1,8 @@
import os.path

from transformers import AutoModelForSequenceClassification

# Load the trained MeaningBERT checkpoint from the local datastore and push
# it to the HuggingFace Hub.
model_path = os.path.join("datastore", "V1")
model = AutoModelForSequenceClassification.from_pretrained(model_path)

model.push_to_hub("davebulaval/MeaningBERT")
3 changes: 2 additions & 1 deletion src/README.md
@@ -6,7 +6,8 @@ our [article](https://www.frontiersin.org/articles/10.3389/frai.2023.1223924/full)
## To Reproduce our Article Results

The `src` directory is public to make our results more reproducible. One can reproduce our results by
using the codebase. It was coded in Python 3.11. To reproduce the results, you can
use `python few_shot_training.py --data_augmentation=True`.

Note that this codebase is different from the one used for our article. In our article, we create ten different
train/dev/test splits using a different seed each time, and we also create, on run-time, data augmentation generation
