Commit
docs: update slur list and metadata README.md (#609)
1 parent 1760d01, commit 65a45be
Showing 1 changed file with 56 additions and 17 deletions.
73 changes: 56 additions & 17 deletions
browser-extension/plugin/scripts/2023-12-21-slur-metadata/README.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,20 +1,59 @@ | ||
## Metadata Annotations for Slur List | ||
## 📋 A Guide to the Slur List | ||
|
||
All the metadata annotations and a few new slur contributions are present in the `slur_list.csv` file. | ||
These are the columns of the `slur_list.csv` file. | ||
The slur list was created in 2021-22 by 30 researchers and activists working in Indian languages, as part of building a robust dataset for the plugin's features. | ||
|
||
In 2023, gender and feminist rights organisations partnered with us to add slur words and contribute to metadata on the slurs by participating in online annotation sessions. We are continuing to expand our slur list in this iteration. | ||
|
||
At present the slur list consists of **630 words** in **4 languages**: Indian-English, Hindi, Tamil and Malayalam. | ||
|
||
The slur list has been populated in the following ways: | ||
|
||
a) Annotation sessions: These are online meetings in which annotators record context on the meaning and everyday usage of slurs in an Excel sheet we share with them. We also encourage participants to discuss the annotations as they work; we find this makes annotating easier and gives the exercise a sense of cohesiveness. | ||
|
||
b) Crowdsourcing: Separately, the plugin has been updated so that users can contribute slurs themselves: | ||
|
||
<img width="531" alt="Screenshot 2024-08-02 at 11 24 05 AM(1)" src="https://github.com/user-attachments/assets/bd299603-7b9d-43ac-82ca-0d5fef27bbce"> | ||
|
||
A user can add slurs they encounter that they believe should have been redacted, and the plugin shows them their own list of slurs as it grows over time. This list is specific to each user, and the slurs they contribute may also be incorporated into our slur list database. | ||
|
||
We are actively looking to work with more partner organisations to sustain this work, so if you're interested in partnering with us, do reach out! | ||
|
||
**🧭 How to navigate the slur list:** | ||
|
||
All the metadata annotations and a few new slur contributions are present in the [data.csv](https://github.com/tattle-made/Uli/blob/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts/2023-12-21-slur-metadata/data.csv#L4) file, under an Open Data Licence. The metadata categories that were annotated correspond to the following columns of the [data.csv](https://github.com/tattle-made/Uli/blob/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts/2023-12-21-slur-metadata/data.csv#L4) file: | ||
|
||
Slur Word | ||
|
||
``` | ||
Slur Word | ||
Annotator ID | ||
Level of Severity | ||
Casual | ||
Appropriated | ||
If, Appropriated, Is it by Community or Others? | ||
What Makes it Problematic? | ||
Category 1 | ||
Category 2 | ||
Category 3 | ||
Category 4 | ||
``` | ||
|
||
The `Slur Word` column contains all the slur words. The `Annotator ID` column contains the ID of the contributor who annotated that slur; this ID is anonymized. All the other columns are metadata fields for the slur; a detailed explanation of them can be found in the `annotations guideline`. | ||
|
||
Level of Severity | ||
|
||
Casual | ||
|
||
Appropriated | ||
|
||
If, Appropriated, Is it by Community or Others? | ||
|
||
What Makes it Problematic? | ||
|
||
Type of slur: | ||
|
||
- Category 1 (ex: gendered) | ||
- Category 2 (ex: sexualised) | ||
- Category 3 (ex: casteist) | ||
- Category 4 (ex: ableist) | ||
|
||
The Slur Word column contains all the slur words. The Annotator ID column contains the ID of the contributor who annotated that slur; this ID is anonymized. All the other columns are metadata fields for the slur; a detailed explanation of them can be found in the [annotation guideline](https://docs.google.com/document/d/18H4TlLFB2GXK054oMj1uXVJ2OCFW08Gi/edit). | ||
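As a rough illustration of how these columns might be read, here is a minimal Python sketch that groups entries by their category labels. It assumes the header row comes first and uses the column names exactly as listed above, which may not match the actual layout of data.csv. The sample rows, the severity values and the helper names `load_rows` and `group_by_category` are made up for this example and are not taken from the repository.

```python
import csv
import io
from collections import defaultdict

# Column names as listed in this README. The rows below are made-up
# placeholders (assumption), not real entries from data.csv.
CATEGORY_COLUMNS = ["Category 1", "Category 2", "Category 3", "Category 4"]

SAMPLE_CSV = '''Slur Word,Annotator ID,Level of Severity,Casual,Appropriated,"If, Appropriated, Is it by Community or Others?",What Makes it Problematic?,Category 1,Category 2,Category 3,Category 4
placeholder_a,A1,High,No,No,,example note,gendered,sexualised,,
placeholder_b,A2,Low,Yes,Yes,Community,example note,casteist,,,
placeholder_c,A1,Medium,No,No,,example note,gendered,,,ableist
'''


def load_rows(handle):
    """Read every row into a dict keyed by the header names."""
    return list(csv.DictReader(handle))


def group_by_category(rows):
    """Map each category label to the words annotated with it."""
    groups = defaultdict(list)
    for row in rows:
        for column in CATEGORY_COLUMNS:
            # Empty cells mean the annotator assigned no label in that slot.
            label = (row.get(column) or "").strip().lower()
            if label:
                groups[label].append(row["Slur Word"])
    return dict(groups)


rows = load_rows(io.StringIO(SAMPLE_CSV))
groups = group_by_category(rows)
print(groups)
```

To try this on the real file, pass `open("data.csv", newline="", encoding="utf-8")` to `load_rows` instead of the in-memory sample, and adjust the column names if the file's header differs.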
|
||
**Which list do I use?** | ||
|
||
The slur lists are available on our GitHub repo [here:](https://github.com/tattle-made/Uli/tree/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts) | ||
|
||
The first version of the slur list from our work in 2021-22 is available [here:](https://github.com/tattle-made/Uli/blob/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts/slur_list_withspace.txt) | ||
|
||
If you'd like to view the slur list with the metadata annotations and new slur contributions, you can take a look [here:](https://github.com/tattle-made/Uli/tree/main/browser-extension/plugin/scripts/2023-12-21-slur-metadata) | ||
|
||
**𖡎 Feedback:** | ||
|
||
We'd love to have your input on the slur list. Drop us a comment in the Discussions section | ||
[here:](https://github.com/tattle-made/Uli/discussions/605) 💬 |