docs: update slur list and metadata README.md (#609)
kaustubhavarma authored Aug 7, 2024
1 parent 1760d01 commit 65a45be
Showing 1 changed file with 56 additions and 17 deletions.
73 changes: 56 additions & 17 deletions browser-extension/plugin/scripts/2023-12-21-slur-metadata/README.md
@@ -1,20 +1,59 @@
## Metadata Annotations for Slur List
## 📋 A Guide to the Slur List

All the metadata annotations and a few new slur contributions are present in the `slur_list.csv` file.
These are the columns of the `slur_list.csv` file.
The slur list was created in 2021-22 by 30 researchers and activists working in Indian languages, as part of building a robust dataset for the plugin's features.

In 2023, gender and feminist rights organisations partnered with us to add slur words and contribute to metadata on the slurs by participating in online annotation sessions. We are continuing to expand our slur list in this iteration.

At present the slur list consists of **630 words** in **4 languages**: Indian-English, Hindi, Tamil and Malayalam.

The slur list has been populated in the following ways:

a) Annotation sessions: These are online meetings in which annotators provide context on the meaning and everyday usage of slurs in an Excel sheet we share with them. We also encourage participants to discuss aspects of the annotations as they work; we find this makes annotation easier and gives the exercise a sense of cohesiveness.

b) Crowdsourcing: Separately, the plugin has been updated to allow users to contribute slurs themselves:

<img width="531" alt="Screenshot 2024-08-02 at 11 24 05 AM(1)" src="https://github.com/user-attachments/assets/bd299603-7b9d-43ac-82ca-0d5fef27bbce">

Users can add slurs they encounter that they believe should have been redacted, and the plugin shows them their own list of slurs as it grows over time. This list is specific to each user, and the slurs they contribute may be incorporated into our slur list database as well.

We are actively looking to work with more partner organisations to sustain this work, so if you're interested in partnering with us, do reach out!

**🧭 How to navigate the slur list:**

All the metadata annotations and a few new slur contributions are present in the [data.csv](https://github.com/tattle-made/Uli/blob/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts/2023-12-21-slur-metadata/data.csv#L4) file, under an Open Data Licence. The metadata is annotated under the following columns of the [data.csv](https://github.com/tattle-made/Uli/blob/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts/2023-12-21-slur-metadata/data.csv#L4) file:

Slur Word

```
Slur Word
Annotator ID
Level of Severity
Casual
Appropriated
If, Appropriated, Is it by Community or Others?
What Makes it Problematic?
Category 1
Category 2
Category 3
Category 4
```

The `Slur Word` column contains all the slur words. The `Annotator ID` column contains the ID of the contributor who annotated that slur; this ID is anonymized. All the other columns are metadata fields for the slur; a detailed explanation of them can be found in the `annotations guideline`.

Level of Severity

Casual

Appropriated

If, Appropriated, Is it by Community or Others?

What Makes it Problematic?

Type of slur:

- Category 1 (ex: gendered)
- Category 2 (ex: sexualised)
- Category 3 (ex: casteist)
- Category 4 (ex: ableist)

The Slur Word column contains all the slur words. The Annotator ID column contains the ID of the contributor who annotated that slur; this ID is anonymized. All the other columns are metadata fields for the slur; a detailed explanation of them can be found in the [annotation guideline](https://docs.google.com/document/d/18H4TlLFB2GXK054oMj1uXVJ2OCFW08Gi/edit).
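
As a rough illustration of how these columns might be read programmatically, here is a minimal Python sketch. It assumes the CSV header row uses exactly the column names listed above; the actual layout of `data.csv` may differ (for example, it may have extra leading rows). The helper names (`load_rows`, `rows_in_category`, `build_sample`) and all sample values (`<word-a>`, `A1`, `Medium`, and so on) are hypothetical placeholders, not taken from the real file.

```python
import csv
import io

# Column names as listed in this README. The header containing commas
# has to be quoted in CSV; csv.writer does that automatically.
COLUMNS = [
    "Slur Word",
    "Annotator ID",
    "Level of Severity",
    "Casual",
    "Appropriated",
    "If, Appropriated, Is it by Community or Others?",
    "What Makes it Problematic?",
    "Category 1",
    "Category 2",
    "Category 3",
    "Category 4",
]

CATEGORY_COLUMNS = ["Category 1", "Category 2", "Category 3", "Category 4"]


def load_rows(handle):
    """Read annotation rows from an open CSV handle into dicts keyed by column name."""
    return list(csv.DictReader(handle))


def rows_in_category(rows, label):
    """Return rows where any of the four category columns equals `label` (case-insensitive)."""
    wanted = label.strip().lower()
    return [
        row
        for row in rows
        if any((row.get(col) or "").strip().lower() == wanted for col in CATEGORY_COLUMNS)
    ]


def build_sample():
    """Build a tiny in-memory CSV with placeholder values (assumed, not real data)."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(COLUMNS)
    writer.writerow(["<word-a>", "A1", "Medium", "Yes", "No", "", "<reason>",
                     "gendered", "sexualised", "", ""])
    writer.writerow(["<word-b>", "A2", "High", "No", "No", "", "<reason>",
                     "casteist", "", "", ""])
    buf.seek(0)
    return buf


sample_rows = load_rows(build_sample())
gendered = rows_in_category(sample_rows, "gendered")
print([row["Slur Word"] for row in gendered])  # prints ['<word-a>']
```

To try this against the real file, one could replace `build_sample()` with `open("data.csv", newline="", encoding="utf-8")`, after checking that the file's header row matches the assumed column names.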

**Which list do I use?**

The slur lists are available on our GitHub repo [here:](https://github.com/tattle-made/Uli/tree/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts)

The first version of the slur list from our work in 2021-22 is available [here:](https://github.com/tattle-made/Uli/blob/1760d01660dc5e7c20453edbe580e9315382c691/browser-extension/plugin/scripts/slur_list_withspace.txt)

If you'd like to view the slur list with the metadata annotations and new slur contributions, you can take a look [here:](https://github.com/tattle-made/Uli/tree/main/browser-extension/plugin/scripts/2023-12-21-slur-metadata)

**𖡎 Feedback:**

We'd love to have your input on the slur list. Drop us a comment in the Discussions section
[here:](https://github.com/tattle-made/Uli/discussions/605) 💬
