Commit

Merge pull request #40 from dfe-analytical-services/harmonisation-sex
Added guidance on sex harmonisation based on star chamber policies
rmbielby authored Feb 26, 2024
2 parents abdf3a1 + 95fb33a commit 2cc42d8
Showing 7 changed files with 5,586 additions and 17 deletions.
1,009 changes: 1,009 additions & 0 deletions RAP/rap-faq.html

Large diffs are not rendered by default.

874 changes: 874 additions & 0 deletions index.html

Large diffs are not rendered by default.

1,128 changes: 1,128 additions & 0 deletions learning-development/git.html

Large diffs are not rendered by default.

1,530 changes: 1,530 additions & 0 deletions learning-development/r.html

Large diffs are not rendered by default.

938 changes: 938 additions & 0 deletions learning-development/sql.html

Large diffs are not rendered by default.

6 changes: 3 additions & 3 deletions renv.lock
@@ -1,6 +1,6 @@
{
"R": {
-"Version": "4.3.1",
+"Version": "4.3.2",
"Repositories": [
{
"Name": "CRAN",
@@ -808,7 +808,7 @@
},
"vctrs": {
"Package": "vctrs",
-"Version": "0.6.3",
+"Version": "0.6.5",
"Source": "Repository",
"Repository": "CRAN",
"Requirements": [
@@ -818,7 +818,7 @@
"lifecycle",
"rlang"
],
-"Hash": "d0ef2856b83dc33ea6e255caf6229ee2"
+"Hash": "c03fa420630029418f7e6da3667aac4a"
},
"viridisLite": {
"Package": "viridisLite",
118 changes: 104 additions & 14 deletions statistics-production/ud.qmd
@@ -191,29 +191,29 @@ A single filter column should contain all the possible filter values for a singl

In general, analysts should use a separate column for each filter in accordance with tidy data principles. This is especially the case where data are presented for combinations of filters (i.e. cross tabulations). User testing has shown this to be the most effective way to structure data for the best user experience with the table tool.

-| ... | FSM | Sex | pupil_count |
-|-----|-----------|-----------|---------------|
-| ... | Total | Total | 1209 |
-| ... | Total | Female | 567 |
-| ... | Total | Male | 642 |
-| ... | FSM | Total | 406 |
-| ... | FSM | Female | 203 |
-| ... | FSM | Male | 203 |
-| ... | non-FSM | Total | 803 |
-| ... | non-FSM | Female | 364 |
-| ... | non-FSM | Male | 439 |
+| ... | fsm_status | sex | pupil_count |
+|-----|------------|-----------|---------------|
+| ... | Total | Total | 1209 |
+| ... | Total | Female | 567 |
+| ... | Total | Male | 642 |
+| ... | FSM | Total | 406 |
+| ... | FSM | Female | 203 |
+| ... | FSM | Male | 203 |
+| ... | non-FSM | Total | 803 |
+| ... | non-FSM | Female | 364 |
+| ... | non-FSM | Male | 439 |

Where data is broken down across combinations of different filters, teams should aim to "complete the matrix". This means that, for the given filters, all possible filter combinations should have a corresponding data entry. In this way, teams can prevent users getting the ambiguous "No data" result from EES and can instead provide more explicit codes for any missing data (e.g. not applicable, not available, suppressed, etc.).
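
As a rough illustration of "completing the matrix", the sketch below fills in any missing filter combinations using only the Python standard library. The function name `complete_matrix` and the placeholder code `"x"` are assumptions made for this example; teams would substitute whichever missing-data code actually applies to each gap.

```python
from itertools import product


def complete_matrix(rows, filters, indicator, placeholder):
    """Return rows covering every combination of the observed filter values."""
    # Distinct values seen for each filter, in first-seen order.
    values = {f: list(dict.fromkeys(r[f] for r in rows)) for f in filters}
    # Index the existing rows by their filter combination.
    existing = {tuple(r[f] for f in filters): r for r in rows}
    completed = []
    for combo in product(*(values[f] for f in filters)):
        if combo in existing:
            completed.append(existing[combo])
        else:
            # Missing combination: add an explicit entry instead of leaving a gap.
            row = dict(zip(filters, combo))
            row[indicator] = placeholder
            completed.append(row)
    return completed


# Illustrative input with four combinations missing (e.g. FSM x Female).
rows = [
    {"fsm_status": "Total", "sex": "Total", "pupil_count": 1209},
    {"fsm_status": "Total", "sex": "Female", "pupil_count": 567},
    {"fsm_status": "Total", "sex": "Male", "pupil_count": 642},
    {"fsm_status": "FSM", "sex": "Total", "pupil_count": 406},
    {"fsm_status": "non-FSM", "sex": "Total", "pupil_count": 803},
]

# "x" is a hypothetical placeholder code chosen for this sketch.
completed = complete_matrix(rows, ["fsm_status", "sex"], "pupil_count", "x")
```

Here the three values of each filter give nine combinations, so the four that were absent from the input each receive a row carrying the placeholder code.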

A possible exception to the above structure is where no filter combinations/cross-tabulations are present in a given data file. For example, this may be the case if a publication requires a highlights-level table that shows a result across breakdowns of sex (Male, Female, etc.) and Free School Meal status (FSM, non-FSM), but not combinations of the two (Female and FSM, Male and FSM, Female and non-FSM, and Male and non-FSM). In this case, analysts may choose to use overarching collated filter columns named breakdown_topic and breakdown, as follows:

-| ... | breakdown_topic | breakdown | pupil_count |
+| ... | breakdown_topic | breakdown | pupil_count |
|-----|-----------------|-----------|---------------|
| ... | Total | Total | 1209 |
| ... | Sex | Female | 567 |
| ... | Sex | Male | 642 |
-| ... | FSM | FSM | 406 |
-| ... | FSM | non-FSM | 803 |
+| ... | FSM status | FSM | 406 |
+| ... | FSM status | non-FSM | 803 |

With the above structure, breakdown_topic should be added as the filter_grouping_column for breakdown in the associated metadata file (and therefore should not have its own row in the metadata).
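
The collated structure can be sketched as a small flattening step. This is a minimal example using only the Python standard library; the function name `collate_breakdowns` and the nested-dictionary input shape are assumptions made for illustration, not part of any EES tooling.

```python
def collate_breakdowns(total, breakdowns, indicator="pupil_count"):
    """Flatten per-topic counts into breakdown_topic / breakdown rows."""
    # The overall total is given its own row, as in the table above.
    rows = [{"breakdown_topic": "Total", "breakdown": "Total", indicator: total}]
    for topic, counts in breakdowns.items():
        for value, count in counts.items():
            rows.append(
                {"breakdown_topic": topic, "breakdown": value, indicator: count}
            )
    return rows


# Hypothetical input: one dictionary of counts per breakdown topic.
rows = collate_breakdowns(
    1209,
    {
        "Sex": {"Female": 567, "Male": 642},
        "FSM status": {"FSM": 406, "non-FSM": 803},
    },
)
```

Because each topic is listed independently, no cross-tabulated rows (such as Female and FSM) are produced, which is the situation this structure is intended for.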

@@ -817,6 +817,96 @@ for dashboards and other secondary statistics services, whilst also helping with

---

## Sex and gender

The Department's policy on collecting sex and gender data is that statistics
should preferentially be presented for sex rather than gender. These terms, as
used in statistics publications, are defined in the following sections, and
publication teams should be careful not to mislabel sex categories as gender or
vice versa.

---

### Sex

---

Sex, as reported in DfE statistics, is defined as follows:

> a value which identifies the sex of a person as recognised in law,
i.e. the sex as recorded on a birth certificate (or on a gender recognition
certificate).

When presented in a filter, this should take the column header **sex** and
standard entries **Male**, **Female** and **Unknown**. This can also be included as
part of the standard **breakdown_topic** / **breakdown** filters, with the relevant
entry under **breakdown_topic** being **Sex** and as above, the entries under
**breakdown** being **Male**, **Female** and **Unknown**.

**Unknown** is likely to be used rarely. It is appropriate at data
entry when the sex of the subject has not been recorded, is not known, or has
not been registered (for example, in data relating to unborn children in the Children in Need data).

The above is in line with current DfE data collection guidelines on sex.

In an EES data file, these might look something like:

| ... | sex | pupil_count |
|-----|------------|-------------|
| ... | Total | 12 |
| ... | Female | 7 |
| ... | Male | 4 |
| ... | Unknown | 1 |

Or...

| ... | breakdown_topic | breakdown  | pupil_count |
|-----|-----------------|------------|------------|
| ... | Sex | Total | 12 |
| ... | Sex | Female | 7 |
| ... | Sex | Male | 4 |
| ... | Sex | Unknown | 1 |
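
One way a team might check a data file against these standard entries before upload is sketched below, using only the Python standard library. The function name and the decision to treat **Total** as an allowed entry (following the example tables above) are assumptions made for this illustration.

```python
# Standard entries from the guidance, plus "Total" as used in the example tables
# (allowing "Total" here is an assumption of this sketch).
STANDARD_SEX_ENTRIES = {"Total", "Female", "Male", "Unknown"}


def non_standard_sex_entries(rows, column="sex"):
    """Return the sorted entries in `column` that are not standard sex entries."""
    return sorted({row[column] for row in rows} - STANDARD_SEX_ENTRIES)


# Hypothetical data containing two non-standard labels.
rows = [
    {"sex": "Total", "pupil_count": 12},
    {"sex": "female", "pupil_count": 7},
    {"sex": "Boys", "pupil_count": 4},
    {"sex": "Unknown", "pupil_count": 1},
]

problems = non_standard_sex_entries(rows)
```

The comparison is case-sensitive, so an entry such as `female` is flagged alongside labels like `Boys`. For the collated structure, the same check could be pointed at the `breakdown` column for rows where `breakdown_topic` is `Sex`.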


---

### Gender

---

Gender identity is [defined by the ONS](https://www.ons.gov.uk/methodology/classificationsandstandards/measuringequality/genderidentity)
as follows:

> Gender identity is a personal internal perception of oneself and, as such, the
gender category with which a person identifies may not match the sex they were
registered at birth.

The Department is not collecting, and does not plan to collect, data on gender identity. However, the ambiguous use of the word "gender" in previous data collections may have affected the data collected. Based on current policy on how the DfE collects data, teams are not expected to be in a position to publish statistics containing the gender category once the new terminology has taken effect in all collections.

---

### Time-series containing historical data collected as "gender"

---

The GSS have updated their guidance on how sex and gender are defined as
terms in statistical publications. A number of publications within the DfE will
contain unclear data under that updated guidance.

As such, statistics producers should look to re-evaluate their categorisations
against the current GSS definitions and update categories in future statistics
products accordingly. This should include time-series published in any
statistics releases going forward.

Where teams perform a re-categorisation of what has previously been labelled as
gender to sex, this should be clearly stated in the publication
methodology, with explicit definitions of any time periods that have undergone
re-classification. The recommended text in such cases is as follows:

> Historical use of the word "gender" in data collections may have meant that "gender identity" was reported in some cases, as opposed to legal sex. While this is unlikely to have a significant effect on overall figures, it may affect figures in more granular subdivisions. The definitions used in the data collection relating to this publication were revised on [insert date relevant to the collection] and as such, time series that span [insert date range] and contain sex as a category may be affected.
---

## Ethnicity

