Discrepancies in data #19

jamesjhdixon · 2023-01-16T12:56:47Z

jamesjhdixon
Jan 16, 2023
Collaborator

As discussed in the data creation WG meeting on 16/01/2023:

The Kenyan vehicle fleet numbers differ between the official government statistics (Statistical Abstract) and the TrIGGER dataset compiled by the University of Nairobi and GIZ.

According to the Statistical Abstract, the ‘total registrations by year’ in 2015 were 993,090 motorcycles and 847,745 cars. The TrIGGER dataset reports 539,768 motorcycles and 532,406 cars for the same year.
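To put a size on the gap, here is a minimal Python sketch that compares the two sets of figures quoted above. The function name `compare` and the dictionary layout are illustrative assumptions, not part of either dataset.

```python
# Kenya, 2015, as quoted in this post.
statistical_abstract = {"motorcycles": 993_090, "cars": 847_745}
trigger = {"motorcycles": 539_768, "cars": 532_406}


def compare(official, alternative):
    """Per vehicle type: absolute gap and the alternative/official ratio."""
    return {
        vehicle: {
            "gap": official[vehicle] - alternative[vehicle],
            "ratio": alternative[vehicle] / official[vehicle],
        }
        for vehicle in official
    }


gaps = compare(statistical_abstract, trigger)
for vehicle, g in gaps.items():
    print(f"{vehicle}: gap = {g['gap']:,}, TrIGGER / Statistical Abstract = {g['ratio']:.2f}")
```

On these numbers the TrIGGER counts are roughly 54% (motorcycles) and 63% (cars) of the Statistical Abstract counts, so the difference is far too large to be a rounding or vintage issue.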

This case demonstrates why we (TDCI) may want to refrain from making a 'judgement' on the quality of individual datasets. The obvious 'blue tick' version here would be the government dataset, but we suspect that the GIZ/UoN dataset might be more accurate because it takes de-registrations into account. On the other hand, the motorisation rate derived from the GIZ/UoN dataset seems lower than expected. So, it's difficult! As an alternative to making a judgement, we could simply provide a forum (e.g. something like a YouTube comments section) where these kinds of discussions can take place, with the aim of keeping users informed.

khaeru · 2023-01-16T13:06:04Z

khaeru
Jan 16, 2023
Maintainer

One way I could imagine handling this is having not just one set of judgements/criteria, but potentially several.

So the user could choose between (with one value being the default):

Prefer data points from official sources.
Prefer data points that have been validated/corrected by research/scientific groups.
(Other heuristic.)

…and they would then see one or another compiled data set corresponding to their choice; while the others and the pooled data ("messy pile") would also remain available.
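A rough sketch of how such a user-selectable preference could work, in Python. Everything here is an assumption for illustration: the `DataPoint` fields, the `source_type` labels, the preference names, and the choice to label TrIGGER as "research". It is not an existing TDC API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPoint:
    country: str
    year: int
    measure: str
    value: float
    source: str
    source_type: str  # hypothetical label: "official" or "research"


# The "messy pile": every data point from every source, nothing discarded.
POOL = [
    DataPoint("Kenya", 2015, "motorcycles", 993_090, "Statistical Abstract", "official"),
    DataPoint("Kenya", 2015, "cars", 847_745, "Statistical Abstract", "official"),
    DataPoint("Kenya", 2015, "motorcycles", 539_768, "TrIGGER", "research"),
    DataPoint("Kenya", 2015, "cars", 532_406, "TrIGGER", "research"),
]

# Each preference is an ordering of source types, most preferred first.
PREFERENCES = {
    "official_first": ["official", "research"],
    "research_first": ["research", "official"],
}


def compile_dataset(pool, preference="official_first"):
    """Keep one data point per (country, year, measure), chosen by preference."""
    order = PREFERENCES[preference]
    best = {}
    for point in pool:
        key = (point.country, point.year, point.measure)
        current = best.get(key)
        if current is None or order.index(point.source_type) < order.index(current.source_type):
            best[key] = point
    return best


official_view = compile_dataset(POOL)  # default setting
research_view = compile_dataset(POOL, "research_first")
```

Both views have the same keys and differ only in values, and `POOL` is never modified, so the pooled data stays available alongside each compiled set.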

1 reply

khaeru Jan 16, 2023
Maintainer

From a research point of view, this would make it really easy to do a policy-relevant sensitivity analysis: change the setting; get a data set with exactly the same format/coverage but different values; plug it into one's analysis; look at changes in the results.
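As a toy illustration of that workflow, the sketch below runs one analysis over two compiled data sets that share a format but differ in values. The setting names, the `sensitivity` helper, and the choice of "motorcycle share of the fleet" as the analysis are assumptions made for the example; the figures are the 2015 Kenya numbers from the original post.

```python
# One compiled data set per setting: same keys, different values.
COMPILED = {
    "official_first": {"motorcycles": 993_090, "cars": 847_745},
    "research_first": {"motorcycles": 539_768, "cars": 532_406},
}


def motorcycle_share(fleet):
    """Example analysis: motorcycles as a fraction of motorcycles plus cars."""
    return fleet["motorcycles"] / sum(fleet.values())


def sensitivity(analysis, datasets):
    """Run the same analysis once per setting and collect the results."""
    return {setting: analysis(data) for setting, data in datasets.items()}


results = sensitivity(motorcycle_share, COMPILED)
spread = max(results.values()) - min(results.values())

for setting, share in results.items():
    print(f"{setting}: motorcycle share = {share:.3f}")
print(f"spread across settings = {spread:.3f}")
```

Because the analysis function never needs to know which setting produced its input, swapping the setting is the only change required to see how far the result moves.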
