
Data for Children Proof of Concept
Author(s) -
Lynda Cooper,
Rose Elliot
Publication year - 2020
Publication title -
International Journal of Population Data Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.602
H-Index - 7
ISSN - 2399-4908
DOI - 10.23889/ijpds.v5i5.1511
Subject(s) - census , record linkage , linked data , consistency (knowledge bases) , geography , linkage (software) , demography , computer science , population , sociology , information retrieval , biochemistry , semantic web , chemistry , artificial intelligence , gene
The Data for Children Proof of Concept Build was developed as a collaborative project between the Children’s Commissioner’s Office, the Office for National Statistics (ONS) and the Administrative Data Research Partnership (ADRP) to address the challenge of identifying vulnerable children in England and Wales and related gaps in existing data available for research.
Objectives and Approach
This project tested the feasibility of linking longitudinal data on pupils from the All Education Dataset for England (AEDE) to 2011 Census data to identify other household members and enable more accurate measurement of the household structures that shape children’s experiences.
A subset of 2.25 million individual pupils aged thirteen to eighteen from the AEDE for the 2010/11 academic year was used for the linkage because this year aligned to the Census date.
Using pseudonymised data, a series of matchkeys, each containing a different combination of name, date of birth, gender and postcode, was used to link the AEDE to the Census following the standard ADR linkage method.
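The matchkey approach described above can be illustrated with a minimal sketch. This is not the ADR linkage method itself, whose details are not given here; it is a generic hierarchical deterministic linkage under stated assumptions. The field names, the three matchkey definitions, the use of an unsalted SHA-256 hash as a stand-in for pseudonymisation, and the rule of accepting only unique candidates are all hypothetical choices made for illustration.

```python
import hashlib

# Hypothetical matchkeys, ordered from strongest to weakest.
# The real project's matchkey definitions are not stated in the abstract.
MATCHKEYS = [
    ("forename", "surname", "dob", "gender", "postcode"),
    ("forename", "surname", "dob", "postcode"),
    ("surname", "dob", "gender", "postcode"),
]


def make_key(record, fields):
    """Build a hashed matchkey from the given fields, or None if any is missing."""
    values = [str(record.get(f) or "").strip().upper() for f in fields]
    if any(v == "" for v in values):
        return None
    # Assumption: a plain SHA-256 hash stands in for pseudonymisation.
    # A production system would typically use a keyed or salted scheme.
    return hashlib.sha256("|".join(values).encode("utf-8")).hexdigest()


def link(pupils, census):
    """Link pupil records to census records, trying each matchkey in turn.

    Returns (links, unmatched), where links maps pupil_id to a tuple of
    (census_id, matchkey_rank) and unmatched lists pupils with no link.
    """
    links = {}
    used_census_ids = set()
    unmatched = list(pupils)

    for rank, fields in enumerate(MATCHKEYS, start=1):
        # Index the census records not yet linked by this matchkey.
        index = {}
        for rec in census:
            if rec["census_id"] in used_census_ids:
                continue
            key = make_key(rec, fields)
            if key is not None:
                index.setdefault(key, []).append(rec["census_id"])

        still_unmatched = []
        for pupil in unmatched:
            key = make_key(pupil, fields)
            candidates = index.get(key, []) if key is not None else []
            # Accept only unique candidates that no earlier pupil has taken.
            if len(candidates) == 1 and candidates[0] not in used_census_ids:
                links[pupil["pupil_id"]] = (candidates[0], rank)
                used_census_ids.add(candidates[0])
            else:
                still_unmatched.append(pupil)
        unmatched = still_unmatched

    return links, unmatched
```

Recording the rank of the matchkey that produced each link makes it possible to report what share of links came from the strongest key, which is the kind of summary the results below rely on.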
Results
From the subset of 2.25 million pupils identified in the 2010/11 data, just over two million (90%) linked to a Census record, two thirds of which matched on the strongest matchkey. Age distribution analysis showed there was consistency between the AEDE subset, the linked file and the AEDE records which did not link. Further analysis suggested that there was no issue with boarder status where records did not link.
Conclusion / Implications
This research showed that a high proportion of records matched between these datasets and that the majority of matches were made on strong matchkeys, giving confidence that those matches are correct. The feasibility of the linkage has therefore been demonstrated, and it was recommended that this research should continue.