CensusIRL: Historical census data preparation with MDD support
Census returns are a critical source of information for governments globally. They underpin a wide spectrum of public planning, including health, housing, work and education. Historically, census forms have captured names, places, dates, age, occupation, family structure, and religion. More recently, questions on sexual orientation and ethnicity, which can be intrusive to vulnerable communities, have been added, and for such reasons data security is of paramount importance. Most governments restrict access to individual census returns, presenting the data in aggregate report format. The Irish government is particularly strict, enforcing a statutory closure period of 100 years. An exception was made for the Irish 1901 and 1911 censuses, which were digitised and released for free online consultation in 2009 [1]. They are an excellent source for genealogists and historians alike but exist as separate digital silos. This project uses an eXtreme Model-Driven Development (XMDD) environment to create linkages between both datasets. We discuss the census records, the data cleansing process used in creating the initial proof-of-concept CensusIRL application, and the development of the matching algorithm. We detail the different approaches to the development life-cycle of the application and describe the utilities used to sanitise data points in the records and in the match-making process.
Publication
2022 IEEE International Conference on Big Data (Big Data), Osaka, Japan, 2022, pp. 2507-2514
Publisher
IEEE Computer Society
Rights
© 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Also affiliated with
- Health Research Institute (HRI)
- LERO - The Science Foundation Ireland Research Centre for Software
Sustainable development goals
- (4) Quality Education
Department or School
- History