An AI-driven population census is the most wide-ranging data collection exercise a state can undertake. The census is a dashboard for statistics, measurements, and indicators produced during the intercensal period. It enables the statistical institute of the State to collect information from nearly every member of the population on a wide range of topics.
The Age of AI calls for the digitalisation of, inter alia, all administrative records, including those held by the Land Registry, the Registrar of Marriages, Births and Deaths, Lands and Surveys, State Lands, and District Revenue Services, as well as building footprints captured using GIS and drone technologies, Postal Codes, and data from the Office of the Registrar of Migrants.
The census is the spine of the national statistical system and the statistical institute of the State. It strengthens the State’s capacity to gather socio-economic and demographic data that are reliable, timely, and relevant.
An end-to-end digital census model entails using digital wallets for enumerators in the field, deploying Census Apps to send notifications to the entire project team, and employing e-learning methodologies to train census staff and inform citizens about the tools and technologies utilised in an AI-driven and digitally grounded census.
It encompasses face-to-face interviews using Computer-Assisted Personal Interviewing (CAPI). With this tool, enumerators administer questionnaires and record responses on tablets or on smartphones with AI-powered features, such as personal assistance, that enhance the user experience.
CAPI offers several advantages, such as real-time data entry, fewer data-entry errors, and the ability to incorporate visual and audio aids during field exercises. To reach respondents who are at work, those in remote locations with internet access, and those travelling overseas, Computer-Assisted Web Interviewing (CAWI) adds another layer to the digital census toolkit.
CAWI enables respondents to complete online questionnaires via a web link, which can be accessed on smartphones, desktops, and tablets. These links ensure privacy and convenience, reduce the need for face-to-face interactions or physical paper surveys, and boost public participation.
Utilising GIS technologies, a digital census can effectively create Enumeration Districts (EDs) consisting of 150 to 200 housing units. In a digital census, a housing unit serves as a “unique identifier” for a household, which is distinct from an e-identity.
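The sizing rule above can be illustrated with a minimal sketch. It assumes the housing units have already been placed in a sensible spatial order (for example, along a walking route derived from GIS), which is the hard part in practice and is not shown here. The function name `split_into_eds` and the `HU-…` identifiers are hypothetical, chosen only for illustration.

```python
import math
from typing import List, Sequence


def split_into_eds(unit_ids: Sequence[str],
                   min_size: int = 150,
                   max_size: int = 200) -> List[List[str]]:
    """Split an ordered list of housing units into Enumeration Districts.

    Assumption: unit_ids is already ordered so that neighbours in the list
    are neighbours on the ground. This sketch only balances ED sizes.
    """
    n = len(unit_ids)
    if n == 0:
        return []
    # Fewest EDs that keeps every ED at or below max_size.
    k = math.ceil(n / max_size)
    if n < k * min_size:
        # Some totals (e.g. 120 or 250 units) cannot meet both bounds.
        raise ValueError(f"{n} units cannot be split into EDs of "
                         f"{min_size} to {max_size} units")
    base, extra = divmod(n, k)
    eds, start = [], 0
    for i in range(k):
        # The first `extra` EDs take one additional unit each.
        size = base + (1 if i < extra else 0)
        eds.append(list(unit_ids[start:start + size]))
        start += size
    return eds


# Hypothetical example: 1,001 housing units in route order.
units = [f"HU-{i:05d}" for i in range(1001)]
districts = split_into_eds(units)
print([len(ed) for ed in districts])
```

Spreading the units evenly, rather than filling each district to 200 and leaving a small remainder, keeps enumerator workloads comparable. A production system would also need to respect roads, rivers, and administrative boundaries.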
In countries such as Uruguay, each housing unit in a digital census is assigned a unique ID linked to the physical address and the individuals within. A housing unit is defined as a separate and independent living area, typically a studio, house, or even a room, where a person or a group of people resides independently from others.
It is generally characterised by having a private entrance, a kitchen, and a bathroom, or the ability to isolate itself from other living areas within a larger structure. The unique identification number assigned by the census connects the physical address to its occupants throughout the census process.
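One way to picture the link between a housing unit ID, its address, and its occupants is the minimal sketch below. It is an illustration only: the class names, the `ED012-0001` ID format, and the person references are hypothetical and do not describe the scheme used by Uruguay or any other statistical office.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HousingUnit:
    unit_id: str                      # unique census ID (hypothetical format)
    address: str                      # physical address of the living area
    occupants: List[str] = field(default_factory=list)  # person references


class HousingUnitRegister:
    """In-memory register that ties each unit ID to an address and occupants."""

    def __init__(self) -> None:
        self._units: Dict[str, HousingUnit] = {}
        self._next_seq: Dict[str, int] = {}

    def register(self, ed_code: str, address: str) -> HousingUnit:
        # Assumed ID scheme: ED code plus a running number within that ED.
        seq = self._next_seq.get(ed_code, 0) + 1
        self._next_seq[ed_code] = seq
        unit = HousingUnit(unit_id=f"{ed_code}-{seq:04d}", address=address)
        self._units[unit.unit_id] = unit
        return unit

    def add_occupant(self, unit_id: str, person_ref: str) -> None:
        self._units[unit_id].occupants.append(person_ref)

    def lookup(self, unit_id: str) -> HousingUnit:
        return self._units[unit_id]


# Hypothetical usage: a house and a separate room at the same street address.
register = HousingUnitRegister()
house = register.register("ED012", "1 Example Street")
room = register.register("ED012", "1 Example Street, rear room")
register.add_occupant(house.unit_id, "person-A")
register.add_occupant(house.unit_id, "person-B")
register.add_occupant(room.unit_id, "person-C")
print(house.unit_id, register.lookup(house.unit_id).occupants)
```

Keeping the occupants as opaque references, rather than names, reflects the distinction drawn earlier between a housing unit identifier and an e-identity.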
This ensures accuracy in data collection and analysis that can subsequently inform policy decisions, the construction of vulnerability indices, and other ancillary dashboards to manage the wellness and well-being of the population.
All of these inputs and technological advancements in conducting a national census share a single aim: reducing errors and striving for an accurate representation of the facts as they exist on the ground.
An accurate count also prevents the tragedy of miscounts and puts a spotlight on individuals on the margins, the overlooked, the unbanked, and the underserved. Progress in reaching the socially minimised has been slow, and the stakes are not low. In Latin America and the Caribbean, efforts are underway to modernise census data gathering, largely through automation in the Age of AI.
Ahead of the census, the State may explore building and training an AI model. The model must be trained using a range of existing data lakes. To that end, the State may consider open and big data models. Generative AI models, which can enhance the accessibility of census data, require vast amounts of training data.
AI training can utilise various data sources, but the quality and relevance of the training data are crucial for the performance and reliability of the AI model. Generative AI can create synthetic data, helping to protect sensitive information while making census data accessible to policymakers.
Machine Learning (ML) models can be trained to analyse specific demographic groups and census foci, and even to predict data trends. Because AI models can inherit biases from the data they are trained on, careful consideration of fairness and trustworthiness is essential. Explainable AI (XAI) aids in identifying and mitigating such biases, supporting fairness and trustworthiness in the analysis of census data.
Finally, protecting the privacy of individuals while using AI to analyse census data is crucial. This highlights that the core features of a digital census are data security, data sovereignty, cybersecurity, privacy, and the human right to digital erasure.
Census data provides a basis on which to formulate policies and make long-term plans for development, particularly concerning demographic, social, and economic issues. Training AI for a national census involves equipping AI models with the ability to analyse and understand census data, potentially improving data collection, analysis, and accessibility.
Dr Fazal Ali completed his Masters in Philosophy at the University of the West Indies. He was a Commonwealth Scholar who attended Hughes Hall, University of Cambridge, and was provost of the University of Trinidad and Tobago and the acting president and chairman of the Teaching Service Commission. He is presently a consultant with the IDB. He can be reached at fazalalitsc@gmail.com
