Recent research into the effect of differential privacy (DP) on Alaska’s census data shows concerning results, especially for small-area geographies such as townships, municipalities and other local jurisdictions. The authors of the study, a summary of which is posted on the blog of the Population Association of America, warn that local redistricting and other activities that depend on census data could be substantially affected by the Census Bureau’s application of differential privacy to raw census data.
The authors are David Swanson, professor emeritus of sociology at the University of California – Riverside; T.M. Bryan of Bryan Demographic Research, Richmond, VA; and Richard Sewell, statewide aviation policy planner at the Alaska Department of Transportation and Public Facilities in Anchorage, AK. In four case studies, they examined the errors that DP introduced into 2010 Census SF block data for Alaska.
The study used a demonstration file that the Census Bureau provided so that stakeholders could familiarize themselves with how DP changes the characteristics of a dataset. To date, the bureau has released several demonstration files, each consisting of 2010 census population data treated with a different iteration of DP. Stakeholders can compare a demonstration file with the published 2010 census data, which has not been treated with DP, to see the difference. Highlights of this analysis of Alaska’s 45,292 census blocks include the following:
- DP turned 1,252 blocks with one or more persons of voting age into blocks with zero persons of voting age;
- DP turned 830 blocks with zero persons of voting age into blocks with one or more persons of voting age; and
- Of 12,870 blocks in which the 2010 census shows one or more persons, 12,366 of them (96%) show a different number of persons when DP is applied.
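The kind of block-by-block comparison behind these figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' method: the `BlockCounts` record, the `compare_blocks` function and the four toy blocks are all hypothetical, and real inputs would come from the published 2010 block file and a Census Bureau demonstration file keyed by block ID.

```python
from typing import Dict, NamedTuple


class BlockCounts(NamedTuple):
    """Hypothetical per-block record: total persons and persons of voting age."""
    total: int
    voting_age: int


def compare_blocks(published: Dict[str, BlockCounts],
                   demo: Dict[str, BlockCounts]) -> Dict[str, int]:
    """Count how block-level figures differ between the published data and a DP-treated file."""
    result = {"vap_to_zero": 0, "zero_to_vap": 0,
              "populated": 0, "populated_changed": 0}
    for block_id, pub in published.items():
        dp = demo.get(block_id)
        if dp is None:
            continue  # assumption: skip blocks missing from the demonstration file
        # Voting-age population present in the published data but zero after DP
        if pub.voting_age > 0 and dp.voting_age == 0:
            result["vap_to_zero"] += 1
        # No voting-age population in the published data but some after DP
        elif pub.voting_age == 0 and dp.voting_age > 0:
            result["zero_to_vap"] += 1
        # Among populated blocks, count those whose total changed
        if pub.total > 0:
            result["populated"] += 1
            if dp.total != pub.total:
                result["populated_changed"] += 1
    return result


# Toy data (made up for illustration, not taken from the study)
published = {"A": BlockCounts(5, 3), "B": BlockCounts(0, 0),
             "C": BlockCounts(2, 2), "D": BlockCounts(4, 0)}
demo = {"A": BlockCounts(6, 0), "B": BlockCounts(1, 1),
        "C": BlockCounts(2, 2), "D": BlockCounts(4, 0)}

summary = compare_blocks(published, demo)

# Share of populated blocks that changed, using the figures reported in the study
share_changed = 12_366 / 12_870  # about 0.961, i.e. the 96% quoted above
```

On the toy data, one block loses its voting-age population, one gains one, and one of the three populated blocks changes its total. The final line simply reproduces the arithmetic behind the 96% figure.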
The authors conclude that, should the Census Bureau implement DP in the same manner as in the demonstration project, 2020 census data will be unusable for localities. The authors also agree with other demographic experts that DP itself is not an appropriate privacy solution for census data:
“DP goes far beyond precedent and exceeds what is necessary to keep data safe under census law. They contend further that, because DP focuses on concealing individual characteristics instead of respondent identities, it is a blunt and inefficient instrument for disclosure control. ... the core metric of DP does not measure the risk of identity disclosure, so it cannot assess disclosure risk as defined under census law, making it untenable for optimizing the privacy/usability trade-off.”
Alabama is the first state to sue over the Census Bureau’s use of DP on the still-to-be-released 2020 census data.