Two Studies Find Major Issues with Soon to Be Released Census Data Due to Differential Privacy

A team of researchers at Harvard University and a separate team at the University of Minnesota have published reports offering a troubling assessment of the U.S. Census Bureau’s use of differential privacy in its disclosure avoidance system (DAS), which is meant to keep would-be hackers from identifying the personal information of people who participated in the 2020 census. In short, both conclude that the soon-to-be-released census data will not be accurate enough for redistricting. Read both studies here: Harvard Study. University of Minnesota Study.

These reports and others are responses solicited by the Census Bureau in its effort to promote transparency and obtain feedback from the data user community. Specifically, the bureau encouraged users to respond to a set of questions, which can be found on the bureau’s site under “DAS Demonstration Data and Progress Metrics Updates.”

The Minnesota study, which is less comprehensive than the Harvard study, found “pervasive biases and inconsistencies, high levels of inaccuracy in the counts of minority populations, and isolated large errors in the population counts for particular communities.” It concluded that the data is unfit for many research and administrative purposes. Similarly, the Harvard study concluded that the latest DAS algorithm released by the bureau:

Prevents map drawers from creating districts of equal population, according to current statutory and judicial standards. Actual deviations from equal population will generally be several times larger than as reported under the DAS data. The magnitude of this problem increases for smaller districts such as state legislative districts and school boards.

Transfers population from low-turnout, mixed-party areas to high-turnout, single-party areas. This differential bias leads to different district boundaries, which in turn implies significant and unpredictable differences in election results. The discrepancy also degrades the ability of analysts to reliably identify partisan gerrymanders.

Transfers population from racially mixed areas to racially segregated areas. This bias effectively means racially heterogeneous areas are under-counted. The degree of racial segregation can therefore be over-estimated, which can lead to a change in the number of majority-minority districts. It also creates significant precinct-level variability, which adds substantial unpredictability to whether or not a minority voter is included in a majority-minority district.

Alters individual-level race predictions constructed from voter names and addresses. This leads to fewer estimated minority voters and majority-minority districts in a re-analysis of a recent Voting Rights Act case, NAACP v. East Ramapo School District. At a statewide level, however, the DAS data does not curb the ability of algorithms to identify the race of voters from names and addresses. Therefore, this casts doubt on the universal privacy protection guarantee of DAS data.
