
Former U.S. Chief Data Scientist on Public Data Under Siege

A recording from Jeremy Ney's live video with Denice Ross

One year ago this month, more than 8,000 government websites went dark in an effort to remove key public data and reports. In 2026, dozens of datasets are still unavailable. I spoke with Denice Ross, the former U.S. Chief Data Scientist, about what this means for public policy in America and what we can all do to address these concerns for transparency, accountability, and effective decision-making.


00:00:00 – Introduction and Early Career in New Orleans

The conversation begins with Jeremy welcoming Denice Ross, former Chief Data Scientist of the United States. Ross explains that she started her career as a web designer in the late 90s and moved to New Orleans around 2000 to help democratize data from the 2000 decennial census. Her goal was to move data from behind closed doors into the hands of communities so they could advocate for their own destinies rather than relying on statisticians in power.

00:01:22 – Addressing Disparities Before the Storm

Ross describes the four years leading up to Hurricane Katrina, during which her team built local capacity for nonprofits and City Hall to use data for grant writing and funding. She emphasizes that while New Orleans might look like a typical city on the surface, disaggregating data by geography and demographics revealed incredible disparities. This effort was a process of “begging” people to use data in a city where decisions were traditionally made based on personal connections rather than empirical evidence.

00:02:21 – Data Needs During the Katrina Catastrophe

When Hurricane Katrina hit in 2005, a failure in federal infrastructure caused flooding in 80% of the city, making all existing data instantly historical. As the old ways of making decisions were no longer viable, the city was “flying blind” and needed “good intel” to handle a surge of urgent decisions. Ross explains that she worked for a local data intermediary nonprofit that was inundated with questions about population locations, where to place temporary health clinics, and which parks needed the most urgent cleanup from toxic sludge.

00:03:04 – Seeking Answers from the Census Bureau

Ross recounts her journey to Washington D.C. shortly after the storm to ask the Census Bureau for a “special count” of New Orleans’ remaining residents. She learned that while cities can pay for a special census, it is an onerous and expensive process that would not have worked because the population was changing too rapidly. Instead, the Bureau provided technical assistance for a local survey of occupied houses, which revealed that people were commuting in to work on their homes during the day but leaving at night to sleep.

00:05:24 – The Birth of Open Data Advocacy

A formative moment for Ross occurred when she realized the daytime and nighttime population survey data was not made public; instead, it was shared only as a PDF for a small list of officials. She occasionally had to obtain “black market” versions of the PDF, despite there being no logical reason to keep the data secret. This lack of transparency, occurring before the open data movement was solidified by the Obama administration, highlighted the limitations of federal statistics in rapidly changing situations.

00:06:13 – Creative Proxy Data and Private Sector Solutions

To answer urgent repopulation questions, Ross’s team turned to proxy data, including nighttime lights, traffic patterns, and utility hookups. They discovered that the most accurate proxy came from a junk mail company that purchased and improved U.S. Postal Service data to avoid mailing empty houses. By aggregating this data at the block level, they could show repopulation progress monthly, helping nonprofits identify “tipping point” blocks to avoid the “jack-o’-lantern effect” where repopulation is too scattered to support infrastructure.
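For readers curious about the mechanics, the roll-up Ross describes amounts to a monthly group-by from address-level mail-delivery records to census blocks. Here is a minimal sketch, with entirely hypothetical column names, placeholder figures, and an assumed tipping-point threshold rather than whatever cutoff the New Orleans analysts actually used:

```python
# Hypothetical illustration of the block-level roll-up: address-level "active mail
# delivery" records are aggregated per block and month, and blocks are flagged as
# near an assumed repopulation tipping point.
import pandas as pd

# Placeholder input: one row per block per month from the mail-delivery vendor.
deliveries = pd.DataFrame({
    "block_id":   ["220710001001000"] * 3 + ["220710001001001"] * 3,
    "month":      ["2006-01", "2006-02", "2006-03"] * 2,
    "active":     [12, 19, 26, 3, 4, 5],      # addresses currently receiving mail
    "total_addr": [40, 40, 40, 38, 38, 38],   # pre-storm addresses on the block
})

TIPPING_POINT = 0.50  # assumed threshold; the real cutoff would be a local judgment call

blocks = (
    deliveries
    .assign(repop_share=lambda d: d["active"] / d["total_addr"])
    .assign(near_tipping=lambda d: d["repop_share"].between(TIPPING_POINT - 0.10, TIPPING_POINT))
)
print(blocks[["block_id", "month", "repop_share", "near_tipping"]])
```

Blocks hovering just below the threshold are the ones where a targeted nonprofit investment could tip a street from scattered returns into a viable neighborhood, which is the "jack-o'-lantern" problem Ross describes.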

00:08:06 – The Rise of “Dearly Departed Data Sets”

The discussion shifts to Ross’s current work tracking the loss of government data, which she calls “dearly departed data sets”. About a year prior to the interview, thousands of government websites went dark, leading to the removal or scrubbing of data related to DEI (Diversity, Equity, and Inclusion), gender, and climate. Ross notes that while some data returned, it was often handled with “sloppy” methodology, such as changing column headers from “gender” to “sex” without updating the accompanying documentation.

00:10:55 – “Death by a Thousand Cuts”

Ross identifies a second, more subtle bucket of damage she calls "death by a thousand cuts," which involves the loss of staffing and contracts necessary to maintain data collections. This degradation includes the termination of scientific advisory committees and the bottlenecking of government business due to new requirements for political approval of even basic contracts. This "gumming of the works" makes it difficult to track which data sets are suffering because the decline is slow and lacks flashy headlines.

00:14:37 – Specific Data Losses and Political Implications

A third category of data damage involves removing information that might suggest administration policies are failing. A primary example is the USDA food security supplement, which was canceled just as millions of Americans were losing food stamp eligibility. This data set had provided a 30-year record of whether adults and children had enough healthy food, and its loss sets a worrying precedent for canceling any data that reflects poorly on the government.

00:17:59 – Official Federal Statistics vs. Private Data

Ross addresses arguments regarding cost-cutting and the use of private sector data as a replacement for government surveys. While the private sector provided stopgap numbers during government shutdowns, those entities admitted they still depend on official federal statistics. Ross emphasizes that there is a constant tension between timeliness and quality; while private indices like the Real-Time Crime Index are useful for quick estimates, they cannot replace the robust, quality-checked data from the FBI or other federal agencies.

00:20:07 – Data Disaggregation as a Tool for Equity

A key attribute of American data is the commitment to disaggregation, which allows users to see disparities by race, income, and geography. As Chief Data Scientist, Ross worked to increase the capacity of federal agencies to slice data to see how policies were affecting different segments of the population. This "eyes wide open" approach was vital for identifying communities in Appalachia and tribal nations that were not receiving fair distributions of investment.

00:25:33 – Risks of Obfuscating National Health Data

Ross notes a concerning trend where the CDC’s infant and maternal mortality data is no longer aggregated at the national level, existing only in disparate state reports. This change followed the firing of all 17 staff members in that office, introducing massive friction for researchers who must now manually calculate national figures. Ross argues that while states and civil society can try to act as a “heart-lung machine” to keep these collections alive, this is a function that belongs inside the federal government.
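The friction Ross describes is that what used to arrive as a single national figure must now be recombined by hand from state reports. A minimal sketch of that arithmetic, using placeholder numbers rather than CDC data, shows why it is a birth-weighted aggregate and not a simple average of state rates:

```python
# Illustrative only: recombining state-level infant mortality figures into a
# national rate. All counts are placeholders, not CDC or state vital statistics.
import pandas as pd

states = pd.DataFrame({
    "state":         ["LA", "TX", "CA"],
    "live_births":   [57_000, 389_000, 420_000],   # placeholder counts
    "infant_deaths": [430, 2_050, 1_680],          # placeholder counts
})

# The national rate is total deaths per 1,000 total live births, i.e. weighted by
# each state's births -- averaging the state rates directly would skew the result
# toward small states.
national_rate = states["infant_deaths"].sum() / states["live_births"].sum() * 1_000
print(f"National infant mortality rate: {national_rate:.2f} per 1,000 live births")
```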

00:30:46 – The Weaponization of Shared Data

The conversation touches on the Paperwork Reduction Act and the desire for more efficient data sharing across agencies. Ross warns that this efficiency can be a double-edged sword; if data collected for Medicaid or SNAP is shared with ICE for immigration enforcement, it has a “chilling effect” on the entire ecosystem. This distrust leads to lower survey response rates and decreased participation in essential services, creating a downward cycle of under-investment in vulnerable communities.

00:33:12 – Initiatives to Safeguard Data Infrastructure

Ross describes her current initiatives, including the website EssentialData.us, which features nearly 100 use cases showing how federal data benefits everyday Americans. She also mentions DataIndex.us, a newsletter that monitors obscure websites for opportunities for public input on data changes. These efforts are designed to turn data users into data advocates, ensuring that federal data stewards receive feedback on the value of the information they maintain.

00:38:57 – Invisible Data Impact: From Bats to Peanut Butter

In her closing remarks, Ross highlights how data touches lives in unseen ways, such as the Consumer Product Safety Commission API that allows grocery stores to automatically email customers about salmonella recalls. Her favorite dataset is the North American Bat Monitoring Database, which is critical for the economy because bats provide billions of dollars in free pest control for farmers. Research shows that when bat populations decline, infant mortality rises because farmers use more pesticides, demonstrating the deep, often hidden connections between data and human health.
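The recall example refers to CPSC's public recall retrieval web service. The sketch below shows roughly how a retailer's system might poll it; the endpoint, parameter, and field names reflect the publicly documented service as best recalled here and should be verified against the current API documentation rather than treated as authoritative:

```python
# Hedged sketch: poll CPSC's recall web service and filter for a keyword.
# Endpoint, "RecallDateStart" parameter, and field names are assumptions to verify.
import requests

URL = "https://www.saferproducts.gov/RestWebServices/Recall"

resp = requests.get(URL, params={"format": "json", "RecallDateStart": "2025-01-01"}, timeout=30)
resp.raise_for_status()

for recall in resp.json():
    title = recall.get("Title", "")
    # Client-side keyword match stands in for whatever product matching a retailer would do.
    if "salmonella" in title.lower():
        print(recall.get("RecallDate"), title, recall.get("URL"))
```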
