

MODULE 5 (PART 1 and 2)


In Part 1 and Part 2 of the Data Analytical Thinking learning module, we are going to explore how quantitative methods – analyzing data – are essential to understanding and solving public problems. We're also going to explore how to apply a process for using data analytical skills, and identify key risks when using data. Specifically, we’ll dive into how infusing quantitative research into policy making practices has allowed government to dramatically improve policy outcomes, and discuss how people are crucial to data analytical processes (and why collaboration is key). We're also going to examine a three-step process for using data to define the problem better. At the end of this learning module, we hope that you’ll be able to describe how data can be used to help solve public problems, define a process for applying data analytical thinking, and identify key risks and mitigations when using data to solve public problems.


Data Analytical Thinking at the New Jersey Department of Human Services

In this interview, Commissioner Carole Johnson explains how the New Jersey Department of Human Services uses data to inform its decision-making process. She provides an example to illustrate how the Department's data analytical mindset helps improve its policies. She also describes innovative projects that the Department has developed to deliver better services for New Jersey residents.





Carole Johnson was nominated to be Commissioner of the Department of Human Services by Governor Phil Murphy. She previously served in the Obama White House as senior health policy advisor and member of the Domestic Policy Council health team. There, she worked to increase health insurance coverage for millions of Americans, improve services and choices for individuals with disabilities, expand supports for older Americans, increase coverage of mental health and substance use disorder treatment, and improve health and economic security for all Americans. The Commissioner has also served on Capitol Hill, working for the U.S. Senate Special Committee on Aging and for members of the U.S. Senate Finance Committee and U.S. House of Representatives Ways and Means Committee. In addition, she managed health care workforce policy issues for the U.S. Department of Health and Human Services Health Resources and Services Administration. Johnson previously was policy director for the Alliance of Community Health Plans, an association of nonprofit health plans; program officer with the Pew Charitable Trusts Health and Human Services Program; health policy researcher at the George Washington University; and senior government relations manager with the American Heart Association.


Big Data and Social Science: A Practical Guide to Methods and Tools

Ian Foster, Rayid Ghani, Ron S. Jarmin, Frauke Kreuter, and Julia Lane

Statistics in the Social and Behavioral Sciences Series

Chapman & Hall/CRC, 2016

The goal of this book is to provide social scientists with an understanding of the key elements of this new science, its value, and the opportunities for doing better work. The goal is also to identify the many ways in which the analytical toolkits possessed by social scientists can be brought to bear to enhance the generalizability of the work done by computer scientists.

Read the full article here.

The Future of Data and Analytics

Shelley H. Metzenbaum

IBM Business of Government

How should federal, state, and local governments use and communicate data and analytics in the future to improve government performance across multiple dimensions—including impact, return on spending, fairness, interaction quality, trust and understanding? What needs to be done to get from where we are now to where we want to be?

Read the full article here.

Promoting Policies that work: Six steps for the Commission on Evidence-Based Policymaking

Quentin Palfrey


THE HILL, 01 Nov. 2016

A list of six concrete steps policymakers can take to institutionalize the use of administrative data to support policy-relevant research and evidence-informed policymaking.

Read the full article here.


Solving Public Problems with Data


The explosion in the availability of new sources of data and the emergence of new data science technologies for making use of such data are expected to have a significant impact on public institutions and how they solve problems and make decisions. Solving Public Problems with Data examines how data can be used to improve decision-making and problem solving in the public sector. Through real world examples and lectures from renowned experts and practitioners, SPPD discusses the fundamental principles of data science to help foster a data analytic mindset. The goal is to enable you to define and leverage the value of data to achieve your public mission. No prior experience with computer science or statistics is necessary or assumed.

Watch the full video here.



Choose the correct answer.

By analyzing data relating to murder rate, crime rate, educational attainment, unemployment rate and recidivism rate, organized by neighborhood, city government staff in New Orleans were able to target interventions to an _________ set of people.

Correct! The key was using data to increase the focus of interventions - and it helped deliver a decline in the murder rate by 20% between 2012 and 2013 and, by 2018, the lowest level in almost fifty years.

Unfortunately, that’s not correct. That may have been the case but what is it that helps us to target specific sets of people?


Choose the best answer.

To improve response times by first responders, Boston’s Mayor launched an initiative to change the way ambulances were deployed. By looking at data from 911 calls, such as types of emergencies, routes, and response times, they were better able to understand the problem and its _______:

That’s right! Often the greatest value data can provide is a better insight into why a problem is occurring, enabling us to develop better responses to it.

Sorry, that’s not correct. Hint: if we want to address a problem more effectively we need to do more than treat the symptoms of it.


Choose the best answer.

Data analytical thinking works optimally when combined with human-centered design and ______ approaches to decision-making:

Correct! Even though we might use a lot of data and algorithms to help us analyze it, collaboration among people remains core to the process.

Sorry, that’s not it. Hint: People are still crucial in data analytical thinking.


Choose the best answer.

________ has made it easier to collect, analyze and use information, even by people without extensive technical skills.

Correct! Digitization has enabled huge amounts of data to be made available to more people and spurred the development of tools and training and specialist units to help us use it effectively.

Unfortunately, that’s not correct. It’s likely this helps but there’s one thing that has massively improved the accessibility of data that has made so much more possible.


Choose the best answer.

By analyzing _______ data, Ben Wellington was able to argue that New York actually has a “rush day,” not just a “rush hour.”

Correct! One of the benefits of analyzing data is that it can help us generate important insights and tell a compelling story.

Sorry, that’s not it. Although it's possible this source of data could be used, it’s not what Ben used.


Choose the correct response

Using mobile phones to collect information about the conditions of refugees helps the UN World Food Programme deal with which challenging problem that it has traditionally confronted?

That’s right! One of the hardest problems with aiding refugees is gaining real-time insight that makes it easier to prioritize the allocation of resources.

Sorry, that’s not it. Hint: Part of the value of data is getting it in a timely fashion.

You're right but there's more to this answer. Please try again!


True or False

To draw correct inferences, data must always be a representative sample of the population.

Correct! When we want to analyze a large population by only looking at a small group then we may need representative data. But if we just want to understand something about a specific group, nonrepresentative data may be adequate.

Sorry, that’s not correct.

You're right but there's more to this answer. Please try again!


Choose the correct answer.

When Chicago used its data on restaurant inspections to help it target enforcement activities, the effectiveness of its inspections increased by:

Correct! With only three dozen inspectors and more than 15,000 food establishments, intelligent use of data proved to be a major help in making the most of scarce resources.

Sorry, that’s not correct. It was more than that.

You're right but there's more to this answer. Please try again!


Choose the correct answer.

By analyzing data such as overflowing trash bins, food poisoning cases, tree debris, and building vacancies, as well as actual sightings of rats, researchers were able to create a model for the City of Chicago that could _______ rat infestations.

Correct! After running the model for a trial period, the City of Chicago reported that this method was 20% more effective than traditional baiting methods for catching rats.

Sorry, that might be helpful but there’s one thing here that is even more helpful for planning.

You're right but there's more to this answer. Please try again!


Choose the correct answer.

In recognition of the valuable role data plays in improving the work of government, the US Congress established a Commission on ______ Policymaking which recommended changes to strengthen government’s ability to make use of its own data.

Correct! Data is key to generating evidence that can help policymaking to be smarter, better and more reliable!

Sorry, that’s not correct. Hint: What does the use of data allow us to generate that makes our arguments for change stronger?

You're right but there's more to this answer. Please try again!


Choose the correct answer.

The first step in the data analytical thinking process is to define your _________.

Correct! We start with a suggestion about what we think happens, then find and analyze data that tells us to what extent that is true.

Unfortunately, that’s not correct. It’s one of the steps but it doesn’t come first.

You're right but there's more to this answer. Please try again!


Choose the correct answer.

A hypothesis is _______ suggestion that something is caused by something else.

That’s right! Creating a hypothesis is the foundation for being more scientific in our approach to creating better policies and services rather than relying on assumptions and best guesses.

Sorry, that’s not correct. Hint: We’re looking to see whether our assertion that something is caused by something else is true.

You're right but there's more to this answer. Please try again!


True or False

The Safecast project, which collected and shared data related to the 2011 Fukushima Daiichi Nuclear Disaster in Japan, is an example of crowdsourcing data.

Correct! Citizens used handheld Geiger counters to collect data which was monitored and shared through the Safecast project. Crowdsourcing and citizen science methods like this can generate data when it is not reliably available from existing sources.

Sorry, that’s not correct.

You're right but there's more to this answer. Please try again!


Choose the best answer

When Professor Fred Wulczyn and his team at the University of Chicago created the Foster Care Data Archive to generate the evidence needed to support strong foster care programs, it was an example of using _______ data to develop better services.

Correct! Administrative data is the data that governments and other bodies, such as hospitals and schools, collect about people when delivering services, and which policymakers can often access to help test their hypotheses.

Unfortunately, that’s not correct. Here we’re looking for a wide range of data that is routinely collected but may not always be made public.

You're right but there's more to this answer. Please try again!


Choose all that apply

Key advantages of working with “Data Labs” or “Policy Labs” - institutions with small groups of data analysts working inside or in tandem with government agencies - include the potential to:

That’s right! All of these things are ways that these labs can help.

Sorry, that’s not right. There’s more that these labs can help with.

You're right but there's more to this answer. Please try again!


Choose the best answer

To verify the accuracy of any single data source in relation to a topic we should ________ the data, meaning we use more than one method to collect data on the same topic.

That’s correct! By using multiple sources for the same information, and comparing what those sources indicate, we can avoid drawing the wrong conclusions that arise from over-reliance on a single source.

Sorry, that’s incorrect. Hint: It’s all about getting the data into a suitable “shape”!

You're right but there's more to this answer. Please try again!
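The idea of comparing several sources for the same figure can be sketched in a few lines of Python. This is only an illustration: the source names, the numbers, the `triangulate` helper, and the 10% tolerance are all hypothetical assumptions, not part of the module.

```python
def triangulate(estimates, tolerance=0.10):
    """Compare estimates of the same quantity from several sources.

    Returns (agree, spread), where spread is the gap between the highest
    and lowest estimate relative to their mean, and agree is True when
    that gap is within the tolerance (an assumed threshold).
    """
    values = list(estimates.values())
    mean = sum(values) / len(values)
    spread = (max(values) - min(values)) / mean
    return spread <= tolerance, spread


# Hypothetical estimates of the same count from three collection methods.
estimates = {
    "survey": 1180,
    "administrative_records": 1230,
    "field_count": 1205,
}

agree, spread = triangulate(estimates)
print(f"Sources agree: {agree} (relative spread {spread:.1%})")
```

When the sources disagree by more than the chosen tolerance, that is a signal to investigate how each one was collected before drawing conclusions.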


Choose the best answer

Natural experiments and randomized controlled trials (RCTs) are both ways of studying differences in the effects of an intervention on groups within our population of interest. What distinguishes RCTs is the __________ sorting of the population into two or more groups.

Correct! In an RCT, the researcher intentionally sorts the groups so that one or more gets an intervention and one does not. This enables the researcher to analyze the difference that the intervention makes. In a natural experiment, we can use naturally occurring factors (like income or age) to create groups and analyze the differences in the effects that an intervention has.

Sorry, that’s not it. HINT: RCTs involve a conscious decision of the researcher to ensure one group does not receive the intervention.

You're right but there's more to this answer. Please try again!
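In a randomized controlled trial, the sorting into groups is typically done by chance, so that the groups are comparable on average. A minimal Python sketch of that step follows. The participant labels, the `assign_groups` helper, and the fixed seed are hypothetical choices made for illustration.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into a treatment and a control group."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy so the original list is untouched
    rng.shuffle(shuffled)
    midpoint = len(shuffled) // 2
    return shuffled[:midpoint], shuffled[midpoint:]


# Hypothetical participant identifiers.
participants = [f"participant_{i}" for i in range(1, 11)]
treatment, control = assign_groups(participants, seed=42)

print("Treatment group:", treatment)
print("Control group:", control)
```

Only the treatment group would receive the intervention. Comparing outcomes between the two groups then gives an estimate of the difference the intervention makes.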


True or False

In the age of big data, data analysis is complex and must always be undertaken by trained data scientists to ensure reliable conclusions are drawn from the data.

Correct! Sometimes data analysis is just about simple things that most of us can do - like counting to find out what things happen the most or the least and then exploring why that might be the case.

Sorry, that’s not right. Certainly, there are some times when data scientists can help but there are some simple ways of analyzing data that all of us can do too.

You're right but there's more to this answer. Please try again!
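Simple counting of this kind needs very little code. The sketch below uses Python's standard `collections.Counter`. The list of service-request categories is invented purely for illustration.

```python
from collections import Counter

# Hypothetical service-request categories, invented for illustration only.
requests = [
    "pothole", "streetlight", "pothole", "graffiti",
    "pothole", "streetlight", "noise", "pothole",
]

counts = Counter(requests)

# most_common() orders categories from most to least frequent.
ranked = counts.most_common()
most_frequent = ranked[0]
least_frequent = ranked[-1]

print("Most frequent:", most_frequent)
print("Least frequent:", least_frequent)
```

Once the most and least frequent categories are known, the next step is the one the module describes: exploring why that might be the case.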


True or False

Incomplete data simply refers to data that has been collected but not yet entered into published data sets.

Correct! Incomplete data can also include missing data - data that is not even collected - such as the example of incomplete data on white collar crime when compared with other crimes such as murder, rape, assault, robbery, etc. We need to think about what information or population groups might be missing from our data to avoid setting policy priorities based on an incomplete picture of what is actually happening.

Sorry, that’s incorrect - it goes beyond what hasn’t been published.

You're right but there's more to this answer. Please try again!


Choose the best answer

IBM estimates that, worldwide, ____ of the data we collect goes unused.

Correct! The greatest data risk is not using it at all - so it seems we have a long way to go in making better use of the data we collect.

Sorry, that’s not correct. It was more than that.

You're right but there's more to this answer. Please try again!