Accessing the Census Bureau API with tidycensus and pytidycensus
Author
Corey S. Sparks, Ph.D.
Introduction
This document provides a brief demonstration of how to access the U.S. Census Bureau’s API using the tidycensus package in R and the pytidycensus package for Python.
The examples focus on retrieving median household income from the American Community Survey (ACS), but the same approach can be extended to other datasets and variables.
To make the workflow reproducible, the document also shows how to look up the variables available in the ACS, so that you can find the correct codes for the measures you are interested in.
Both R and Python examples are presented in tabbed code blocks for easy comparison.
Querying data from the Census API for the American Community Survey (ACS)
# Install if needed:
# install.packages("tidycensus")
library(tidycensus)
library(dplyr)

# Set your Census API key (replace with your own)
# census_api_key("YOUR_KEY_HERE", install = TRUE)

# Get median household income from ACS 5-year (2022) for all states
income_data <- get_acs(
  geography = "state",
  variables = "B19013_001", # Median household income
  year = 2022
)

head(income_data, n = 20)
# Import modules
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
import pytidycensus as tc
import os
# Set API key for pytidycensus (replace with your own)
tc.set_census_api_key("YOUR_API_KEY")
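Hard-coding a key in a script makes it easy to leak when the file is shared. One alternative, sketched here, is to read the key from an environment variable at run time and pass the result to `tc.set_census_api_key()`. The helper `read_census_key` and the variable name `CENSUS_API_KEY` are assumptions made for this illustration, not part of pytidycensus.

```python
import os


def read_census_key(env_var="CENSUS_API_KEY"):
    """Return the Census API key stored in an environment variable.

    Both this helper and the default variable name are illustrative
    assumptions, not part of pytidycensus.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"No Census API key found in the environment variable {env_var!r}."
        )
    return key


# Usage, once the variable is set in your shell or environment file:
# import pytidycensus as tc
# tc.set_census_api_key(read_census_key())
```

Failing early with a clear message is a deliberate choice here: a missing key is easier to diagnose at this step than from a failed API request later.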
# Get median household income from ACS 5-year (2022) for all states
state_income = tc.get_acs(
    geography="state",
    variables=["B19013_001E"],
    year=2022,
    output="wide"
)
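The `output="wide"` argument requests one row per geography, with the estimate and the margin of error for each variable in separate columns. The relationship between the long ("tidy") and wide layouts can be sketched with plain pandas. The column names (`GEOID`, `NAME`, `variable`, `estimate`, `moe`) follow the tidycensus convention in R and are assumed here to carry over to pytidycensus; the two rows and their numbers are placeholders, not real ACS estimates.

```python
import pandas as pd

# Placeholder data in the long ("tidy") layout: one row per geography and variable.
# The identifiers and values are made up for illustration and are NOT real ACS data.
tidy = pd.DataFrame(
    {
        "GEOID": ["X1", "X2"],
        "NAME": ["State A", "State B"],
        "variable": ["B19013_001", "B19013_001"],
        "estimate": [50000, 60000],
        "moe": [500, 900],
    }
)

# Pivot so each variable contributes an estimate column and a margin-of-error column.
wide = tidy.pivot(index=["GEOID", "NAME"], columns="variable", values=["estimate", "moe"])

# Flatten the (measure, variable) column pairs into names such as B19013_001E.
suffix = {"estimate": "E", "moe": "M"}
wide.columns = [f"{var}{suffix[measure]}" for measure, var in wide.columns]
wide = wide.reset_index()

print(wide)
```

The wide layout is convenient for joining to geometries or computing across columns, while the long layout tends to be easier to group and plot.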
# Lookup available variables for ACS 5-year 2022
vars <- load_variables(2022, "acs5", cache = TRUE)

# Search for "income" related variables
dplyr::filter(vars, grepl("income", label, ignore.case = TRUE)) |>
  head(10)
# A tibble: 10 × 4
name label concept geography
<chr> <chr> <chr> <chr>
1 B06010PR_002 Estimate!!Total:!!No income Place … <NA>
2 B06010PR_003 Estimate!!Total:!!With income: Place … <NA>
3 B06010PR_004 Estimate!!Total:!!With income:!!$1 to $9,999 … Place … <NA>
4 B06010PR_005 Estimate!!Total:!!With income:!!$10,000 to $1… Place … <NA>
5 B06010PR_006 Estimate!!Total:!!With income:!!$15,000 to $2… Place … <NA>
6 B06010PR_007 Estimate!!Total:!!With income:!!$25,000 to $3… Place … <NA>
7 B06010PR_008 Estimate!!Total:!!With income:!!$35,000 to $4… Place … <NA>
8 B06010PR_009 Estimate!!Total:!!With income:!!$50,000 to $6… Place … <NA>
9 B06010PR_010 Estimate!!Total:!!With income:!!$65,000 to $7… Place … <NA>
10 B06010PR_011 Estimate!!Total:!!With income:!!$75,000 or mo… Place … <NA>
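The `label` column encodes a hierarchy, with levels separated by `!!`. As a small illustration in Python, the two untruncated labels from the output above can be split into their levels. The helper name `split_label` is hypothetical.

```python
def split_label(label):
    """Split an ACS variable label into its hierarchy levels.

    Levels are separated by "!!"; trailing colons are dropped for readability.
    """
    return [part.rstrip(":").strip() for part in label.split("!!")]


# Two labels copied from the load_variables() output shown above.
labels = {
    "B06010PR_002": "Estimate!!Total:!!No income",
    "B06010PR_003": "Estimate!!Total:!!With income:",
}

for name, label in labels.items():
    print(name, split_label(label))
```

Splitting labels this way can make it easier to tell a table's total row from its nested categories when scanning search results.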
# Search for income-related variables
vars = tc.load_variables(2022, "acs", "acs5")
Loaded cached variables for 2022 acs acs5
income_vars = vars[vars["label"].str.contains("income", case=False, na=False)]

# Show first 10
print(income_vars.head(10))
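A broad search term such as "income" usually matches variables from many tables, so it often helps to narrow the result further. The sketch below assumes the variable table is a pandas DataFrame with `name` and `label` columns, as the filtering code above implies. It uses a tiny hand-built stand-in so that it runs without an API call; the labels in the stand-in are abbreviated and illustrative.

```python
import pandas as pd

# Hand-built stand-in for the table returned by load_variables().
# Labels are abbreviated for illustration; the real table has many more rows.
variables = pd.DataFrame(
    {
        "name": ["B19013_001", "B06010PR_002", "B06010PR_003", "B01003_001"],
        "label": [
            "Median household income",
            "Estimate!!Total:!!No income",
            "Estimate!!Total:!!With income:",
            "Estimate!!Total",
        ],
    }
)

# Step 1: case-insensitive search of the label text.
income_vars = variables[variables["label"].str.contains("income", case=False, na=False)]

# Step 2: keep only variables from one table, identified by its name prefix.
b19013 = income_vars[income_vars["name"].str.startswith("B19013_")]

print(income_vars["name"].tolist())
print(b19013)
```

Filtering on the name prefix is one simple option; searching a `concept` column, where the table provides one, is another.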