R tidycensus download all block groups
I am looking to automate the process of downloading census data for all block groups in the US using the tidycensus package. The developer provides instructions for downloading all tracts in the US; however, block groups cannot be accessed using the same...
Here is my current code, which doesn't work:
library(tidyverse)
library(tidycensus)
census_api_key("key here")
# create lists of state and county codes
data("fips_codes")
temp <- data.frame(state = as.character(fips_codes$state_code),
                   county = fips_codes$county_code,
                   stringsAsFactors = F)
temp <- aggregate(county~state, temp, c)
state <- temp$state
coun <- temp$county
# use map2_df to loop through the files, similar to the "tract" data pull
home <- map2_df(state, coun, function(x,y) {
  get_acs(geography = "block group", variables = "B25038_001", #random var
          state = x,county = y)
})
Resulting error:
No encoding supplied: defaulting to UTF-8.
Error: parse error: premature EOF
(right here) ------^
A similar approach, converting the counties within each state to a list, also doesn't work:
temp <- aggregate(county~state, temp, c)
state <- temp$state
coun <- temp$county
df<- map2_df(state, coun, function(x,y) {
  get_acs(geography = "block group", variables = "B25038_001",
          state = x,county = y)
})
Returns Error: Result 1 is not a length 1 atomic vector
...
Does anyone have any insight on how this can be accomplished? Most likely I am not using the functions or syntax correctly, and I am not very good with loops either. Any help would be appreciated.
Try the totalcensus package at https://github.com/GL-Li/totalcensus . It downloads census data files to your own computer and extracts any data from those files. After setting up the folders and paths, run the code below if you want all block group data from the 2015 ACS 5-year survey.
library(totalcensus)
# download the 2015 ACS 5-year survey data, which is about 50 GB.
download_census("acs5year", 2015)
# read block group data of variable B25038_001 from all states plus DC
block_groups <- read_acs5year(
  year = 2015,
  states = states_DC,
  table_contents = "B25038_001",
  summary_level = "block group"
)
The extracted data covers 217739 block groups across all states and DC:
# GEOID lon lat state population B25038_001 GEOCOMP SUMLEV NAME
# 1: 15000US020130001001 -164.1232 54.80448 AK 982 91 all 150 Block Group 1, Census Tract 1, Aleutians East Borough, Alaska
# 2: 15000US020130001002 -161.1786 55.60224 AK 1116 247 all 150 Block Group 2, Census Tract 1, Aleutians East Borough, Alaska
# 3: 15000US020130001003 -160.0655 55.13399 AK 1206 352 all 150 Block Group 3, Census Tract 1, Aleutians East Borough, Alaska
# 4: 15000US020160001001 178.3388 51.95945 AK 1065 264 all 150 Block Group 1, Census Tract 1, Aleutians West Census Area, Alaska
# 5: 15000US020160002001 -166.8899 53.85881 AK 2038 380 all 150 Block Group 1, Census Tract 2, Aleutians West Census Area, Alaska
# ---
# 217735: 15000US560459511001 -104.7889 43.99520 WY 1392 651 all 150 Block Group 1, Census Tract 9511, Weston County, Wyoming
# 217736: 15000US560459511002 -104.4785 43.76853 WY 2050 742 all 150 Block Group 2, Census Tract 9511, Weston County, Wyoming
# 217737: 15000US560459513001 -104.2575 43.88160 WY 1291 520 all 150 Block Group 1, Census Tract 9513, Weston County, Wyoming
# 217738: 15000US560459513002 -104.1807 43.85406 WY 1046 526 all 150 Block Group 2, Census Tract 9513, Weston County, Wyoming
# 217739: 15000US560459513003 -104.2601 43.84355 WY 1373 547 all 150 Block Group 3, Census Tract 9513, Weston County, Wyoming
The solution was provided by the author of tidycensus (Kyle Walker) and looks like this:
Unfortunately this does not work at the moment. If it were to work, your code would need to identify the counties within each state inside the function evaluated by map_df, and then stitch the dataset together county by county and state by state. The issue is that block group data is only available by county, so you would have to walk through all 3000+ counties in the US in turn. If it did work, a successful call would look like this:
library(tigris)
library(tidyverse)
library(tidycensus)
library(sf)
ctys <- counties(cb = TRUE)
state_codes <- unique(fips_codes$state_code)[1:51]
bgs <- map_df(state_codes, function(state_code) {
  state <- filter(ctys, STATEFP == state_code)
  county_codes <- state$COUNTYFP
  get_acs(geography = "block group", variables = "B25038_001",
          state = state_code, county = county_codes)
})
The issue is that although I have internal logic to allow for multi-state calls, or multi-county calls within a state, tidycensus cannot yet handle multi-state and multi-county calls at the same time.
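Given that limitation, one possible workaround is to issue one request per state/county pair and bind the results afterwards. The following is only a minimal sketch, assuming that single-state, single-county get_acs() calls succeed for the "block group" geography. The helper name fetch_block_groups and its fetch argument are hypothetical (not part of tidycensus); fetch is injectable so the loop can be tried with a stub before spending 3000+ API requests.

```r
# Hypothetical helper, not part of tidycensus: one request per (state, county)
# pair, with the per-county results row-bound into a single data frame.
fetch_block_groups <- function(pairs,
                               variables = "B25038_001",
                               fetch = function(state, county) {
                                 # Assumption: a single-county call works
                                 tidycensus::get_acs(geography = "block group",
                                                     variables = variables,
                                                     state = state,
                                                     county = county)
                               }) {
  results <- Map(function(state, county) {
    # A failing county is reported and skipped instead of aborting the run
    tryCatch(fetch(state, county),
             error = function(e) {
               message("skipped ", state, "-", county, ": ", conditionMessage(e))
               NULL
             })
  }, pairs$state_code, pairs$county_code)
  # Drop the skipped counties, then stack the remaining data frames
  results <- Filter(Negate(is.null), results)
  do.call(rbind, unname(results))
}

# Possible usage (requires an API key and network access):
# pairs <- unique(tidycensus::fips_codes[, c("state_code", "county_code")])
# bgs <- fetch_block_groups(pairs)
```

Note that fips_codes also lists territories, so the pairs table may need filtering to the 50 states plus DC first, as the [1:51] subset in the code above does. Looping per county is slow but keeps each request within what the API accepts for block groups.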