Introduction to Tidygeocoder

The tidygeocoder package provides an intuitive tidyverse-style interface to geocoding services. Currently the US Census and Nominatim (OSM) services are supported. The US Census service requires a street-level address located in the United States. The OSM service has no such restrictions, but it has usage limits, and exceeding them will lock you out of the service temporarily. Because of these limits, the default service for the geocode() function is the US Census, which we will use to geocode a few street addresses below.

library(dplyr)
library(tidygeocoder)
library(knitr)

Geocode the addresses in our ‘sample_addresses’ dataset:

lat_longs <- sample_addresses %>%
  geocode(addr, lat = latitude, long = longitude)

Latitude and longitude columns are attached to our input dataset:

kable(lat_longs)
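| name                 | addr                                       | latitude | longitude  |
|:---------------------|:-------------------------------------------|---------:|-----------:|
| White House          | 1600 Pennsylvania Ave Washington, DC       | 38.89875 |  -77.03535 |
| Transamerica Pyramid | 600 Montgomery St, San Francisco, CA 94111 | 37.79470 | -122.40314 |
| NA                   | Fake Address                               |       NA |         NA |
| NA                   | NA                                         |       NA |         NA |
|                      |                                            |       NA |         NA |
| US City              | Nashville,TN                               |       NA |         NA |
| Willis Tower         | 233 S Wacker Dr, Chicago, IL 60606         | 41.87851 |  -87.63666 |
| International City   | Nairobi, Kenya                             |       NA |         NA |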

Note that non-US addresses and non-street addresses were not found since we are using the US Census geocoder service.
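
Before plotting or further analysis, it can help to separate the rows that were matched from those that were not. The following is a minimal sketch that rebuilds a few rows of the result table above by hand, so it runs without calling any geocoder service. The object names `lat_longs_example`, `matched`, and `unmatched` are illustrative and not part of tidygeocoder.

```r
library(dplyr)

# A few rows of the geocoding result, copied by hand from the table above
lat_longs_example <- data.frame(
  name = c("White House", "US City", "Willis Tower", "International City"),
  addr = c("1600 Pennsylvania Ave Washington, DC", "Nashville,TN",
           "233 S Wacker Dr, Chicago, IL 60606", "Nairobi, Kenya"),
  latitude = c(38.89875, NA, 41.87851, NA),
  longitude = c(-77.03535, NA, -87.63666, NA),
  stringsAsFactors = FALSE
)

# Rows for which the geocoder returned coordinates
matched <- lat_longs_example %>%
  filter(!is.na(latitude), !is.na(longitude))

# Rows for which no result was returned
unmatched <- lat_longs_example %>%
  filter(is.na(latitude) | is.na(longitude))
```

Here `unmatched` holds the non-street and non-US addresses, which are the candidates for a second attempt with a different service.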

# Draw the map only if the optional plotting packages are available
if (require("ggplot2") && require("maps") && require("ggrepel")) {
  ggplot(lat_longs %>% filter(!is.na(longitude)), aes(longitude, latitude)) +
    borders("state") +
    theme_classic() +
    geom_point() +
    theme(line = element_blank(),
          text = element_blank(),
          title = element_blank()) +
    geom_label_repel(aes(label = name), show.legend = FALSE) +
    scale_x_continuous(breaks = NULL) +
    scale_y_continuous(breaks = NULL)
}

#ggsave("us_map.png",width=8,height=5)

To find non-US addresses and non-street addresses we can use the OSM service. The ‘cascade’ method first tries the US Census service for each address and falls back to the OSM service only if the Census lookup fails (the Census service is tried first because the OSM service has usage limits).

cascade_points <- sample_addresses %>%
  geocode(addr, method = 'cascade')
kable(cascade_points)
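| name                 | addr                                       | lat       | long       | geo_method |
|:---------------------|:-------------------------------------------|----------:|-----------:|:-----------|
| White House          | 1600 Pennsylvania Ave Washington, DC       | 38.898754 |  -77.03535 | census     |
| Transamerica Pyramid | 600 Montgomery St, San Francisco, CA 94111 | 37.794700 | -122.40314 | census     |
| NA                   | Fake Address                               |        NA |         NA | NA         |
| NA                   | NA                                         |        NA |         NA | NA         |
|                      |                                            |        NA |         NA | NA         |
| US City              | Nashville,TN                               | 36.162230 |  -86.77435 | osm        |
| Willis Tower         | 233 S Wacker Dr, Chicago, IL 60606         | 41.878513 |  -87.63666 | census     |
| International City   | Nairobi, Kenya                             | -1.283253 |   36.81724 | osm        |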
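
The fallback order described above can be illustrated with a small sketch. This is not how tidygeocoder implements the cascade method internally. `census_stub()`, `osm_stub()`, and `cascade_stub()` are hypothetical stand-ins that return hard-coded coordinates taken from the table above instead of calling a web service, so the example runs offline.

```r
# Hypothetical stand-in for the US Census service: knows one street address
census_stub <- function(address) {
  known <- list(
    "233 S Wacker Dr, Chicago, IL 60606" = c(lat = 41.878513, long = -87.63666)
  )
  known[[address]]  # NULL when the address is not found
}

# Hypothetical stand-in for the OSM service: knows two city-level addresses
osm_stub <- function(address) {
  known <- list(
    "Nashville,TN" = c(lat = 36.162230, long = -86.77435),
    "Nairobi, Kenya" = c(lat = -1.283253, long = 36.81724)
  )
  known[[address]]  # NULL when the address is not found
}

# Try the Census stub first and use the OSM stub only if that fails
cascade_stub <- function(address) {
  result <- census_stub(address)
  method <- "census"
  if (is.null(result)) {
    result <- osm_stub(address)
    method <- "osm"
  }
  if (is.null(result)) {
    # Neither stub found the address
    return(data.frame(addr = address, lat = NA_real_, long = NA_real_,
                      geo_method = NA_character_, stringsAsFactors = FALSE))
  }
  data.frame(addr = address, lat = result[["lat"]], long = result[["long"]],
             geo_method = method, stringsAsFactors = FALSE)
}

addresses <- c("233 S Wacker Dr, Chicago, IL 60606", "Nairobi, Kenya", "Fake Address")
cascade_example <- do.call(rbind, lapply(addresses, cascade_stub))
```

In `cascade_example`, the `geo_method` column records which stub produced each result, mirroring the `geo_method` column in the table above. The address that neither stub recognizes keeps `NA` in every result column.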