Part 1 url: https://michaeldgarber.github.io/teach-r/osmdata-for-bikes-part-1
Part 2 url: https://michaeldgarber.github.io/teach-r/osmdata-for-bikes-part-2
In part 1, I described OpenStreetMap’s data structure for bicycle infrastructure and paths. In this one, let’s dive into some R code to manage and visualize those data.
The package I use to get OpenStreetMap data is osmdata, authored by Mark Padgham and Robin Lovelace. The purpose of this package is to download vector data from OSM, for example the roadways and bike infrastructure described previously.
First, let’s get a feel for the osmdata package by downloading Atlanta’s roads. Then we’ll focus on a smaller area to explore bike infrastructure.
Install and load osmdata. I’ll also use the tidyverse, sf, mapview, and ggmap.
library(osmdata)
library(tidyverse)
library(sf)
library(mapview)
library(ggmap)
The three workhorse functions of this package are opq()
, add_osm_feature()
, and osmdata_sf()
.
opq()
, short for overpass query, uses the overpass API to query OpenStreetMap data. The great thing about this package is one need not know about the overpass API to make use of the data. The important argument it takes is a bounding box (bbox=
), which can be simply the name of a place like a city. Or, alternatively, four lat/lon values can be used. This function can be used on its own to simply request all of the data within a bounding box. It’s often more useful to request specific features, which is done using:add_osm_feature()
. This function takes a key
and returns objects with values for that key. Recall, a highway (i.e., a roadway or path) is a type of key.osm_data_sf()
passes the queries specified by the bounding box and the requested features to the overpass API and returns an sf
object.These three functions can be piped (%>%
) together. For example, suppose we’d like vector data corresponding to the primary roadways (minor arterials) in Atlanta. The .$osm_lines
line of code extracts the line-based geometry, specifically. Polygons might also be included.
atl_primary = opq(bbox = "Atlanta, Georgia, USA") %>%
add_osm_feature(key = "highway", value = "primary") %>%
osmdata_sf() %>%
.$osm_lines %>% #Extract the osm_lines, specifically.
st_as_sf() #Confirm it's sf.
atl_primary %>% mapview()
For context, let’s add secondary roads, tertiary roads, trunk roads, and the interstate highways. We’ll leave out residential roads, as the data become too big. The code is repetitive, but I’ve had better luck creating an sf
object for each highway type than calling all types in one query using sequential add_osm_feature()
function calls.
atl_motorway = opq(bbox = "Atlanta, Georgia, USA") %>%
add_osm_feature(key = "highway", value = "motorway") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
atl_trunk = opq(bbox = "Atlanta, Georgia, USA") %>%
add_osm_feature(key = "highway", value = "trunk") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
atl_secondary = opq(bbox = "Atlanta, Georgia, USA") %>%
add_osm_feature(key = "highway", value = "secondary") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
atl_tertiary = opq(bbox = "Atlanta, Georgia, USA") %>%
add_osm_feature(key = "highway", value = "tertiary") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
Each of those objects are sf
objects, so they can be manipulated using dplyr verbs.
class(atl_primary)
## [1] "sf" "data.frame"
Accordingly, we can bind the sf
data together using bind_rows()
.
atl_roads = atl_primary %>%
bind_rows(
atl_motorway,
atl_trunk,
atl_secondary,
atl_tertiary
)
#Take a look at the number of features of each highway type.
table(atl_roads$highway)
##
## motorway motorway_link primary secondary tertiary
## 1234 7 1255 3899 2854
## trunk
## 496
To visualize these all together, I’m going to pare down to a 1-mile radius around Five Points, as I was having issues rendering the larger mapview
to html. In your own console, this simple call to mapview(atl_roads)
should work.
atl_roads %>%
mapview(zcol = "highway")
To visualize in a smaller radius, I’ll use mutate_geocode()
from ggmap to geocode Five Points and will create a 1-mile buffer around it using st_buffer()
from sf.
library(ggmap)
register_google(key = "your_ggmap_key") #https://cran.r-project.org/web/packages/ggmap/readme/README.html
five_points_1mi_rad = as_tibble("Five Points, GA") %>%
as_tibble() %>%
mutate_geocode(value, force = TRUE) %>%
st_as_sf(coords = c("lon", "lat"),crs = 4326) %>% #The output at this step is a point-based sf geometry.
#Change the coordinate system to one feet.
st_transform("+proj=tmerc +lat_0=30 +lon_0=-84.16666666666667 +k=0.9999 +x_0=699999.9999999999
+y_0=0 +datum=NAD83 +units=us-ft +no_defs +ellps=GRS80 +towgs84=0,0,0") %>%
st_buffer(5280) %>%
#Transform back to coordinate system 4326 so it jives with the default coordinate system output of the OpenStreetMap data above.
st_transform(4326)
## Source : https://maps.googleapis.com/maps/api/geocode/json?address=Five+Points,+GA&key=xxx
## "Five Points, GA" not uniquely geocoded, using "five points, atlanta, ga 30303, usa"
Find the intersection of the OSM vector data with this buffer radius using st_intersection()
, and visualize using mapview()
.
library(RColorBrewer) #to get a better palette.
set1_5 = RColorBrewer::brewer.pal(n=5, name = "Set1")
atl_roads %>%
st_intersection(five_points_1mi_rad) %>%
dplyr::select(osm_id, name, HFCS, bicycle, highway) %>% #pick a few variables
mapview(
zcol = "highway",
color =set1_5,
map.types = c("CartoDB.DarkMatter", "CartoDB.Positron", "OpenStreetMap")
)
Let’s now construct a dataset of bicycle infrastructure. We’ll focus on a 1-mile radius around Monroe St NE and Ponce de Leon Ave NE, as there is a diverse mix of bicycle infrastructure in that area.
Define the bounding box using coordinates. The sequence is xmin, ymin, xmax, ymax, where x is longitude and y is latitude.
ponce_monroe_bbox = osmdata::opq(bbox =c(-84.38882296093993, 33.786737, -84.352018, 33.759368705358256))
We’d only expect to find on-road bicycle infrastructure on primary, secondary, tertiary, or residential roadways, so we need not pull in other types like motorways.
pm_primary = ponce_monroe_bbox %>%
add_osm_feature(key = "highway", value = "primary") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
pm_secondary = ponce_monroe_bbox %>%
add_osm_feature(key = "highway", value = "secondary") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
pm_tertiary = ponce_monroe_bbox %>%
add_osm_feature(key = "highway", value = "tertiary") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
pm_residential = ponce_monroe_bbox %>%
add_osm_feature(key = "highway", value = "residential") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
As mentioned, because these are each sf
objects, we can wrangle them using sf and dplyr functions. Use dplyr::select()
to narrow down the relevant variables (keys) and dplyr::mutate()
to add an indicator variable.
pm_roadways = pm_primary %>%
bind_rows(
pm_secondary,
pm_tertiary,
pm_residential
) %>%
#narrow down the variable names (keys). Soft code some of them in case they're not there.
dplyr::select(
osm_id, name, contains("HFCS"),
starts_with("highway"), #there is sometimes a key called highway_1, which this grabs.
contains("surface"), contains("sidewalk"),
contains("foot"),
contains("cycle"), contains("bicycle"),
contains("segregated")
) %>%
#make an indicator variable for this feature collection
mutate(
motor_or_nah = "motor"
)
pm_roadways %>%
mapview(
zcol = "highway",
color =set1_5,
map.types = c("CartoDB.DarkMatter", "CartoDB.Positron", "OpenStreetMap")
)
Some observations:
highway
key called highway_1
. Its value is cycleway
, referring to the protected bike lane along that corridor. It’s not clear if one would have enough information on this alone to know that this is a protected bike lane and not a conventional or buffered bike lane, though.cycleway = lane
), but there is no note about it being buffered.Explore the cycleway
and bicycle
keys.
table(pm_roadways$cycleway)
##
## lane no shared_lane
## 18 1 26
table(pm_roadways$cycleway.left)
##
## lane
## 3
table(pm_roadways$cycleway.right)
##
## lane shared_lane
## 2 9
table(pm_roadways$cycleway.both)
##
## no
## 2
table(pm_roadways$highway_1)
##
## cycleway
## 15
table(pm_roadways$bicycle)
##
## designated yes
## 69 2
Create a new variable indicating the type of on-street infrastructure per OSM.
pm_roadways_wrangle = pm_roadways %>%
mutate(
bike_infra = case_when(
#---Protected lanes----#
highway_1 == "cycleway" ~ "protected lane",
#---Lanes (conventional or buffered)-------#
cycleway == "lane" ~ "lane of unknown type",
cycleway.right == "lane" ~ "lane of unknown type",
cycleway.left == "lane" ~ "lane of unknown type",
#----Sharrow-------------#
cycleway == "shared_lane" ~ "sharrow",
cycleway.right == "shared_lane" ~ "sharrow",
cycleway.left == "shared_lane" ~ "sharrow"
)
)
dark2_3 = RColorBrewer::brewer.pal(n=3, name = "Dark2")
pm_roadways_wrangle %>%
filter(is.na(bike_infra)==FALSE) %>%
mapview(zcol = "bike_infra", color = dark2_3)
Grab the generic paths, cycleways, and footways.
pm_path = ponce_monroe_bbox %>%
add_osm_feature(key = "highway", value = "path") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
pm_cycleway = ponce_monroe_bbox %>%
add_osm_feature(key = "highway", value = "cycleway") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
pm_footway = ponce_monroe_bbox %>%
add_osm_feature(key = "highway", value = "footway") %>%
osmdata_sf() %>%
.$osm_lines %>%
st_as_sf()
Bind them together and visualize.
pm_non_motor = pm_path %>%
bind_rows(
pm_cycleway, pm_footway
) %>%
#narrow down the variable names again.
dplyr::select(
osm_id, name, contains("HFCS"),
starts_with("highway"), #There is sometimes a key called highway_1, which this picks up.
contains("surface"), contains("sidewalk"),
contains("cycle"), contains("bicycle"),
contains("segregated")
) %>%
#make an indicator variable for this feature collection
mutate(
motor_or_nah = "nah"
)
dark2_3 = RColorBrewer::brewer.pal(n=3, name = "Dark2")
pm_non_motor %>%
mapview(
zcol = "highway",
color =dark2_3,
map.types = c("CartoDB.Positron", "CartoDB.DarkMatter", "OpenStreetMap")
)
Some additional observations:
osm_id=741964056
) is coded as a cycleway, in addition to the highway_1=cycleway
noted above. That is, the same roadway has two geometries - one for the roadway itself and one for the protected bike lane paralleling it. As we notedosm_id=169957934
in Freedom Park is coded as a path, but is adjacent to very similar paved trails coded as cycleways.Consolidate the off-road infrastructure so it can be visualized together with the other on-road infrastructure.
pm_non_motor_wrangle = pm_non_motor %>%
mutate(
bike_infra =
case_when(
osm_id == 741964056 ~ "protected lane",
name == "Portman PATH" ~ "protected lane", #using some local knowledge
name == "Peachtree Center Cycle Track" ~ "protected lane",
name == "Stone Mountain Trail" ~ "protected lane", #probably would have been picked up with softer coding..
#otherwise...
highway == "cycleway" ~"paved trail",
highway == "path" & surface == "paved" ~ "paved trail"
)
)
pm_non_motor_wrangle %>%
filter(is.na(bike_infra)==FALSE) %>%
mapview(
lwd=4,
zcol = "bike_infra" ,color = c("purple", "green"))
Combine the on-street infrastructure with the off-street infrastructure.
dark2_5 = RColorBrewer::brewer.pal(n=4, name = "Dark2")
pm_non_motor_wrangle %>%
bind_rows(pm_roadways_wrangle) %>%
filter(is.na(bike_infra)==FALSE) %>%
mapview(
zcol = "bike_infra",
layer.name = "Bike infrastructure", #Label this once since it's the last one.
color = dark2_5,
lwd=4,
map.types = c("CartoDB.Positron", "CartoDB.DarkMatter", "OpenStreetMap")
)
For further reading, I’d suggest the article by Ferster & colleagues (2020):
Using OpenStreetMap to inventory bicycle infrastructure: A comparison with open data from cities.
Copyright © 2022 Michael D. Garber