route_analysis/WI-schools-cycle.Rmd

293 lines
9.2 KiB
Plaintext
Raw Normal View History

---
title: "Wisconsin Cycling to School"
output:
html_document:
toc: true
toc_depth: 5
toc_float:
collapsed: false
smooth_scroll: true
editor_options:
chunk_output_type: console
---
```{r preCode, eval = TRUE, echo = FALSE, results = "show", warning = FALSE, error = TRUE, message = FALSE}
date()
```
# Input Data & Configuration
## Libraries
```{r libs, eval = TRUE, echo = FALSE, results = "show", warning = FALSE, error = TRUE, message = FALSE}
rm(list=ls())
library(tidyverse)
library(ggmap)
library(sf)
library(osrm)
library(reactable)
2024-11-27 09:45:47 -06:00
library(smoothr)
library(httr)
fig.height <- 6
set.seed(1)
source("./R/functions.R")
2024-11-21 17:10:46 -06:00
runLoop <- FALSE
```
## Configuration
Set configuration parameters for OSRM, brouter, and stadiamaps.
```{r config, eval = TRUE, echo = FALSE, results = "show", warning = TRUE, error = TRUE, message = TRUE}
options(osrm.server = "http://127.0.0.1:5001/")
options(osrm.profile = "bike")
brouter_url <- "http://127.0.0.1:17777/brouter"
brouter_profile <- "safety"
register_stadiamaps(key = substr(read_file(file = "api_keys/stadia_api_key"), 1, 36))
WI_schools <- st_transform(st_read(dsn = "data/Schools/Wisconsin_Public_Schools_-5986231931870160084.gpkg", quiet = TRUE), crs = 4326)
WI_schools <- WI_schools %>% mutate(geom = SHAPE)
```
# Analysis
We focus on the statistic *non-cycleway duration* in this analysis. It
is computed as the duration (in minutes) of the bike trip to school
(brouter, safety) for each grid cell in the school's bikeable
area. A bikeable area is defined as the region within 3 miles of
school by bike (OSRM).
## Subset Schools
```{r subsetSchools, eval = TRUE, echo = TRUE, results = "show", warning = TRUE, error = TRUE, message = TRUE}
WI_schools <- subset(WI_schools, !is.na(LAT) & !is.na(LON) & GRADE_RANGE == "09-12")
```
We keep only schools with coordinates (non-virtual) and, for
simplicity and efficiency of the initial analysis, we keep only
schools with grades 9-12.
## Loop through WI Schools
For each school we compute the grid and the routes sf objects and save
them as lists as R data files, _gridList.rds_ and
_routesList.rds_. These will then be analyzed downstream and this loop
need not be run again. It took around 40 minutes to run. The code here
is suppressed because it is long and ugly.
```{r mainloop, eval = runLoop, echo = FALSE, results = "show", warning = FALSE, error = TRUE, message = TRUE}
radius <- 3 # miles
levels <- c(1)
res <- 100
threshold <- units::set_units(1, km^2)
gridList <- list()
routesList <- list()
indexVec <- 1:nrow(WI_schools)
2024-11-21 15:13:37 -06:00
jj <- 1;
bad.school.vec <- c()
for(j in indexVec){
school_location <- WI_schools[j,]
cycle_boundary_m <- radius*1609
cycle_boundary_poly <- osrmIsodistance( loc = school_location, breaks = cycle_boundary_m, res = res )
cycle_boundary_poly <- st_make_valid(cycle_boundary_poly)
cycle_boundary_poly <- fill_holes(cycle_boundary_poly, threshold)
cycle_boundary_poly <- st_transform(cycle_boundary_poly, crs = 4326)
cellsize <- 1e-2
grid <- st_make_grid(cycle_boundary_poly, cellsize = cellsize, what = "polygons", square = FALSE)
grid <- st_intersection(cycle_boundary_poly, grid)
grid <- st_make_valid(grid)
grid_pts <- st_centroid(grid)
grid_coods <- st_coordinates(grid_pts)
school_focus_location <- school_location %>% select(LAT, LON)
routes <- list(NULL)
for(i in 1:nrow(grid_coods) ) {
query <- paste0(
brouter_url,
"?lonlats=", grid_coods[i,1], ",",grid_coods[i,2], "|",
school_focus_location$LON, ",", school_focus_location$LAT,
"&profile=", brouter_profile,
"&alternativeidx=0&format=geojson"
)
response <- GET(query)
2024-11-27 09:45:47 -06:00
if( response$status_code == "200" ) {
route_run <- st_read(content <- content(response, as = "text"), quiet = TRUE)
routes[[i]] <- route_run
2024-11-27 09:45:47 -06:00
} else {
routes[[i]] <- NA
}
}
bad.cell <- which(is.na(routes))
2024-11-27 09:45:47 -06:00
if(length(bad.cell) > 0) {
routes <- routes[-bad.cell]
grid <- grid[-bad.cell,]
}
2024-11-27 09:45:47 -06:00
if(length(routes) > 0) {
routes <- st_transform(bind_rows(routes), crs = 4326)
gridList[[jj]] <- grid
routesList[[jj]] <- routes
jj <- jj + 1
2024-11-27 09:45:47 -06:00
} else {
routes <- NA
bad.school.vec <- c(j, bad.school.vec)
cat( WI_schools$SCHOOL[j], "has zero routes to school and has been removed from analysis.\n")
}
}
2024-11-27 09:45:47 -06:00
if(length(bad.school.vec) > 0) {
2024-11-21 15:13:37 -06:00
WI_schools <- WI_schools[-bad.school.vec,]
}
saveRDS(WI_schools, "./R/data/WI_schools.rds")
saveRDS(gridList, "./R/data/gridList.rds")
saveRDS(routesList, "./R/data/routesList.rds")
```
2024-11-21 17:10:46 -06:00
## Read List Data
```{r readLists, eval = TRUE, echo = FALSE, results = "show", warning = TRUE, error = TRUE, message = TRUE}
routesList <- readRDS(file = "./R/data/routesList.rds")
gridList <- readRDS(file = "./R/data/gridList.rds")
2024-11-21 17:10:46 -06:00
WI_schools <- readRDS(file = "./R/data/WI_schools.rds")
not.cycleway.vec <- c()
for(j in 1:length(gridList)){
grid <- gridList[[j]]
routes <- routesList[[j]]
total.time.vec <- routes %>% pull(total.time)
grid <- cbind(grid, total.time = as.numeric(total.time.vec))
x.vec <- c()
for( i in 1:nrow(grid) ){
route <- routes[i,"messages"] # Grid cell i to school j
x <- routeChar(route)
x.vec <- c(x.vec, x)
}
grid <- cbind(grid, T.cycleway = x.vec)
grid <- cbind( grid, not.cycleway = (grid$total.time - grid$T.cycleway)/60)
gridList[[j]] <- grid
not.cycleway.vec <- c(not.cycleway.vec, median(grid$not.cycleway))
}
2024-11-21 17:10:46 -06:00
```
## Plot List Data
### Median Non-Cycleway Duration
2024-11-21 17:10:46 -06:00
#### Histogram
First we investigate the distribution of median non-cycleway duration
across school. Recall that we are considering now only schools grade
9-12.
```{r hist, eval = TRUE, echo = FALSE, results = "show", warning = FALSE, error = TRUE, message = FALSE}
ggplot(data.frame(not.cycleway = not.cycleway.vec), aes(not.cycleway)) + geom_histogram(fill = "orange", color = "black") + theme_bw()
```
Next, we take a look at the schools with the shortest and longest
median time on cycleway. Note that the analysis is peformed across a
gridded area and not with respect to where students live. The median
non-cycleway duration is computed across grid cells, not students.
Note too that this statistics was computed by parsing the *messages*
field of the route returned by brouter. I am not sure if there is a
better way to do this. Within the messages field there is information
on highway type, surface, etc for each segment of the route.
#### Longest
2024-11-21 17:10:46 -06:00
```{r worst, eval = TRUE, echo = FALSE, results = "show", warning = TRUE, error = TRUE, message = FALSE}
2024-11-21 17:10:46 -06:00
register_stadiamaps(key = substr(read_file(file = "api_keys/stadia_api_key"), 1, 36))
k <- 306
2024-11-21 17:10:46 -06:00
zoom.level <- 15
2024-11-21 17:10:46 -06:00
bbox <- st_bbox(st_buffer(gridList[[k]], dist = 500))
bbox <- c(left = as.double(bbox[1]),
bottom = as.double(bbox[2]),
right = as.double(bbox[3]),
top = as.double(bbox[4]))
basemap <- get_stadiamap(bbox = bbox, zoom = zoom.level, maptype = "stamen_toner_lite")
ggmap(basemap) + geom_sf(data = gridList[[k]], aes(fill= not.cycleway), inherit.aes = FALSE) + scale_fill_gradient(low = "yellow", high = "red", limits = c(0,17), na.value = NA)
```
2024-11-21 17:10:46 -06:00
The longest is `r WI_schools[k,] |> pull(SCHOOL)`.
#### Shortest
```{r best, eval = TRUE, echo = FALSE, results = "show", warning = TRUE, error = TRUE, message = FALSE}
k <- 247
2024-11-21 17:10:46 -06:00
bbox <- st_bbox(st_buffer(gridList[[k]], dist = 500))
bbox <- c(left = as.double(bbox[1]),
bottom = as.double(bbox[2]),
right = as.double(bbox[3]),
top = as.double(bbox[4]))
basemap <- get_stadiamap(bbox = bbox, zoom = zoom.level, maptype = "stamen_toner_lite")
ggmap(basemap) + geom_sf(data = gridList[[k]], aes(fill= not.cycleway), inherit.aes = FALSE) + scale_fill_gradient(low = "yellow", high = "red", limits = c(0,17), na.value = NA)
```
The shortest is `r WI_schools[k,] |> pull(SCHOOL)`.
2024-11-21 17:10:46 -06:00
### Statewide Map
```{r plots2, eval = TRUE, echo = FALSE, results = "show", warning = TRUE, error = TRUE, message = FALSE}
2024-11-21 17:10:46 -06:00
D <- cbind(WI_schools, not.cycleway.vec)
zoom.level <- 8
bbox <- st_bbox(st_buffer(D, dist = 10e3))
bbox <- c(left = as.double(bbox[1]),
bottom = as.double(bbox[2]),
right = as.double(bbox[3]),
top = as.double(bbox[4]))
basemap <- get_stadiamap(bbox = bbox, zoom = zoom.level, maptype = "stamen_toner_lite")
ggmap(basemap) + geom_sf(data = D, aes(size = 2, color = not.cycleway.vec), inherit.aes = FALSE) + scale_color_gradient(low = "yellow", high = "red", na.value = NA) # , limits = c(0,17)
```
### Statewide Table
The values shown above can be seen below in this clickable table.
```{r table, eval = TRUE, echo = FALSE, results = "show", warning = TRUE, error = TRUE, message = TRUE}
D_table <- as.data.frame(cbind(WI_schools, not.cycleway.vec))
D <- select(D_table, SCHOOL, COUNTY, NONCYCLEWAY = not.cycleway.vec)
D <- D |> mutate(NONCYCLEWAY = round(NONCYCLEWAY,1))
reactable(D)
```
```{r date, eval = TRUE, echo = FALSE, results = "show", warning = TRUE, error = TRUE, message = TRUE}
date()
```
# Archive
## Session Info
```{r sessionInfo, eval = TRUE, echo = FALSE, results = "show", warning = TRUE, error = TRUE, message = TRUE}
sessionInfo()
```