gtfs-dagster

Dagster setup that scrapes GTFS and GTFS-RT feeds for specified transit agencies and loads them into a DuckDB database.

Input

The pipeline reads from the data/gtfs/agency_list.csv file. Edit this file to include the transit agencies that you want to scrape, adding the relevant IDs from mobilitydatabase.org.
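Before starting the pipeline, it can help to confirm that the edited file still parses as CSV. The following is a minimal sketch using only the Python standard library. It makes no assumption about the column names (it reports whatever header the file has), and the helper name read_agency_list is hypothetical, not part of this repo.

```python
import csv
from pathlib import Path


def read_agency_list(path="data/gtfs/agency_list.csv"):
    """Return (header, rows); each row is a dict keyed by the header names."""
    with Path(path).open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        header = reader.fieldnames or []
    # csv.DictReader stores extra fields under the key None and fills
    # missing fields with the value None, so either one signals a row
    # whose field count does not match the header.
    for n, row in enumerate(rows, start=1):
        if None in row or None in row.values():
            raise ValueError(f"{path}: data row {n} does not match the header")
    return header, rows
```

Calling read_agency_list() from the repo root returns the header and the rows, or raises ValueError if a row has too many or too few fields, which usually points to a stray or missing comma.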

Set your environment

Data directory

Edit dagster.yaml to specify the correct data directory under run_launcher.

It currently reads /home/ben/code/gtfs-dagster/data:/opt/dagster/app/data. Change the first part (before the colon) to the directory where you want the data to be stored.
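To make clear which part of the mapping changes, here is a small sketch. It assumes the value follows Docker's usual host:container bind-mount form. The helper name with_host_path and the example path /srv/gtfs/data are hypothetical.

```python
def with_host_path(mapping, new_host_path):
    """Swap the host side of a 'host:container' mapping, keeping the container side."""
    # Split on the first colon only, so anything after the host path is preserved.
    host, sep, container = mapping.partition(":")
    if not sep or not host or not container:
        raise ValueError(f"expected 'host:container', got {mapping!r}")
    return f"{new_host_path}:{container}"


current = "/home/ben/code/gtfs-dagster/data:/opt/dagster/app/data"
print(with_host_path(current, "/srv/gtfs/data"))  # /srv/gtfs/data is a placeholder
# prints /srv/gtfs/data:/opt/dagster/app/data
```

Only the host side changes. The container side, /opt/dagster/app/data, stays as it is.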

.env file

Copy env.sample to .env, then change the database password and the Mobility Database refresh token.

Run it

    docker compose build
    docker compose up -d

Access the Dagster web UI at 127.0.0.1:3001.
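The containers can take a moment to start, so a quick reachability check can save some page reloading. This is a minimal sketch using only the Python standard library. It assumes the UI is served over plain HTTP at the address above, and the helper name dagster_ui_is_up is hypothetical.

```python
import urllib.error
import urllib.request


def dagster_ui_is_up(url="http://127.0.0.1:3001", timeout=3.0):
    """Return True if something answers HTTP at `url` with a non-error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready.
        return False
```

A True result only shows that something is responding on that port, not that the Dagster code location loaded correctly. Check the web UI itself for that.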