
gtfs-dagster

Dagster setup that scrapes GTFS and GTFS-RT feeds for specified transit agencies and loads them into a DuckDB database.

Input

The pipeline reads its list of agencies from config/agency_list.csv. Edit this file to include the transit agencies you want to scrape, adding the relevant IDs from mobilitydatabase.org.
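Before starting the pipeline, it can help to confirm that the edited file still parses. The following is a minimal sketch, assuming only that agency_list.csv has a header row. The column names are read from the file rather than assumed, and `load_agencies` is a hypothetical helper that is not part of this repository.

```python
import csv
from pathlib import Path


def load_agencies(path="config/agency_list.csv"):
    """Return (column names, rows as dicts) read from the agency list CSV.

    Hypothetical helper for checking the file by hand; the pipeline's own
    loading code may differ.
    """
    with Path(path).open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)  # each row is a dict keyed by the header names
        columns = list(reader.fieldnames or [])
    return columns, rows
```

Calling `load_agencies()` from the repository root returns the header names and one dict per agency, which makes a missing ID or a stray delimiter easy to spot.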

Set your environment

Data directory

Edit dagster.yaml to specify the correct data directory under run_launcher.

The entry currently reads /home/ben/code/gtfs-dagster/data:/opt/dagster/app/data. Change the first part, before the colon, to the directory where you want the data to be stored.

.env file

Copy env.sample to .env, then change the database password and the Mobility Database refresh token.
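One way to catch a forgotten placeholder is to compare .env against env.sample. This is a rough sketch, assuming both files use plain KEY=VALUE lines. The helper names `parse_env` and `unchanged_keys` are hypothetical, and no variable names from env.sample are assumed.

```python
from pathlib import Path


def parse_env(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def unchanged_keys(sample_path="env.sample", env_path=".env"):
    """Return keys that are missing from .env or still hold the sample value."""
    sample = parse_env(sample_path)
    env = parse_env(env_path)
    return sorted(k for k, v in sample.items() if env.get(k, v) == v)
```

Some entries may be meant to keep their sample value, so the result is a list of things to double-check rather than a list of errors.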

Run it

    docker compose build
    docker compose up -d

Access the Dagster web UI at 127.0.0.1:3001.
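The containers can take a moment to start, so a quick reachability check may be handy in scripts. The sketch below uses only the Python standard library and assumes the UI answers plain HTTP on the address given above. `ui_is_up` is a hypothetical helper, not something this repository provides.

```python
import http.client
import urllib.request


def ui_is_up(url="http://127.0.0.1:3001", timeout=5):
    """Return True if an HTTP request to the Dagster web UI succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP error statuses all land here
        # (urllib.error.URLError is a subclass of OSError).
        return False
```

If `ui_is_up()` keeps returning False, `docker compose ps` and `docker compose logs` are the usual places to look next.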