
gtfs-dagster

A Dagster setup that scrapes GTFS and GTFS-RT feeds for specified transit agencies and loads them into a DuckDB database.

Input

The pipeline reads from the config/agency_list.csv file. Edit this file to list the transit agencies that you want to scrape, adding the relevant IDs from mobilitydatabase.org for each one.
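Before starting the stack, it can help to check that the edited CSV parses cleanly. The following is a minimal sketch, not part of this repository: it reads whatever header row the file has and returns the non-empty rows. The helper name `load_agencies`, the column names `agency_name` and `feed_id`, and the ID value in the sample are hypothetical placeholders; check config/agency_list.csv for the real column names.

```python
import csv
import io
from pathlib import Path


def _has_value(value):
    """True if a CSV field holds something other than whitespace."""
    return bool(value.strip()) if isinstance(value, str) else bool(value)


def load_agencies(source):
    """Return the rows of an agency list CSV as a list of dicts.

    `source` is either a path to the CSV or an open text file object.
    Keys come from the header row, whatever its column names are.
    """
    if isinstance(source, (str, Path)):
        with open(source, newline="", encoding="utf-8") as f:
            return load_agencies(f)
    reader = csv.DictReader(source)
    # Drop rows in which every field is blank (e.g. a stray ",," line).
    return [row for row in reader if any(_has_value(v) for v in row.values())]


# Hypothetical sample; the real file's columns may differ.
sample = io.StringIO("agency_name,feed_id\nExample Transit,mdb-0000\n,\n")
rows = load_agencies(sample)
print(f"{len(rows)} agency row(s) found")
```

To check the real file, call `load_agencies("config/agency_list.csv")` from the repository root.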

Set your environment

.env file

Copy env.sample to .env and change:

  • Postgres database password - make it something random before the first run
  • MobilityDatabase.org API token
  • Locations of the data, config, and postgres_data directories (by default, they are in the working directory)
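One way to produce a random Postgres password is Python's standard `secrets` module. This is only a sketch; the helper name `make_password` is hypothetical, and the name of the variable to paste the value into is whatever env.sample uses.

```python
import secrets


def make_password(n_bytes: int = 32) -> str:
    """Return a random URL-safe string built from n_bytes of randomness.

    The result uses only letters, digits, '-' and '_', so it should not
    need quoting in a .env file.
    """
    return secrets.token_urlsafe(n_bytes)


print(make_password())
```

Run it once, copy the printed value into .env, and keep that value unchanged after the first run.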

Run it

    docker compose build
    docker compose up -d

Access the Dagster web UI at 127.0.0.1:3001
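The containers may take a little while to start. As a rough sketch, the web UI address above can be probed from Python using only the standard library. The helper name `ui_is_up` is hypothetical, and treating any 2xx or 3xx response as "up" is an assumption rather than a documented Dagster health check.

```python
import http.client
import urllib.error
import urllib.request


def ui_is_up(url: str = "http://127.0.0.1:3001", timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` gets a 2xx or 3xx response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error status, or a malformed reply.
        return False


if __name__ == "__main__":
    print("Dagster web UI reachable:", ui_is_up())
```

If this prints False shortly after `docker compose up -d`, waiting a few seconds and retrying is a reasonable first step before inspecting the container logs.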