# gtfs-dagster

Dagster setup that scrapes GTFS and GTFS-RT feeds for specified transit agencies and adds them to a DuckDB database.

## Quick start

1. Edit the `.env` file. Copy `env.sample` to `.env` and change:
   - Postgres database password: make it something random before the first run
   - MobilityDatabase.org API token
   - Location of the `data`, `config`, and `postgres_data` directories (default is the working directory). `config` is part of the repo, as it comes with sample configuration files.
2. Edit `config/agency_list.csv`
   - See `config/agency_list.csv.sample` for an example.
   - This file defines which agencies and feeds to scrape.
   - To include a transit agency, add its Feed IDs from mobilitydatabase.org.
3. Build the Docker containers: `docker compose build`
4. Run the Docker containers: `docker compose up -d`
5. Access the Dagster web UI at 127.0.0.1:3001
6. Materialize the first asset: `agency_list`

## To-do:

1. Change mobilitydata from using the API with a key to using the CSV on their GitHub page.
2. Load data into DuckDB
3. Transform data in DuckDB
4. Analyze data
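
Before a feed can be loaded into DuckDB (to-do item 2), its tables have to be read out of the GTFS zip archive, which by the GTFS Schedule specification is a set of comma-separated `.txt` files such as `stops.txt` and `routes.txt`. The following is a minimal sketch of that step using only the Python standard library. The function name `read_gtfs_tables` is hypothetical and not part of this repo, and the sketch assumes the feed has already been downloaded as a zip file.

```python
import csv
import io
import zipfile


def read_gtfs_tables(feed):
    """Return {table_name: [row dicts]} for every .txt file in a GTFS zip.

    `feed` is a path or a binary file-like object.
    Hypothetical helper for illustration; not part of this repo.
    """
    tables = {}
    with zipfile.ZipFile(feed) as archive:
        for name in archive.namelist():
            # GTFS tables are .txt files; skip anything else in the archive
            if not name.endswith(".txt"):
                continue
            with archive.open(name) as raw:
                # utf-8-sig drops the byte-order mark that some feeds include
                text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
                # "stops.txt" or "some_dir/stops.txt" both become "stops"
                table = name.rsplit("/", 1)[-1][: -len(".txt")]
                tables[table] = list(csv.DictReader(text))
    return tables
```

Each table comes back as a list of dictionaries keyed by the column names in the file's header row, with all values as strings. Type conversion and the actual load into DuckDB are left out here, since that part of the pipeline is still on the to-do list.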