edited sensors to account for a fresh start, edited README

2025-12-07 09:27:41 -07:00
commit e068cbed20
parent b2571a8a48
4 changed files with 124 additions and 78 deletions
--- a/README.md
+++ b/README.md
@@ -2,24 +2,31 @@

 Dagster setup that scrapes GTFS and GTFS-RT for specified transit agencies and adds them to a DuckDB

-## Input
-You define which agencies and feeds to scrape with the file`config/agency_list.csv`
+## Quick start

-To include the transit agencies that you want to scrape, add the relevant IDs from mobilitydatabase.org
-
-See `config/agency_list.csv.sample` for an example.
-
-## set your environment
-
-### .env file
+1. Edit the .env file.
 copy `env.sample` to `.env` and change:
- Postgres database password - make it something random before the first run
- MobilityDatabase.org API token
- Location of data, config, and postgres_data directories (default is in working directory)
+  - Postgres database password - make it something random before the first run
+  - MobilityDatabase.org API token
+  - Location of `data`, `config`, and `postgres_data` directories (default is in working directory). `config` is part of the repo as it comes with sample configuration files.

+2. Edit `config/agency_list.csv`
+  - See `config/agency_list.csv.sample` for an example.
+  - Use this file to define which agencies and feeds to scrape.
+  - To include a transit agency, add its feed IDs from mobilitydatabase.org.

-
-# Run it
+3. Build the Docker containers
 `docker compose build`
+
+4. Run the Docker containers
 `docker compose up -d`
-access the Dagster web ui at 127.0.0.1:3001
+
+5. Access the Dagster web UI at 127.0.0.1:3001
+
+6. Materialize the first asset: `agency_list`
+
+## To-do:
+1. Switch the MobilityData source from the key-authenticated API to the CSV on their GitHub page.
+2. Load data into DuckDB
+3. Transform data in DuckDB
+4. Analyze data