Update documentation to make it easier for users to find the right content for them (#1016)

* First pass of updating documentation for new users

Trying to look at this from the perspective of someone new to the
project, and to create some pathways that make it easier for people to
get to the content they are looking for.

* Make it clear that Docker is doing the setup

* Link installation again from the main README

* Add some docs about the GitHub Actions workflows

* Add markdown link check

* Move git installation first

* Add config for markdown link checker
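
  For context, a markdown link check like this is typically wired up as a GitHub Actions workflow. The sketch below is only an assumption about what that might look like; the action, trigger, and file paths are illustrative, not necessarily what this commit adds:

  ```yaml
  # Hypothetical workflow file, e.g. .github/workflows/markdown-link-check.yml
  name: Check Markdown links

  on: [pull_request]

  jobs:
    markdown-link-check:
      runs-on: ubuntu-latest
      steps:
        - uses: actions/checkout@v2
        - uses: gaurav-nelson/github-action-markdown-link-check@v1
          with:
            # Points at the checker config mentioned below (path is an assumption).
            config-file: ".github/workflows/markdown-link-check-config.json"
  ```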

* Fix some links

* Correct handling of repo root relative links
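
  Root-relative links (paths starting with `/`) normally fail for a checker that only sees the file on disk. One common way to handle this in a markdown-link-check config is a replacement pattern; this is offered only as a hedged sketch of the idea, not the exact change made here:

  ```json
  {
    "replacementPatterns": [
      {
        "pattern": "^/",
        "replacement": "{{BASEURL}}/"
      }
    ]
  }
  ```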

* Fix broken links in data roadmap

* Fix more broken links

* Fix more links

* Ignore link that's returning a 403 to the checker

It actually works if you open it in a browser.
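
Some servers return 403 to automated clients while serving the page fine to a browser. URLs like this can be excluded via `ignorePatterns` in the checker config; the pattern below is a placeholder, not the actual link from this change:

```json
{
  "ignorePatterns": [
    {
      "pattern": "^https://example.com/page-that-403s-for-the-checker"
    }
  ]
}
```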

* Fix another broken link

* Ignore more URLs that don't work

* Update the readme under docs

* Add some more dataset links

* More strongly call out the quickstart

* Try to call out the quickstart link even more prominently

* Fix dead links

* Add note about initialization time

* Remove broken link from the Spanish install guide

These will be updated later with a full translation
Shaun Verch 2021-12-16 10:16:28 -05:00 committed by GitHub
commit d90e028c1b
17 changed files with 154 additions and 108 deletions

@@ -120,7 +120,7 @@ _For example: `poetry run python3 data_pipeline/application.py etl-run -d ejscre
1. These data sets are merged into a single dataframe using their Census Block Group GEOID as a common key, and the data in each of the columns is standardized in two ways:
- Their [percentile rank](https://en.wikipedia.org/wiki/Percentile_rank) is calculated, which tells us what percentage of other Census Block Groups have a lower value for that particular column.
- They are normalized using [min-max normalization](https://en.wikipedia.org/wiki/Feature_scaling), which adjusts the scale of the data so that the Census Block Group with the highest value for that column is set to 1, the Census Block Group with the lowest value is set to 0, and all of the other values are adjusted to fit within that range based on how close they were to the highest or lowest value.
-1. The standardized columns are then used to calculate each of the Justice40 score experiments described in greater detail below, and the results are exported to a `.csv` file in [`data/score/csv`](data/score/csv)
+1. The standardized columns are then used to calculate each of the Justice40 score experiments described in greater detail below, and the results are exported to a `.csv` file in [`data_pipeline/data/score/csv`](data_pipeline/data/score/csv)
#### Step 4: Compare the Justice40 score experiments to other indices
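
To make the two standardization steps described above concrete, here is a small pandas sketch of percentile rank and min-max normalization. The column names and values are hypothetical, and this is an illustration of the described transformations rather than the pipeline's actual code:

```python
import pandas as pd

# Hypothetical data keyed by Census Block Group GEOID.
df = pd.DataFrame(
    {
        "GEOID10": ["010010201001", "010010201002", "010010202001"],
        "poverty_rate": [0.12, 0.45, 0.30],
    }
)

# Percentile rank: what share of block groups have a lower value for this column.
df["poverty_rate_percentile"] = df["poverty_rate"].rank(pct=True)

# Min-max normalization: rescale so the lowest value maps to 0 and the highest to 1.
col = df["poverty_rate"]
df["poverty_rate_min_max"] = (col - col.min()) / (col.max() - col.min())

print(df)
```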
@@ -143,13 +143,13 @@ _NOTE:_ This may take several minutes or over an hour to fully execute and gener
### Data Sources
-- **[EJSCREEN](etl/sources/ejscreen):** TODO Add description of data source
-- **[Census](etl/sources/census):** TODO Add description of data source
-- **[American Communities Survey](etl/sources/census_acs):** TODO Add description of data source
-- **[Housing and Transportation](etl/sources/housing_and_transportation):** TODO Add description of data source
-- **[HUD Housing](etl/sources/hud_housing):** TODO Add description of data source
-- **[HUD Recap](etl/sources/hud_recap):** TODO Add description of data source
-- **[CalEnviroScreen](etl/scores/calenviroscreen):** TODO Add description of data source
+- **[EJSCREEN](data_pipeline/etl/sources/ejscreen):** TODO Add description of data source
+- **[Census](data_pipeline/etl/sources/census):** TODO Add description of data source
+- **[American Communities Survey](data_pipeline/etl/sources/census_acs):** TODO Add description of data source
+- **[Housing and Transportation](data_pipeline/etl/sources/housing_and_transportation):** TODO Add description of data source
+- **[HUD Housing](data_pipeline/etl/sources/hud_housing):** TODO Add description of data source
+- **[HUD Recap](data_pipeline/etl/sources/hud_recap):** TODO Add description of data source
+- **[CalEnviroScreen](data_pipeline/etl/sources/calenviroscreen):** TODO Add description of data source
## Running using Docker