mirror of
https://github.com/DOI-DO/j40-cejst-2.git
synced 2025-08-14 22:41:40 -07:00
Initial refactor for Score ETL (#618)
* WIP refactor
* Extract score calculations into their own methods
* do all initial df prep in single method
* Fix error in docs for running etl for single dataset
* WIP understanding HUD and linguistic iso data
* Add comments from initial group review on PR

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
parent 470c474367
commit ac62933d16
4 changed files with 200 additions and 141 deletions
@@ -94,7 +94,7 @@ TODO add mermaid diagram
3. Each ETL script will extract the data from its original source, then format the data into `.csv` files that get stored in the relevant folder in `data_pipeline/data/dataset/`. For example, HUD Housing data is stored in `data_pipeline/data/dataset/hud_housing/usa.csv`
_**NOTE:** You have the option to pass the name of a specific data source to the `etl-run` command using the `-d` flag, which will limit the execution of the ETL process to that specific data source._
-_For example: `poetry run etl -- -d ejscreen` would only run the ETL process for EJSCREEN data._
+_For example: `poetry run etl -d ejscreen` would only run the ETL process for EJSCREEN data._
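
As a rough illustration of how a `-d` flag can narrow an ETL run to a single data source, here is a minimal sketch built on Python's standard `argparse` and `csv` modules. It is not the repository's implementation: the `ETL_STEPS` registry, the `extract_*` functions, the `--data-root` option, and the placeholder rows are all hypothetical. Only the output layout (`data_pipeline/data/dataset/<name>/usa.csv`) follows the README text in this diff.

```python
import argparse
import csv
from pathlib import Path
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


def extract_hud_housing() -> List[Row]:
    # Hypothetical stand-in for a real extract step; returns placeholder rows.
    return [{"id": "example-1", "value": "placeholder"}]


def extract_ejscreen() -> List[Row]:
    # Hypothetical stand-in for a real extract step; returns placeholder rows.
    return [{"id": "example-1", "value": "placeholder"}]


# Hypothetical registry mapping a data source name to its extract function.
ETL_STEPS: Dict[str, Callable[[], List[Row]]] = {
    "hud_housing": extract_hud_housing,
    "ejscreen": extract_ejscreen,
}


def run_etl(data_root: Path, dataset: Optional[str] = None) -> List[Path]:
    """Run every ETL step, or only `dataset` when one is named."""
    if dataset is not None and dataset not in ETL_STEPS:
        raise ValueError(f"Unknown data source: {dataset!r}")
    names = [dataset] if dataset is not None else sorted(ETL_STEPS)

    written: List[Path] = []
    for name in names:
        rows = ETL_STEPS[name]()
        # Each source gets its own folder under <data_root>/dataset/.
        out_path = data_root / "dataset" / name / "usa.csv"
        out_path.parent.mkdir(parents=True, exist_ok=True)
        with out_path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        written.append(out_path)
    return written


def main(argv: Optional[List[str]] = None) -> List[Path]:
    parser = argparse.ArgumentParser(description="Sketch of a per-source ETL runner")
    parser.add_argument("-d", "--dataset", default=None,
                        help="limit the run to one data source")
    parser.add_argument("--data-root", default="data_pipeline/data",
                        help="hypothetical option: where output folders are created")
    args = parser.parse_args(argv)
    return run_etl(Path(args.data_root), args.dataset)
```

Under these assumptions, `main(["-d", "ejscreen"])` writes only `dataset/ejscreen/usa.csv` under the data root, and `main([])` runs every registered source.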
#### Step 3: Calculate the Justice40 score experiments