NRI dataset and initial score YAML configuration (#1534)

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* update be staging gha

* checkpoint

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* checkpoint

* PR Review

* renoving source url

* tests

* stop execution of ETL if there's a YAML schema issue

* update be staging gha

* adding source url as class var again

* clean up

* force cache bust

* gha cache bust

* dynamically set score vars from YAML

* docsctrings

* removing last updated year - optional reverse percentile

* passing tests

* sort order

* column ordening

* PR review

* class level vars

* Updating DatasetsConfig

* fix pylint errors

* moving metadata hint back to code

Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
This commit is contained in:
Jorge Escobar 2022-08-09 16:37:10 -04:00 committed by GitHub
commit 1c448a77f9
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
15 changed files with 272 additions and 3485 deletions

View file

@ -87,11 +87,6 @@ class TestNationalRiskIndexETL(TestETL):
assert etl.GEOID_FIELD_NAME == "GEOID10"
assert etl.GEOID_TRACT_FIELD_NAME == "GEOID10_TRACT"
assert etl.NAME == "national_risk_index"
assert etl.LAST_UPDATED_YEAR == 2020
assert (
etl.SOURCE_URL
== "https://hazards.fema.gov/nri/Content/StaticDocuments/DataDownload//NRI_Table_CensusTracts/NRI_Table_CensusTracts.zip"
)
assert etl.GEO_LEVEL == ValidGeoLevel.CENSUS_TRACT
assert etl.COLUMNS_TO_KEEP == [
etl.GEOID_TRACT_FIELD_NAME,
@ -109,6 +104,6 @@ class TestNationalRiskIndexETL(TestETL):
output_file_path = etl._get_output_file_path()
expected_output_file_path = (
data_path / "dataset" / "national_risk_index_2020" / "usa.csv"
data_path / "dataset" / "national_risk_index" / "usa.csv"
)
assert output_file_path == expected_output_file_path