Refactor DOE Energy Burden and COI to use YAML (#1796)

* added tribalId for Supplemental dataset (#1804)

* Setting zoom levels for tribal map (#1810)

* NRI dataset and initial score YAML configuration (#1534)

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* update be staging gha

* checkpoint

* update be staging gha

* NRI dataset and initial score YAML configuration

* checkpoint

* adding data checks for release branch

* passing tests

* adding INPUT_EXTRACTED_FILE_NAME to base class

* lint

* columns to keep and tests

* checkpoint

* PR Review

* removing source url

* tests

* stop execution of ETL if there's a YAML schema issue

* update be staging gha

* adding source url as class var again

* clean up

* force cache bust

* gha cache bust

* dynamically set score vars from YAML

* docstrings

* removing last updated year - optional reverse percentile

* passing tests

* sort order

* column ordering

* PR review

* class level vars

* Updating DatasetsConfig

* fix pylint errors

* moving metadata hint back to code

Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>

* Correct copy typo (#1809)

* Add basic test suite for COI (#1518)

* Update COI to use new yaml (#1518)

* Add tests for DOE energy burden (#1518)

* Add dataset config for energy burden (#1518)

* Refactor ETL to use datasets.yml (#1518)

* Add fake GEOIDs to COI tests (#1518)

* Refactor _setup_etl_instance_and_run_extract to base (#1518)

For the three classes we've done so far, a generic
_setup_etl_instance_and_run_extract works fine. For the moment we can
reuse the same setup method until we decide future classes need more
flexibility, and they can always override it in a subclass.

* Add output-path tests (#1518)

* Update YAML to match constant (#1518)

* Don't blindly set float format (#1518)

* Add defaults for extract (#1518)

* Run YAML load on all subclasses (#1518)

* Update description fields (#1518)

* Update YAML per final format (#1518)

* Update fixture tract IDs (#1518)

* Update base class refactor (#1518)

Now that NRI is final I needed to make a small number of updates to my
refactored code.

* Remove old comment (#1518)

* Fix type signature and return (#1518)

* Update per code review (#1518)

Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
Co-authored-by: Vim <86254807+vim-usds@users.noreply.github.com>
Matt Bowen 2022-08-10 16:02:59 -04:00 committed by Emma Nechamkin
commit 97e17546cc
28 changed files with 455 additions and 189 deletions
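
The commit message describes moving `_setup_etl_instance_and_run_extract` into a base class so that several ETL test classes share one setup method, with subclasses overriding it when they need more flexibility. The following is a minimal sketch of that pattern, not the project's actual code: `ExampleETL`, `BaseETLTest`, `ZippedInputETLTest`, `_ETL_CLASS`, `_SAMPLE_DATA_FILE_NAME`, and the `unzipped` attribute are all hypothetical names chosen for illustration.

```python
from pathlib import Path
from typing import Optional, Type


class ExampleETL:
    """Hypothetical stand-in for an ETL class; not the project's real class."""

    def __init__(self) -> None:
        self.extracted: bool = False
        self.source_path: Optional[Path] = None

    def extract(self, source_path: Path) -> None:
        # A real ETL would download or unzip data here.
        # This stub only records that extract was called and with what path.
        self.source_path = source_path
        self.extracted = True


class BaseETLTest:
    """Hypothetical base test class that owns the generic setup helper."""

    _ETL_CLASS: Type[ExampleETL] = ExampleETL
    _SAMPLE_DATA_FILE_NAME: str = "input.csv"

    def _setup_etl_instance_and_run_extract(self, data_dir: Path) -> ExampleETL:
        # Generic setup: build the ETL named by the class variable and
        # run extract against the sample file named by the class variable.
        etl = self._ETL_CLASS()
        etl.extract(data_dir / self._SAMPLE_DATA_FILE_NAME)
        return etl


class ZippedInputETLTest(BaseETLTest):
    """A subclass overrides the helper only when it needs different setup."""

    _SAMPLE_DATA_FILE_NAME = "input.zip"

    def _setup_etl_instance_and_run_extract(self, data_dir: Path) -> ExampleETL:
        etl = super()._setup_etl_instance_and_run_extract(data_dir)
        etl.unzipped = True  # hypothetical extra step for this dataset
        return etl
```

Test classes that only differ in which ETL class or sample file they use change the class variables and inherit the helper unchanged. Only a dataset with unusual setup needs to override the method.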

@@ -77,3 +77,55 @@ datasets:
        df_field_name: "CONTAINS_AGRIVALUE"
        long_name: "Contains agricultural value"
        field_type: bool
  - long_name: "Child Opportunity Index 2.0 database"
    short_name: "coi"
    module_name: "child_opportunity_index"
    input_geoid_tract_field_name: "geoid"
    load_fields:
      - short_name: "he_heat"
        df_field_name: "EXTREME_HEAT_FIELD"
        long_name: "Summer days above 90F"
        field_type: float
        include_in_downloadable_files: true
        include_in_tiles: true
      - short_name: "he_food"
        long_name: "Percent low access to healthy food"
        df_field_name: "HEALTHY_FOOD_FIELD"
        field_type: float
        include_in_downloadable_files: true
        include_in_tiles: true
      - short_name: "he_green"
        long_name: "Percent impenetrable surface areas"
        df_field_name: "IMPENETRABLE_SURFACES_FIELD"
        field_type: float
        include_in_downloadable_files: true
        include_in_tiles: true
      - short_name: "ed_reading"
        df_field_name: "READING_FIELD"
        long_name: "Third grade reading proficiency"
        field_type: float
        include_in_downloadable_files: true
        include_in_tiles: true
  - long_name: "Low-Income Energy Affordabililty Data"
    short_name: "LEAD"
    module_name: "doe_energy_burden"
    input_geoid_tract_field_name: "FIP"
    load_fields:
      - short_name: "EBP_PFS"
        df_field_name: "REVISED_ENERGY_BURDEN_FIELD_NAME"
        long_name: "Energy burden"
        field_type: float
        include_in_downloadable_files: true
        include_in_tiles: true
  - long_name: "Example ETL"
    short_name: "Example"
    module_name: "example_dataset"
    input_geoid_tract_field_name: "GEOID10_TRACT"
    load_fields:
      - short_name: "EXAMPLE_FIELD"
        df_field_name: "Input Field 1"
        long_name: "Example Field 1"
        field_type: float
        include_in_tiles: true
        include_in_downloadable_files: true
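
The hunk above adds per-dataset entries, and the commit message mentions stopping the ETL on a schema issue and setting score variables dynamically from the YAML. The sketch below shows one way an ETL class might consume such an entry. It rests on several assumptions: the entry is written as a Python dict (abbreviated to one field, with the nesting under `datasets` and `load_fields` inferred from the keys) so the example needs no YAML parser; the sets of required keys are guesses based on the keys visible in the hunk; and `get_dataset_config`, `apply_field_names`, `tile_columns`, and `ChildOpportunityIndexSketch` are hypothetical names, not the project's API.

```python
from typing import Any, Dict, List

# One entry from the hunk above, abbreviated to a single field.
DATASETS_CONFIG: Dict[str, Any] = {
    "datasets": [
        {
            "long_name": "Child Opportunity Index 2.0 database",
            "short_name": "coi",
            "module_name": "child_opportunity_index",
            "input_geoid_tract_field_name": "geoid",
            "load_fields": [
                {
                    "short_name": "he_heat",
                    "df_field_name": "EXTREME_HEAT_FIELD",
                    "long_name": "Summer days above 90F",
                    "field_type": "float",
                    "include_in_downloadable_files": True,
                    "include_in_tiles": True,
                },
            ],
        },
    ]
}

# Assumed required keys; the real schema may differ.
REQUIRED_DATASET_KEYS = {
    "long_name",
    "short_name",
    "module_name",
    "input_geoid_tract_field_name",
    "load_fields",
}
REQUIRED_FIELD_KEYS = {"short_name", "df_field_name", "long_name", "field_type"}


def get_dataset_config(config: Dict[str, Any], module_name: str) -> Dict[str, Any]:
    """Return the entry for module_name, raising if it is missing or malformed."""
    for dataset in config["datasets"]:
        if dataset.get("module_name") != module_name:
            continue
        missing = REQUIRED_DATASET_KEYS - dataset.keys()
        if missing:
            raise ValueError(f"{module_name}: missing dataset keys {sorted(missing)}")
        for field in dataset["load_fields"]:
            missing = REQUIRED_FIELD_KEYS - field.keys()
            if missing:
                raise ValueError(f"{module_name}: missing field keys {sorted(missing)}")
        return dataset
    raise ValueError(f"No dataset configured for module {module_name!r}")


def apply_field_names(etl_class: type, dataset: Dict[str, Any]) -> None:
    """Set one class attribute per field: df_field_name -> long_name."""
    for field in dataset["load_fields"]:
        setattr(etl_class, field["df_field_name"], field["long_name"])


def tile_columns(dataset: Dict[str, Any]) -> List[str]:
    """List the long names of fields flagged for inclusion in map tiles."""
    return [
        field["long_name"]
        for field in dataset["load_fields"]
        if field.get("include_in_tiles", False)
    ]


class ChildOpportunityIndexSketch:
    """Hypothetical ETL class whose column-name attributes come from config."""
```

Calling `apply_field_names(ChildOpportunityIndexSketch, get_dataset_config(DATASETS_CONFIG, "child_opportunity_index"))` would give the class an `EXTREME_HEAT_FIELD` attribute holding the column's long name. Raising `ValueError` before any extract step is one way to halt the run when the configuration is malformed.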