* added tribalId for Supplemental dataset (#1804)
* Setting zoom levels for tribal map (#1810)
* NRI dataset and initial score YAML configuration (#1534)
* update be staging gha
* NRI dataset and initial score YAML configuration
* checkpoint
* adding data checks for release branch
* passing tests
* adding INPUT_EXTRACTED_FILE_NAME to base class
* lint
* columns to keep and tests
* update be staging gha
* checkpoint
* update be staging gha
* NRI dataset and initial score YAML configuration
* checkpoint
* adding data checks for release branch
* passing tests
* adding INPUT_EXTRACTED_FILE_NAME to base class
* lint
* columns to keep and tests
* checkpoint
* PR Review
* removing source url
* tests
* stop execution of ETL if there's a YAML schema issue
* update be staging gha
* adding source url as class var again
* clean up
* force cache bust
* gha cache bust
* dynamically set score vars from YAML
* docstrings
* removing last updated year - optional reverse percentile
* passing tests
* sort order
* column ordering
* PR review
* class level vars
* Updating DatasetsConfig
* fix pylint errors
* moving metadata hint back to code
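
Two of the steps listed above, stopping the ETL when there is a YAML schema issue and dynamically setting score vars from YAML, could look roughly like the sketch below. It is not the code from this change: it assumes the YAML has already been parsed into a dict, and `REQUIRED_KEYS`, `DatasetConfigError`, `validate_dataset_config`, `ExampleETL`, and the config field names are all hypothetical.

```python
# Hypothetical sketch: validate an already-parsed dataset config and copy
# its values onto an ETL instance. All names here are illustrative assumptions.

REQUIRED_KEYS = {
    "module_name": str,
    "input_geoid_tract_field_name": str,
    "load_fields": list,
}


class DatasetConfigError(ValueError):
    """Raised to stop the ETL when the config does not match the schema."""


def validate_dataset_config(config: dict) -> dict:
    """Return the config unchanged, or raise if a required key is wrong."""
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in config:
            raise DatasetConfigError(f"missing required key: {key}")
        if not isinstance(config[key], expected_type):
            raise DatasetConfigError(
                f"{key} must be of type {expected_type.__name__}"
            )
    return config


class ExampleETL:
    def __init__(self, config: dict):
        # Validation happens first, so a bad config stops execution here.
        validated = validate_dataset_config(config)
        for field in validated["load_fields"]:
            # Set one attribute per configured field, named after the field.
            setattr(
                self,
                field["df_field_name"] + "_FIELD_NAME",
                field["long_name"],
            )
```

Raising from the constructor means a malformed config fails before any extract or transform work starts.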
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
* Correct copy typo (#1809)
* Add basic test suite for COI (#1518)
* Update COI to use new yaml (#1518)
* Add tests for DOE energy burden (#1518)
* Add dataset config for energy burden (#1518)
* Refactor ETL to use datasets.yml (#1518)
* Add fake GEOIDs to COI tests (#1518)
* Refactor _setup_etl_instance_and_run_extract to base (#1518)
For the three classes we've done so far, a generic
_setup_etl_instance_and_run_extract works fine. For the moment we
can reuse the same setup method until we decide future classes need more
flexibility, and they can always override it in a subclass.
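
As a rough illustration of the point above, a shared setup method on a base test class can be reused as-is or overridden by a subclass. Only the method name `_setup_etl_instance_and_run_extract` comes from this change; `FakeETL`, `ETLTestBase`, `CustomETLTest`, and their attributes are hypothetical stand-ins.

```python
# Hypothetical sketch of a shared test base class. Everything except the
# method name _setup_etl_instance_and_run_extract is an illustrative assumption.
from pathlib import Path


class FakeETL:
    """Stand-in for an ETL class under test."""

    def __init__(self):
        self.extracted = False
        self.tmp_path = None

    def extract(self):
        self.extracted = True


class ETLTestBase:
    _ETL_CLASS = FakeETL  # each concrete test class points at its own ETL

    def _setup_etl_instance_and_run_extract(self, tmp_dir: Path):
        """Generic setup: build the ETL, point it at a temp dir, run extract."""
        etl = self._ETL_CLASS()
        etl.tmp_path = tmp_dir
        etl.extract()
        return etl


class CustomETLTest(ETLTestBase):
    def _setup_etl_instance_and_run_extract(self, tmp_dir: Path):
        # A class that needs more flexibility extends the generic setup.
        etl = super()._setup_etl_instance_and_run_extract(tmp_dir)
        etl.extra_fixture_loaded = True
        return etl
```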
* Add output-path tests (#1518)
* Update YAML to match constant (#1518)
* Don't blindly set float format (#1518)
* Add defaults for extract (#1518)
* Run YAML load on all subclasses (#1518)
* Update description fields (#1518)
* Update YAML per final format (#1518)
* Update fixture tract IDs (#1518)
* Update base class refactor (#1518)
Now that NRI is final I needed to make a small number of updates to my
refactored code.
* Remove old comment (#1518)
* Fix type signature and return (#1518)
* Update per code review (#1518)
Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
Co-authored-by: Vim <86254807+vim-usds@users.noreply.github.com>
* update be staging gha
* NRI dataset and initial score YAML configuration
* checkpoint
* adding data checks for release branch
* passing tests
* adding INPUT_EXTRACTED_FILE_NAME to base class
* lint
* columns to keep and tests
* update be staging gha
* checkpoint
* update be staging gha
* NRI dataset and initial score YAML configuration
* checkpoint
* adding data checks for release branch
* passing tests
* adding INPUT_EXTRACTED_FILE_NAME to base class
* lint
* columns to keep and tests
* checkpoint
* PR Review
* removing source url
* tests
* stop execution of ETL if there's a YAML schema issue
* update be staging gha
* adding source url as class var again
* clean up
* force cache bust
* gha cache bust
* dynamically set score vars from YAML
* docstrings
* removing last updated year - optional reverse percentile
* passing tests
* sort order
* column ordering
* PR review
* class level vars
* Updating DatasetsConfig
* fix pylint errors
* moving metadata hint back to code
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
We wanted to implement a slightly different FEMA AG LOSS indicator. Here, we take the 90th percentile only among tracts that have agvalue, and we also floor the denominator of the rate calculation (loss / total value) at $408k.
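
A minimal sketch of that calculation, under stated assumptions: the $408k floor and the 90th-percentile cutoff come from the description above, while the function names, the input shape, the "agvalue greater than zero" eligibility test, and the percentile-rank definition (share of eligible tracts with a rate at or below this one) are illustrative choices, not the pipeline's actual code.

```python
# Hypothetical sketch of the FEMA ag loss indicator described above.
AG_VALUE_FLOOR = 408_000  # dollars; floor applied to the rate denominator


def ag_loss_rate(expected_loss: float, ag_value: float) -> float:
    """Loss / total value, with the denominator floored at AG_VALUE_FLOOR."""
    return expected_loss / max(ag_value, AG_VALUE_FLOOR)


def percentile_rank(value: float, population: list) -> float:
    """Share of population values that are less than or equal to value."""
    return sum(v <= value for v in population) / len(population)


def flag_high_ag_loss(tracts: dict, threshold: float = 0.90) -> dict:
    """Map tract id -> True if its rate is at or above the threshold.

    tracts maps tract id -> (expected_loss, ag_value). Only tracts with
    agricultural value enter the percentile calculation (assumed: value > 0).
    """
    eligible = {
        tract: ag_loss_rate(loss, value)
        for tract, (loss, value) in tracts.items()
        if value > 0
    }
    rates = list(eligible.values())
    return {
        tract: tract in eligible
        and percentile_rank(eligible[tract], rates) >= threshold
        for tract in tracts
    }
```

Flooring the denominator keeps tracts with very little agricultural value from producing extreme rates, and restricting the percentile to tracts with agvalue keeps zero-value tracts from diluting the ranking.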
* WIP on parallelizing
* switching to get_tmp_path for nri
* switching to get_tmp_path everywhere necessary
* fixing linter errors
* moving heavy ETLs to front of line
* add hold
* moving cdc places up
* removing unnecessary print
* moving h&t up
* adding parallel to geo post
* better census labels
* switching to concurrent futures
* fixing output
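
The parallelization steps above (moving heavy ETLs to the front of the line, switching to concurrent futures) could be sketched like this. The `run_etl` placeholder, the dataset names, and the choice of a thread pool are assumptions made for a self-contained example; the actual change may use a different executor and worker count.

```python
# Hypothetical sketch: submit heavy ETLs first, then collect all results.
import concurrent.futures


def run_etl(name: str) -> str:
    """Placeholder for one dataset's extract/transform/load steps."""
    return f"{name} done"


def run_all(etl_names: list, heavy: set, max_workers: int = 4):
    # sorted() is stable and False sorts before True, so heavy ETLs move to
    # the front while every other ETL keeps its original relative order.
    ordered = sorted(etl_names, key=lambda name: name not in heavy)
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(run_etl, name): name for name in ordered}
        for future in concurrent.futures.as_completed(futures):
            # result() re-raises any exception from the worker, so one
            # failed ETL surfaces here instead of being silently dropped.
            results[futures[future]] = future.result()
    return ordered, results
```

Submitting the slowest jobs first lets them start running while the lighter ETLs fill in the remaining workers.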
* per tract collect all disaster total annual expected loss - numerator
* add updated numerators
* EALP columns are missing on tox check - this will ensure only EALP columns that exist are subset on
* EALB columns are missing on tox check - this will ensure only EALP columns that exist are subset on
* reverted to incorporate megatracts
* updated unit tests
* fix tests
* add transform
* remove print statement
* input reflects input from FEMA risks for tracts
* revise tests and update fixtures - clean up tests and main transform function
* added more records
* remove references to Blocks in keyword args in tests
* linting
* addressed latest PR feedback
* remove imports and update arguments to be compatible for 1.1.0
* remove block reference in test
* change precision to 10 digits - refactor tests to accommodate this
Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>
* Adds dev dependencies to requirements.txt and re-runs black on codebase
* Adds test and code for national risk index etl, still in progress
* Removes test_data from .gitignore
* Adds test data to national_risk_index tests
* Creates tests and ETL class for NRI data
* Adds tests for load() and transform() methods of NationalRiskIndexETL
* Updates README.md with info about the NRI dataset
* Adds to-dos
* Moves tests and test data into a tests/ dir in national_risk_index
* Moves tmp_dir for tests into data/tmp/tests/
* Promotes fixtures to conftest and relocates national_risk_index tests:
The relocation of national_risk_index tests is necessary because tests
can only use fixtures specified in conftests within the same package
* Fixes issue with df.equals() in test_transform()
* Files reformatted by black
* Commit changes to other files after re-running black
* Fixes unused import that caused lint checks to fail
* Moves tests/ directory to app root for data_pipeline