j40-cejst-2

mirror of https://github.com/DOI-DO/j40-cejst-2.git synced 2025-07-28 13:51:16 -07:00

Author	SHA1	Message	Date
Jorge Escobar	e8e951fe9a	Rezip CSV and Excel with Codebook (#1971) * Rezip CSV and Excel files with Codebook * codebook version * packages fix * pydantic * lint * Remove markdown link from markdown checker (#1936) Co-authored-by: Vim <86254807+vim-usds@users.noreply.github.com>	2022-10-04 15:45:09 -04:00
Jorge Escobar	1c448a77f9	NRI dataset and initial score YAML configuration (#1534) * update be staging gha * NRI dataset and initial score YAML configuration * checkpoint * adding data checks for release branch * passing tests * adding INPUT_EXTRACTED_FILE_NAME to base class * lint * columns to keep and tests * update be staging gha * checkpoint * update be staging gha * NRI dataset and initial score YAML configuration * checkpoint * adding data checks for release branch * passing tests * adding INPUT_EXTRACTED_FILE_NAME to base class * lint * columns to keep and tests * checkpoint * PR Review * removing source url * tests * stop execution of ETL if there's a YAML schema issue * update be staging gha * adding source url as class var again * clean up * force cache bust * gha cache bust * dynamically set score vars from YAML * docstrings * removing last updated year - optional reverse percentile * passing tests * sort order * column ordering * PR review * class level vars * Updating DatasetsConfig * fix pylint errors * moving metadata hint back to code Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2022-08-09 16:37:10 -04:00
Emma Nechamkin	2279a04c94	Quick fix: updating snapshots to have more sigfigs (#1409) Updated snapshots to include 10 digits after the decimal	2022-03-14 21:44:35 -04:00
Emma Nechamkin	9d920d4db4	Updating testing to include pytest-snapshot (#1355) In this commit, we slightly change the testing to use `pytest-snapshot`. This is for `ETL`s only.	2022-03-11 21:34:07 -05:00
Emma Nechamkin	1b76a68838	FEMA data check (#1270) we wanted to implement a slightly different FEMA AG LOSS indicator. Here, we take the 90th percentile only of tracts that have agvalue, and then we also floor the denominator of the rate calculation (loss/total value) at $408k	2022-02-17 16:53:04 -05:00
Lucas Merrill Brown	3e37d9d1a3	Issue 1075: update snapshots using command-line flag (#1249) * Adding skippable tests using command-line flag	2022-02-14 12:16:52 -05:00
Lucas Merrill Brown	a0d6e55f0a	Run ETL processes in parallel (#1253) * WIP on parallelizing * switching to get_tmp_path for nri * switching to get_tmp_path everywhere necessary * fixing linter errors * moving heavy ETLs to front of line * add hold * moving cdc places up * removing unnecessary print * moving h&t up * adding parallel to geo post * better census labels * switching to concurrent futures * fixing output	2022-02-11 14:04:53 -05:00
Lucas Merrill Brown	43e005cc10	Issue 1075: Add refactored ETL tests to NRI (#1088) * Adds a substantially refactored ETL test to the National Risk Index, to be used as a model for other tests	2022-02-08 19:05:32 -05:00
Saran Ahluwalia	fdba1eb171	Revisions to FEMA measure and new link for FEMA data (#952) * per tract collect all disaster total annual expected loss - numerator * add updated numerators * EALP columns are missing on tox check - this will ensure only EALP columns that exist are subset on * EALB columns are missing on tox check - this will ensure only EALP columns that exist are subset on * reverted to incorporate megatracts * updated unit tests * fix tests * add transform * remove print statement * input reflects input from FEMA risks for tracts * revise tests and update fixtures - clean up tests and main transform function * added more records * remove references to Blocks in keyword args in tests * linting * addressed latest PR feedback * remove imports and update arguments to be compatible for 1.1.0 * remove block reference in test * change precision to 10 digits - refactor tests to accommodate this Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>	2021-12-03 12:42:07 -05:00
Lucas Merrill Brown	537844236a	Update FEMA data to be tracts, not block groups (#906)	2021-11-30 13:49:20 -05:00
Lucas Merrill Brown	21834b4a91	Issue 883: Update FEMA risk index measure (#884) * ETL updated * Adding three fields to score	2021-11-13 11:32:15 -05:00
Lucas Merrill Brown	03e59f2abd	Definition L updates (#862) * Changing FEMA risk measure * Adding "basic stats" feature to comparison tool * Tweaking Definition L	2021-11-05 15:43:52 -04:00
Billy Daly	d1273b63c5	Add ETL Contract Checks (#619) * Adds dev dependencies to requirements.txt and re-runs black on codebase * Adds test and code for national risk index etl, still in progress * Removes test_data from .gitignore * Adds test data to nation_risk_index tests * Creates tests and ETL class for NRI data * Adds tests for load() and transform() methods of NationalRiskIndexETL * Updates README.md with info about the NRI dataset * Adds to dos * Moves tests and test data into a tests/ dir in national_risk_index * Moves tmp_dir for tests into data/tmp/tests/ * Promotes fixtures to conftest and relocates national_risk_index tests: The relocation of national_risk_index tests is necessary because tests can only use fixtures specified in conftests within the same package * Fixes issue with df.equals() in test_transform() * Files reformatted by black * Commit changes to other files after re-running black * Fixes unused import that caused lint checks to fail * Moves tests/ directory to app root for data_pipeline * Adds new methods to ExtractTransformLoad base class: - __init__() Initializes class attributes - _get_census_fips_codes() Loads a dataframe with the fips codes for census block group and tract - validate_init() Checks that the class was initialized correctly - validate_output() Checks that the output was loaded correctly * Adds test for ExtractTransformLoad.__init__() and base.py * Fixes failing flake8 test * Changes geo_col to geoid_col and changes is_dataset to is_census in yaml * Adds test for validate_output() * Adds remaining tests * Removes is_dataset from init method * Makes CENSUS_CSV a class attribute instead of a class global: This ensures that CENSUS_CSV is only set when the ETL class is for a non-census dataset and removes the need to overwrite the value in mock_etl fixture * Re-formats files with black and fixes broken tox tests	2021-10-13 15:54:15 -04:00
Shelby Switzer	d3a18352fc	Add pytest to tox run in CI/CD (#713) * Add pytest to tox run in CI/CD * Try fixing tox dependencies for pytest * update poetry to get ci/cd passing * Run poetry export with --dev flag to include dev dependencies such as pytest * WIP updating test fixtures to include PDF * Remove dev dependencies from reqs and add pytest to envlist to make build faster * passing score_post tests * Add pytest tox (#729) * Fix failing pytest * Fixes failing tox tests and updates requirements.txt to include dev deps * pickle protocol 4 Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov> Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov> Co-authored-by: Billy Daly <williamdaly422@gmail.com> Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>	2021-09-22 13:47:37 -04:00
Billy Daly	f0900f7b69	Adds National Risk Index data to ETL pipeline (#549) * Adds dev dependencies to requirements.txt and re-runs black on codebase * Adds test and code for national risk index etl, still in progress * Removes test_data from .gitignore * Adds test data to nation_risk_index tests * Creates tests and ETL class for NRI data * Adds tests for load() and transform() methods of NationalRiskIndexETL * Updates README.md with info about the NRI dataset * Adds to dos * Moves tests and test data into a tests/ dir in national_risk_index * Moves tmp_dir for tests into data/tmp/tests/ * Promotes fixtures to conftest and relocates national_risk_index tests: The relocation of national_risk_index tests is necessary because tests can only use fixtures specified in conftests within the same package * Fixes issue with df.equals() in test_transform() * Files reformatted by black * Commit changes to other files after re-running black * Fixes unused import that caused lint checks to fail * Moves tests/ directory to app root for data_pipeline	2021-09-07 20:51:34 -04:00

15 commits