j40-cejst-2

mirror of https://github.com/DOI-DO/j40-cejst-2.git synced 2025-02-23 10:04:18 -08:00

Author	SHA1	Message	Date
Jorge Escobar	0a21fc6b12	Add territory boundary data (#885 ) * Add territory boundary data * housing and transp * lint * lint * lint	2021-11-16 10:05:09 -05:00
Lucas Merrill Brown	e8d64df510	Fixing missing FEMA fields (#892 )	2021-11-15 11:06:44 -05:00
Lucas Merrill Brown	21834b4a91	Issue 883: Update FEMA risk index measure (#884 ) * ETL updated * Adding three fields to score	2021-11-13 11:32:15 -05:00
Lucas Merrill Brown	05ebf9b48c	Add median house value to Definition L (#882 ) * Added house value to ETL * Adding house value to score formula and comp tool	2021-11-13 10:29:23 -05:00
Vincent La	b0dbc90064	[ISS-723] Load Census Data for 4 Territories (#816 ) * Adding census decennial data for island territories	2021-11-09 16:32:46 -05:00
Jorge Escobar	053dde0d40	Display score L on map (#849 ) * updates to first docker run * tile constants * frontend changes * updating pickles instructions * pickles	2021-11-05 16:26:14 -04:00
Lucas Merrill Brown	03e59f2abd	Definition L updates (#862 ) * Changing FEMA risk measure * Adding "basic stats" feature to comparison tool * Tweaking Definition L	2021-11-05 15:43:52 -04:00
Shelby Switzer	a0bf186ee6	Add percentile column for L (#851 ) * Add percentile column for L * Use Definition instead of Score Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-11-04 13:03:56 -04:00
Lucas Merrill Brown	8372b47d42	Various updates to Definition L (#850 ) * removing percentiles as separate field names * adding RMP	2021-11-04 12:17:45 -04:00
Lucas Merrill Brown	1d541be447	Add EJSCREEN Areas of Concern (#843 ) * Adding ej screen areas of concern * Uses it where user has local files, but not otherwise Co-authored-by: VincentLaUSDS <vincent.la@omb.eop.gov>	2021-11-02 15:38:42 -04:00
Shelby Switzer	7bd1a9e59e	Big ole score refactor (#815 ) * WIP * Create ScoreCalculator This calculates all the factors for score L for now (with placeholder formulae because this is a WIP). I think ideally we'll want to refactor all the score code to be extracted into this or similar classes. * Add factor logic for score L Updated factor logic to match score L factors methodology. Still need to get the Score L field itself working. Cleanup needed: Pull field names into constants file, extract all score calculation into score calculator * Update thresholds and get score L calc working * Update header name for consistency and update comparison tool * Initial move of score to score calculator * WIP big refactor * Continued WIP on score refactor * WIP score refactor * Get to a working score-run * Refactor to pass df to score init This makes it easier to pass df around within a class with multiple methods that require df. * Updates from Black * Updates from linting * Use named imports instead of wildcard; log more * Additional refactors * move more field names to field_names constants file * import constants without a relative path (would break docker) * run linting * raise error if add_columns is not implemented in a child class * Refactor dict to namedtuple in score c * Update L to use all percentile field * change high school ed field in L back Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-11-02 14:12:53 -04:00
Jorge Escobar	1b17af84c8	Combine + Tilefy (#806 ) * init * score-post * added score csv s3 download; remove poetry cmds from readme * working census tile fetch * PR review * Github Actions Work	2021-11-01 18:05:05 -04:00
Shelby Switzer	7b87e0ec99	Add Score L (#812 ) * Create ScoreCalculator This calculates all the factors for score L for now (with placeholder formulae because this is a WIP). I think ideally we'll want to refactor all the score code to be extracted into this or similar classes. * Add factor logic for score L Updated factor logic to match score L factors methodology. Still need to get the Score L field itself working. Cleanup needed: Pull field names into constants file, extract all score calculation into score calculator Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov> Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2021-10-28 16:07:41 -04:00
Jorge Escobar	a94b8e2761	final census GHA	2021-10-14 13:50:56 -04:00
Jorge Escobar	8ddfc6b305	Update application.py	2021-10-14 13:31:37 -04:00
Jorge Escobar	3b04356fb3	Data sources from S3 (#769 ) * Started 535 * Data sources from S3 * lint * remove breakpoints * PR comments * lint * census data completed * lint * renaming data source	2021-10-13 16:00:33 -04:00
Billy Daly	d1273b63c5	Add ETL Contract Checks (#619 ) * Adds dev dependencies to requirements.txt and re-runs black on codebase * Adds test and code for national risk index etl, still in progress * Removes test_data from .gitignore * Adds test data to nation_risk_index tests * Creates tests and ETL class for NRI data * Adds tests for load() and transform() methods of NationalRiskIndexETL * Updates README.md with info about the NRI dataset * Adds to dos * Moves tests and test data into a tests/ dir in national_risk_index * Moves tmp_dir for tests into data/tmp/tests/ * Promotes fixtures to conftest and relocates national_risk_index tests: The relocation of national_risk_index tests is necessary because tests can only use fixtures specified in conftests within the same package * Fixes issue with df.equals() in test_transform() * Files reformatted by black * Commit changes to other files after re-running black * Fixes unused import that caused lint checks to fail * Moves tests/ directory to app root for data_pipeline * Adds new methods to ExtractTransformLoad base class: - __init__() Initializes class attributes - _get_census_fips_codes() Loads a dataframe with the fips codes for census block group and tract - validate_init() Checks that the class was initialized correctly - validate_output() Checks that the output was loaded correctly * Adds test for ExtractTransformLoad.__init__() and base.py * Fixes failing flake8 test * Changes geo_col to geoid_col and changes is_dataset to is_census in yaml * Adds test for validate_output() * Adds remaining tests * Removes is_dataset from init method * Makes CENSUS_CSV a class attribute instead of a class global: This ensures that CENSUS_CSV is only set when the ETL class is for a non-census dataset and removes the need to overwrite the value in mock_etl fixture * Re-formats files with black and fixes broken tox tests	2021-10-13 15:54:15 -04:00
Shelby Switzer	d8c73e6a02	Change downloadable file names (#708 ) * Change downloadable file names * Remove constants because we're dynamically creating these * Update to "communities" for the descriptor word based on team convo * Add timestamp in 2020-09-20-0930 format because I personally think this is the best ^.^ * Add a CLI command to run ETL Score Post so that we don't have to run the score generation just to get new downloadable files. * Also make sure the old downloadable files are cleaned up on the run of this command. * Remove unused library, thanks pylint! Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-10-01 15:04:37 -04:00
Jorge Escobar	2f8f2240b4	added new PDF file (#745 )	2021-09-23 13:34:50 -04:00
Lucas Merrill Brown	b1a4d26be8	Adding persistent poverty tracts (#738 ) * persistent poverty working * fixing left-padding * running black and adding persistent poverty to comp tool * fixing bug * running black and fixing linter * fixing linter * fixing linter error	2021-09-22 17:57:08 -04:00
Shelby Switzer	d3a18352fc	Add pytest to tox run in CI/CD (#713 ) * Add pytest to tox run in CI/CD * Try fixing tox dependencies for pytest * update poetry to get ci/cd passing * Run poetry export with --dev flag to include dev dependencies such as pytest * WIP updating test fixtures to include PDF * Remove dev dependencies from reqs and add pytest to envlist to make build faster * passing score_post tests * Add pytest tox (#729) * Fix failing pytest * Fixes failing tox tests and updates requirements.txt to include dev deps * pickle protocol 4 Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov> Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov> Co-authored-by: Billy Daly <williamdaly422@gmail.com> Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>	2021-09-22 13:47:37 -04:00
Vincent La	7709836a12	Ticket 355: Adding map to Urban vs Rural Census Tracts (#696 ) * Adding urban vs rural notebook * Adding new code * Adding settings * Adding usa.csv * Adding etl * Adding etl * Adding to etl_score * quick changes to notebook * Ensuring notebook can run * Adding urban vs rural notebook * Adding new code * Adding settings * Adding usa.csv * Adding etl * Adding etl * Adding to etl_score * quick changes to notebook * Ensuring notebook can run * adding urban to comparison tool * renaming file * adding urban rural to more comp tool outputs * updating requirements and poetry * Adding ej screen notebook * removing ej screen notebook since it's in justice40-tool-iss-719 Co-authored-by: La <ryy0@cdc.gov> Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2021-09-22 12:31:03 -04:00
Jorge Escobar	cd33f323c8	Revised Columns on Download File + PDF (#701 ) * Revised Columns on Download File + PDF * finishing ticket	2021-09-17 13:11:23 -04:00
Lucas Merrill Brown	a1a988da46	Minor updates to scoring comparison tool (#686 ) * Formatting updates for output XLSX	2021-09-16 14:06:33 -05:00
Jorge Escobar	487f6a8e04	Score Indicators (#690 ) * Score Indicators * rounding issue with housing burden column * switching out score g * final list of columns * removing duplicate housing burden percentile fields * removing duplicate Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2021-09-16 10:53:05 -04:00
Lucas Merrill Brown	1c0d87d84b	Add FEMA risk index to score file (#687 ) * Add to score file	2021-09-15 13:31:32 -05:00
Lucas Merrill Brown	e94d05882c	Issue 675 & 676: Adding life expectancy and DOE energy burden data (#683 ) * Adding two new data sources.	2021-09-15 09:59:28 -05:00
Jorge Escobar	fc5ed37fca	dependabot bump pillow (#681 ) * dependabot bump pillow * updated poetry * adding encoding to file open	2021-09-14 17:28:59 -04:00
Lucas Merrill Brown	52e70653f0	Prototype H (#682 )	2021-09-14 16:16:41 -05:00
Jorge Escobar	5bd63c083b	Run all Census, ETL, Score, Combine and Tilefy in one command (#662 ) * Run all Census, ETL, Score, Combine and Tilefy in one command * docker cmd * some docker improvements * feedback updates * lint	2021-09-14 14:15:34 -04:00
Lucas Merrill Brown	1083e953da	Prototype G (#672 ) * wip * cleanup * cleanup 2 * fixing import ordering linter error * updating backend to use score G * adding percentile to score output * update tippecanoe compression Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>	2021-09-14 10:48:11 -04:00
Jorge Escobar	879cb7d022	hotfix wrong score tile csv path (#671 ) * hotfix wrong score tile csv path * updating test * forcing update * triggering action	2021-09-14 07:27:48 -04:00
Lucas Merrill Brown	7d13be7651	Ticket 492: Integrate Area Median Income and Poverty measures into ETL (#660 ) * Loading AMI and poverty data	2021-09-13 15:36:35 -05:00
Shelby Switzer	d7274888b6	Update downloadable zip file (#659 ) * Update downloadable zip file * Don't use spaces in the name, as per #620 * Add the score D columns, as per #596 * fix paths and directories in etl_score_post while the tests seemed to be passing, I encountered an error when running poetry run score, which was caused by us creating a directory called <name>.csv, instead of creating the parent directory. Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-09-10 16:06:47 -04:00
Nat Hillard	536a35d6a0	Data Unit Tests (#509 ) * Fixes #341 - As a J40 developer, I want to write Unit Tests for the ETL files, so that tests are run on each commit * Location bug * Adding Load tests * Fixing XLSX filename * Adding downloadable zip test * updating pickle * Fixing pylint warnings * Update readme to correct some typos and reorganize test content structure * Removing unused schemas file, adding details to readme around pickles, per PR feedback * Update test to pass with Score D added to score file; update path in readme * fix requirements.txt after merge * fix poetry.lock after merge Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-09-10 14:17:34 -04:00
Shelby Switzer	ac62933d16	Initial refactor for Score ETL (#618 ) * WIP refactor * Extract score calculations into their own methods * do all initial df prep in single method * Fix error in docs for running etl for single dataset * WIP understanding HUD and linguistic iso data * Add comments from initial group review on PR Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-09-10 10:34:34 -04:00
Jorge Escobar	470c474367	Updated README (#652 ) * Updated README - Added a link to the full score data set on S3 - Some Docker updates * typo	2021-09-10 10:15:46 -04:00
Jorge Escobar	327e27e713	Add Score D to USA Low (#629 ) * added score D * Adding Score D to usa-low * rounding score d * small vscode update * last couple of vscode changes * uncommitted vscode changes	2021-09-08 16:44:26 -04:00
Jorge Escobar	1953d2fcd8	Additional VSCode and Poetry tasks added (#624 ) * additional tasks added * Update launch.json	2021-09-08 14:54:38 -04:00
dependabot[bot]	f4ffcc6a53	Bump pillow from 8.3.1 to 8.3.2 in /data/data-pipeline (#625 ) Bumps [pillow](https://github.com/python-pillow/Pillow) from 8.3.1 to 8.3.2. - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst) - [Commits](https://github.com/python-pillow/Pillow/compare/8.3.1...8.3.2) --- updated-dependencies: - dependency-name: pillow dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-09-08 13:08:58 -04:00
Billy Daly	f0900f7b69	Adds National Risk Index data to ETL pipeline (#549 ) * Adds dev dependencies to requirements.txt and re-runs black on codebase * Adds test and code for national risk index etl, still in progress * Removes test_data from .gitignore * Adds test data to nation_risk_index tests * Creates tests and ETL class for NRI data * Adds tests for load() and transform() methods of NationalRiskIndexETL * Updates README.md with info about the NRI dataset * Adds to dos * Moves tests and test data into a tests/ dir in national_risk_index * Moves tmp_dir for tests into data/tmp/tests/ * Promotes fixtures to conftest and relocates national_risk_index tests: The relocation of national_risk_index tests is necessary because tests can only use fixtures specified in conftests within the same package * Fixes issue with df.equals() in test_transform() * Files reformatted by black * Commit changes to other files after re-running black * Fixes unused import that caused lint checks to fail * Moves tests/ directory to app root for data_pipeline	2021-09-07 20:51:34 -04:00
Jorge Escobar	94298635c2	Add to decimal rounding (#623 ) * added score D * forgot to add decimal rounding	2021-09-07 14:30:45 -04:00
Jorge Escobar	99503a2541	added score D (#621 )	2021-09-07 13:37:16 -04:00
Jorge Escobar	f5ba63977a	Hotfix for Readme and ACS File name (#563 )	2021-08-24 17:01:12 -04:00
Lucas Merrill Brown	65ceb7900f	Score F, testing methodology (#510 ) * fixing dependency issue * fixing more dependencies * including fraction of state AMI * wip * nitpick whitespace * etl working now * wip on scoring * fix rename error * reducing metrics * fixing score f * fixing readme * adding dependency * passing tests; * linting/black * removing unnecessary sample * fixing error * adding verify flag on etl/base Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>	2021-08-24 16:40:54 -04:00
Jorge Escobar	c24e13c930	Update GHA to push only client changes to S3 (#543 )	2021-08-16 17:00:43 -04:00
Shelby Switzer	2c79396550	Initial draft for data provenance addition to README (#528 ) * Initial draft for data provenance We want to make the data usable/available at every step of our data pipeline. This starts the addition to the README that spells out the data provenance and where each version of the data as it goes through our pipeline lives. * Update README with placeholders for next steps in data provenance * Add coming soon placeholders for remaining data locations Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-08-16 16:45:54 -04:00
Jorge Escobar	c19cd3ee55	hotfix on float cols (#526 )	2021-08-13 15:48:31 -04:00
Vim	1dbb1018d6	sets column as percentiles (#525 ) * sets column as percentiles * adds trailing comma	2021-08-13 12:01:34 -07:00
Jorge Escobar	773c035493	AWS Sync Public Read (#508 ) * adding layer to mvts * small fix for GHA * AWS Sync Public Read * removed temp file * updated state media income ftp	2021-08-12 14:17:25 -04:00

76 commits