j40-cejst-2

mirror of https://github.com/DOI-DO/j40-cejst-2.git synced 2025-02-23 10:04:18 -08:00

Author	SHA1	Message	Date
Shelby Switzer	617f41526f	Update Census AMI to ETL into tracts, not CBGs (#900 ) * Update Census AMI to ETL into tracts, not CBGs Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov> Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2021-11-30 13:49:20 -05:00
Lucas Merrill Brown	537844236a	Update FEMA data to be tracts, not block groups (#906 )	2021-11-30 13:49:20 -05:00
Shelby Switzer	893758f1d4	Use tract instead of block group when calling census API (#901 ) * Use tract instead of block group when calling census API * fixing merge conflicts Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov> Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2021-11-30 13:49:20 -05:00
Shelby Switzer	0c8b32e679	Move Housing and Transportation Index to tracts (#903 ) Update data download URL to use tract as focus, use tract field name, and move this dataset to the tracts df list in etl_score. Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-11-30 13:49:20 -05:00
Lucas Merrill Brown	776a52595f	Switching island territories data to tracts (#879 )	2021-11-30 13:49:20 -05:00
Saran Ahluwalia	b0c176daee	Remove inplace argument to prevent SettingWithCopyError (#899 ) * removed inplace argument to prevent copies of dataframe to be set and chained assignment to propagate and raise exception * removed inplace argument to prevent copies of dataframe to be set and chained assignment to propagate and raise exception * remove superfluous pandas options that affect flake results * remove (again) the same chained assignment from previous merge Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>	2021-11-29 13:27:23 -05:00
Saran Ahluwalia	ec8f3543e5	Remove Index related to FEMA (#917 ) Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>	2021-11-24 16:50:09 -05:00
Lucas Merrill Brown	474d010bf4	Quick fix on island territories directory name (#877 )	2021-11-16 14:31:11 -05:00
Jorge Escobar	0a21fc6b12	Add territory boundary data (#885 ) * Add territory boundary data * housing and transp * lint * lint * lint	2021-11-16 10:05:09 -05:00
Lucas Merrill Brown	e8d64df510	Fixing missing FEMA fields (#892 )	2021-11-15 11:06:44 -05:00
Lucas Merrill Brown	21834b4a91	Issue 883: Update FEMA risk index measure (#884 ) * ETL updated * Adding three fields to score	2021-11-13 11:32:15 -05:00
Lucas Merrill Brown	05ebf9b48c	Add median house value to Definition L (#882 ) * Added house value to ETL * Adding house value to score formula and comp tool	2021-11-13 10:29:23 -05:00
Vincent La	b0dbc90064	[ISS-723] Load Census Data for 4 Territories (#816 ) * Adding census decennial data for island territories	2021-11-09 16:32:46 -05:00
Jorge Escobar	053dde0d40	Display score L on map (#849 ) * updates to first docker run * tile constants * frontend changes * updating pickles instructions * pickles	2021-11-05 16:26:14 -04:00
Lucas Merrill Brown	03e59f2abd	Definition L updates (#862 ) * Changing FEMA risk measure * Adding "basic stats" feature to comparison tool * Tweaking Definition L	2021-11-05 15:43:52 -04:00
Shelby Switzer	a0bf186ee6	Add percentile column for L (#851 ) * Add percentile column for L * Use Definition instead of Score Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-11-04 13:03:56 -04:00
Lucas Merrill Brown	8372b47d42	Various updates to Definition L (#850 ) * removing percentiles as separate field names * adding RMP	2021-11-04 12:17:45 -04:00
Lucas Merrill Brown	1d541be447	Add EJSCREEN Areas of Concern (#843 ) * Adding ej screen areas of concern * Uses it where user has local files, but not otherwise Co-authored-by: VincentLaUSDS <vincent.la@omb.eop.gov>	2021-11-02 15:38:42 -04:00
Shelby Switzer	7bd1a9e59e	Big ole score refactor (#815 ) * WIP * Create ScoreCalculator This calculates all the factors for score L for now (with placeholder formulae because this is a WIP). I think ideally we'll want to refactor all the score code to be extracted into this or similar classes. * Add factor logic for score L Updated factor logic to match score L factors methodology. Still need to get the Score L field itself working. Cleanup needed: Pull field names into constants file, extract all score calculation into score calculator * Update thresholds and get score L calc working * Update header name for consistency and update comparison tool * Initial move of score to score calculator * WIP big refactor * Continued WIP on score refactor * WIP score refactor * Get to a working score-run * Refactor to pass df to score init This makes it easier to pass df around within a class with multiple methods that require df. * Updates from Black * Updates from linting * Use named imports instead of wildcard; log more * Additional refactors * move more field names to field_names constants file * import constants without a relative path (would break docker) * run linting * raise error if add_columns is not implemented in a child class * Refactor dict to namedtuple in score c * Update L to use all percentile field * change high school ed field in L back Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-11-02 14:12:53 -04:00
Jorge Escobar	1b17af84c8	Combine + Tilefy (#806 ) * init * score-post * added score csv s3 download; remove poetry cmds from readme * working census tile fetch * PR review * Github Actions Work	2021-11-01 18:05:05 -04:00
Shelby Switzer	7b87e0ec99	Add Score L (#812 ) * Create ScoreCalculator This calculates all the factors for score L for now (with placeholder formulae because this is a WIP). I think ideally we'll want to refactor all the score code to be extracted into this or similar classes. * Add factor logic for score L Updated factor logic to match score L factors methodology. Still need to get the Score L field itself working. Cleanup needed: Pull field names into constants file, extract all score calculation into score calculator Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov> Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2021-10-28 16:07:41 -04:00
Jorge Escobar	a94b8e2761	final census GHA	2021-10-14 13:50:56 -04:00
Jorge Escobar	8ddfc6b305	Update application.py	2021-10-14 13:31:37 -04:00
Jorge Escobar	3b04356fb3	Data sources from S3 (#769 ) * Started 535 * Data sources from S3 * lint * remove breakpoints * PR comments * lint * census data completed * lint * renaming data source	2021-10-13 16:00:33 -04:00
Billy Daly	d1273b63c5	Add ETL Contract Checks (#619 ) * Adds dev dependencies to requirements.txt and re-runs black on codebase * Adds test and code for national risk index etl, still in progress * Removes test_data from .gitignore * Adds test data to nation_risk_index tests * Creates tests and ETL class for NRI data * Adds tests for load() and transform() methods of NationalRiskIndexETL * Updates README.md with info about the NRI dataset * Adds to dos * Moves tests and test data into a tests/ dir in national_risk_index * Moves tmp_dir for tests into data/tmp/tests/ * Promotes fixtures to conftest and relocates national_risk_index tests: The relocation of national_risk_index tests is necessary because tests can only use fixtures specified in conftests within the same package * Fixes issue with df.equals() in test_transform() * Files reformatted by black * Commit changes to other files after re-running black * Fixes unused import that caused lint checks to fail * Moves tests/ directory to app root for data_pipeline * Adds new methods to ExtractTransformLoad base class: - __init__() Initializes class attributes - _get_census_fips_codes() Loads a dataframe with the fips codes for census block group and tract - validate_init() Checks that the class was initialized correctly - validate_output() Checks that the output was loaded correctly * Adds test for ExtractTransformLoad.__init__() and base.py * Fixes failing flake8 test * Changes geo_col to geoid_col and changes is_dataset to is_census in yaml * Adds test for validate_output() * Adds remaining tests * Removes is_dataset from init method * Makes CENSUS_CSV a class attribute instead of a class global: This ensures that CENSUS_CSV is only set when the ETL class is for a non-census dataset and removes the need to overwrite the value in mock_etl fixture * Re-formats files with black and fixes broken tox tests	2021-10-13 15:54:15 -04:00
Shelby Switzer	d8c73e6a02	Change downloadable file names (#708 ) * Change downloadable file names * Remove constants because we're dynamically creating these * Update to "communities" for the descriptor word based on team convo * Add timestamp in 2020-09-20-0930 format because I personally think this is the best ^.^ * Add a CLI command to run ETL Score Post so that we don't have to run the score generation just to get new downloadable files. * Also make sure the old downloadable files are cleaned up on the run of this command. * Remove unused library, thanks pylint! Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-10-01 15:04:37 -04:00
Jorge Escobar	2f8f2240b4	added new PDF file (#745 )	2021-09-23 13:34:50 -04:00
Lucas Merrill Brown	b1a4d26be8	Adding persistent poverty tracts (#738 ) * persistent poverty working * fixing left-padding * running black and adding persistent poverty to comp tool * fixing bug * running black and fixing linter * fixing linter * fixing linter error	2021-09-22 17:57:08 -04:00
Shelby Switzer	d3a18352fc	Add pytest to tox run in CI/CD (#713 ) * Add pytest to tox run in CI/CD * Try fixing tox dependencies for pytest * update poetry to get ci/cd passing * Run poetry export with --dev flag to include dev dependencies such as pytest * WIP updating test fixtures to include PDF * Remove dev dependencies from reqs and add pytest to envlist to make build faster * passing score_post tests * Add pytest tox (#729) * Fix failing pytest * Fixes failing tox tests and updates requirements.txt to include dev deps * pickle protocol 4 Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov> Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov> Co-authored-by: Billy Daly <williamdaly422@gmail.com> Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>	2021-09-22 13:47:37 -04:00
Vincent La	7709836a12	Ticket 355: Adding map to Urban vs Rural Census Tracts (#696 ) * Adding urban vs rural notebook * Adding new code * Adding settings * Adding usa.csv * Adding etl * Adding etl * Adding to etl_score * quick changes to notebook * Ensuring notebook can run * Adding urban vs rural notebook * Adding new code * Adding settings * Adding usa.csv * Adding etl * Adding etl * Adding to etl_score * quick changes to notebook * Ensuring notebook can run * adding urban to comparison tool * renaming file * adding urban rural to more comp tool outputs * updating requirements and poetry * Adding ej screen notebook * removing ej screen notebook since it's in justice40-tool-iss-719 Co-authored-by: La <ryy0@cdc.gov> Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2021-09-22 12:31:03 -04:00
Jorge Escobar	cd33f323c8	Revised Columns on Download File + PDF (#701 ) * Revised Columns on Download File + PDF * finishing ticket	2021-09-17 13:11:23 -04:00
Lucas Merrill Brown	a1a988da46	Minor updates to scoring comparison tool (#686 ) * Formatting updates for output XLSX	2021-09-16 14:06:33 -05:00
Jorge Escobar	487f6a8e04	Score Indicators (#690 ) * Score Indicators * rounding issue with housing burden column * switching out score g * final list of columns * removing duplicate housing burden percentile fields * removing duplicate Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2021-09-16 10:53:05 -04:00
Lucas Merrill Brown	1c0d87d84b	Add FEMA risk index to score file (#687 ) * Add to score file	2021-09-15 13:31:32 -05:00
Lucas Merrill Brown	e94d05882c	Issue 675 & 676: Adding life expectancy and DOE energy burden data (#683 ) * Adding two new data sources.	2021-09-15 09:59:28 -05:00
Jorge Escobar	fc5ed37fca	dependabot bump pillow (#681 ) * dependabot bump pillow * updated poetry * adding encoding to file open	2021-09-14 17:28:59 -04:00
Lucas Merrill Brown	52e70653f0	Prototype H (#682 )	2021-09-14 16:16:41 -05:00
Jorge Escobar	5bd63c083b	Run all Census, ETL, Score, Combine and Tilefy in one command (#662 ) * Run all Census, ETL, Score, Combine and Tilefy in one command * docker cmd * some docker improvements * feedback updates * lint	2021-09-14 14:15:34 -04:00
Lucas Merrill Brown	1083e953da	Prototype G (#672 ) * wip * cleanup * cleanup 2 * fixing import ordering linter error * updating backend to use score G * adding percentile to score output * update tippecanoe compression Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>	2021-09-14 10:48:11 -04:00
Jorge Escobar	879cb7d022	hotfix wrong score tile csv path (#671 ) * hotfix wrong score tile csv path * updating test * forcing update * triggering action	2021-09-14 07:27:48 -04:00
Lucas Merrill Brown	7d13be7651	Ticket 492: Integrate Area Median Income and Poverty measures into ETL (#660 ) * Loading AMI and poverty data	2021-09-13 15:36:35 -05:00
Shelby Switzer	d7274888b6	Update downloadable zip file (#659 ) * Update downloadable zip file * Don't use spaces in the name, as per #620 * Add the score D columns, as per #596 * fix paths and directories in etl_score_post while the tests seemed to be passing, I encountered an error when running poetry run score, which was caused by us creating a directory called <name>.csv, instead of creating the parent directory. Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-09-10 16:06:47 -04:00
Nat Hillard	536a35d6a0	Data Unit Tests (#509 ) * Fixes #341 - As a J40 developer, I want to write Unit Tests for the ETL files, so that tests are run on each commit * Location bug * Adding Load tests * Fixing XLSX filename * Adding downloadable zip test * updating pickle * Fixing pylint warnings * Update readme to correct some typos and reorganize test content structure * Removing unused schemas file, adding details to readme around pickles, per PR feedback * Update test to pass with Score D added to score file; update path in readme * fix requirements.txt after merge * fix poetry.lock after merge Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-09-10 14:17:34 -04:00
Shelby Switzer	ac62933d16	Initial refactor for Score ETL (#618 ) * WIP refactor * Extract score calculations into their own methods * do all initial df prep in single method * Fix error in docs for running etl for single dataset * WIP understanding HUD and linguistic iso data * Add comments from initial group review on PR Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-09-10 10:34:34 -04:00
Jorge Escobar	470c474367	Updated README (#652 ) * Updated README - Added a link to the full score data set on S3 - Some Docker updates * typo	2021-09-10 10:15:46 -04:00
Jorge Escobar	327e27e713	Add Score D to USA Low (#629 ) * added score D * Adding Score D to usa-low * rounding score d * small vscode update * last couple of vscode changes * uncommitted vscode changes	2021-09-08 16:44:26 -04:00
Jorge Escobar	1953d2fcd8	Additional VSCode and Poetry tasks added (#624 ) * additional tasks added * Update launch.json	2021-09-08 14:54:38 -04:00
dependabot[bot]	f4ffcc6a53	Bump pillow from 8.3.1 to 8.3.2 in /data/data-pipeline (#625 ) Bumps [pillow](https://github.com/python-pillow/Pillow) from 8.3.1 to 8.3.2. - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst) - [Commits](https://github.com/python-pillow/Pillow/compare/8.3.1...8.3.2) --- updated-dependencies: - dependency-name: pillow dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2021-09-08 13:08:58 -04:00
Billy Daly	f0900f7b69	Adds National Risk Index data to ETL pipeline (#549 ) * Adds dev dependencies to requirements.txt and re-runs black on codebase * Adds test and code for national risk index etl, still in progress * Removes test_data from .gitignore * Adds test data to nation_risk_index tests * Creates tests and ETL class for NRI data * Adds tests for load() and transform() methods of NationalRiskIndexETL * Updates README.md with info about the NRI dataset * Adds to dos * Moves tests and test data into a tests/ dir in national_risk_index * Moves tmp_dir for tests into data/tmp/tests/ * Promotes fixtures to conftest and relocates national_risk_index tests: The relocation of national_risk_index tests is necessary because tests can only use fixtures specified in conftests within the same package * Fixes issue with df.equals() in test_transform() * Files reformatted by black * Commit changes to other files after re-running black * Fixes unused import that caused lint checks to fail * Moves tests/ directory to app root for data_pipeline	2021-09-07 20:51:34 -04:00
Jorge Escobar	94298635c2	Add to decimal rounding (#623 ) * added score D * forgot to add decimal rounding	2021-09-07 14:30:45 -04:00

188 commits