j40-cejst-2

mirror of https://github.com/DOI-DO/j40-cejst-2.git synced 2025-02-23 10:04:18 -08:00

Author	SHA1	Message	Date
Katherine D. Mlika	68c882b3de	updating column E label to "Identified as disadvantaged" (#1406 ) * updating column E label to "Identified as disadvantaged" * passing tests * adding cached poetry flow * working dir Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>	2022-03-18 14:50:03 -04:00
Jorge Escobar	7b05ee9c76	S3 Parallel Upload and Deletions (#1410 ) * installation step * trigger action * installing to home dir * dry-run * pyenv * py 2.8 * trying s4cmd * removing pyenv * poetry s4cmd * num-threads * public read * poetry cache * s4cmd all around * poetry cache * poetry cache * install poetry packages * poetry echo * let's do this * s4cmd install on run * s4cmd * ad aws back * add aws back * testing census api key and poetry caching * census api key * census api * census api key #3 * 250 * poetry update * poetry change * check census api key * force flag * update score gen and tilefy; remove cached fips * small gdal update * invalidation * missing cache ids	2022-03-17 23:19:23 -04:00
Emma Nechamkin	e7c7c0abeb	Updating higher education to be reversed (#1387 ) Summary In this PR, we create a new variable so that the % college students is expressed as % not college students. This means that the front end can display % not college students. Includes old variables so that this will not break fe.	2022-03-15 16:43:32 -04:00
Emma Nechamkin	2279a04c94	Quick fix: updating snapshots to have more sigfigs (#1409 ) Updated snapshots to include 10 digits after the decimal	2022-03-14 21:44:35 -04:00
Emma Nechamkin	9d920d4db4	Updating testing to include pytest-snapshot (#1355 ) In this commit, we slightly change the testing to use `pytest-snapshot`. This is for `ETL`s only.	2022-03-11 21:34:07 -05:00
Jorge Escobar	7f91e2b06b	ArcGIS zipping (#1391 ) * ArcGIS zipping * lint * shapefile zip * removing space in GMT * adding shapefile to be staging gha	2022-03-09 18:00:20 -05:00
Jorge Escobar	1730572aa6	Reducing Docker start up and adding ArcGIS URL (#1386 ) * Reducing Docker start up and adding ArcGIS URL * Updating ArcGIS URLs	2022-03-09 08:55:17 -05:00
Emma Nechamkin	917b84dc2e	WY tracts are not showing up until zoom >7 (#1342 ) In order to solve an issue where states with few census tracts appear to have no DACs, we change the low-zoom for states with under some threshold of tracts to be the high-zoom for those states. Thus, WY now has DACs even in low zoom. Yay!	2022-03-08 17:33:11 -05:00
Jorge Escobar	6425beb9f4	YAML Config for Downloadable Assets (#1252 ) * starting yaml config load work * working version for downloadable file * yaml file update * checkpoint * sort if needed * refactoring * moving config * checkpoint * old files * skipping downloadble tests for now * more modularization * more refactor, new excel yml * pylint * completed tabs * Update excel.yml * remvoing obsolete tests * addressing PR feedback * addressing changes * confirmed change in yaml breaks tests * safety bump * PR review * adding tests back * pylint * Incorporating latest score fields from Emma * incorporating newest fields from Emma * passing tests * adding shapefile aws sync * missing test * passing tests	2022-03-04 15:02:09 -05:00
Emma Nechamkin	1f5633ef74	Adding constants for front end to display booleans (#1348 ) Added constants for the threshold categories and socioeconomic indicators for front end.	2022-03-02 17:12:28 -05:00
Emma Nechamkin	aea49cbb5a	Cleaning up quick code (#1349 ) Did some quick, mostly cosmetic changes and updates to the quick launch changes. This mostly entailed changing strings to constants and cleaning up some code to make it neater. Changes -- PR AMI, updating ag loss, and dropping pr from some threshold counts.	2022-03-02 16:50:04 -05:00
Emma Nechamkin	f9be97d8c8	This is a quick addition to include PR AMI. To be revised in the "clean up code" pr	2022-03-01 16:31:38 -05:00
Jorge Escobar	dac8ed29d5	Removing PDF from packet (#1306 )	2022-03-01 13:41:44 -05:00
Emma Nechamkin	fab828dc66	Updating tiles csv to include state code (#1272 ) Adding state codes for island areas and puerto rico to the tiles csv.	2022-02-25 11:10:09 -05:00
Emma Nechamkin	f0a4e40a79	Creating shapefiles for ArcGIS users (#1275 ) Added shapefiles to the files generated when the pipeline is run. Produces both shapefile and a key for column names.	2022-02-24 10:32:49 -05:00
Lucas Merrill Brown	6e64134dc6	1295-college-attendance-field (#1297 ) Lucas' work. Adding college attendance to tiles.	2022-02-17 19:50:52 -05:00
Emma Nechamkin	cee13b50cc	Stripping thresholds from PR so the UI matches the count Add a tuple to skip FIPS 72 when incrementing counter. TODO: clean up so it's a constant.	2022-02-17 16:54:33 -05:00
Emma Nechamkin	1b76a68838	FEMA data check (#1270 ) we wanted to implement a slightly different FEMA AG LOSS indicator. Here, we take the 90th percentile only of tracts that have agvalue, and then we also floor the denominator of the rate calculation (loss/total value) at $408k	2022-02-17 16:53:04 -05:00
Vim	f90125d1b4	Update side panel to 3-state design (#1276 ) * Update field name to follow constant standard * Add table to ETL commands to README * Update Generate Map Tiles run time * Add a comma to copy * Add 3 state UI experience - PR will only show workforce dev - IA will only show workforce dev w/o linguistic iso - update tests to tests 3 states - change state to territory for Island Areas * Modify PR and IA threshold counts * Update tile_data_expected.pkl file	2022-02-16 14:24:35 -08:00
Jorge Escobar	59862a098e	Test Staging Data Backend (#1282 ) * Test Staging Data Backend * action updates	2022-02-16 16:45:59 -05:00
Jorge Escobar	82809a5123	Github Actions for Staging Backend (#1281 ) * Github Actions for Staging Backend * trigger run	2022-02-16 16:40:25 -05:00
Lucas Merrill Brown	3e37d9d1a3	Issue 1075: update snapshots using command-line flag (#1249 ) * Adding skippable tests using command-line flag	2022-02-14 12:16:52 -05:00
Lucas Merrill Brown	a0d6e55f0a	Run ETL processes in parallel (#1253 ) * WIP on parallelizing * switching to get_tmp_path for nri * switching to get_tmp_path everywhere necessary * fixing linter errors * moving heavy ETLs to front of line * add hold * moving cdc places up * removing unnecessary print * moving h&t up * adding parallel to geo post * better census labels * switching to concurrent futures * fixing output	2022-02-11 14:04:53 -05:00
Emma Nechamkin	389eb59ac4	Adding island area indicators to the tiles (#1213 ) This updates the backend to produce tile data with island indicators / island fields. Contains: - new tile codes for island data - threshold column that specifies number of thresholds to show - ui experience column that specifies which ui experience to show TODO: Drop the logger info message from main :)	2022-02-09 20:33:42 -05:00
Emma Nechamkin	b86450c72b	Remove USVI and Guam territories from data and include/show on map American Samoa and Mariana Islands (#1248 ) This updates the tile data so that guam and usvi do not appear in the tiles csv, from issue 1003	2022-02-09 15:23:37 -05:00
Lucas Merrill Brown	43e005cc10	Issue 1075: Add refactored ETL tests to NRI (#1088 ) * Adds a substantially refactored ETL test to the National Risk Index, to be used as a model for other tests	2022-02-08 19:05:32 -05:00
Jorge Escobar	f5fe8d90e2	Excel formatting and tract id ordering (#1172 ) * excel formatting and tract id ordering * lint * lint try $2 * lint 3 * addressed comments * typo	2022-02-04 18:35:45 -05:00
Emma Nechamkin	6a00b29f5d	Adding VA and CO ETL from mapping for environmental justice (#1177 ) Adding the mapping for environmental justice data, which contains information about VA and CO, to the ETL pipeline.	2022-02-04 10:00:41 -05:00
Jorge Escobar	1d399d3ca9	Tox Security Fix (#1242 ) * checkpoint * safety ignore * update python matrix for data checks * downloading census once	2022-02-03 17:05:51 -05:00
Emma Nechamkin	49868401be	Updating field names to match score M definitions (#1190 ) When implementing definition M for the score, the variable names were not yet updated. For example: This legacy field naming: ``` UNEMPLOYMENT_LOW_HS_EDUCATION_FIELD = ( f"Greater than or equal to the {PERCENTILE}th percentile for unemployment" " and has low HS education" ) ``` Should actually be renamed something like this: ``` UNEMPLOYMENT_LOW_HS_LOW_HIGHER_ED_FIELD = ( f"Greater than or equal to the {PERCENTILE}th percentile for unemployment" " and has low HS education and low higher ed attendance" ) ``` This PR is for the backend updates for this -- keeping the old fields, and adding new, Score M specific fields as listed below: - [x] `field_names`: add new fields to capture low_higher_ed - [x] `score_m`: replace old fields with new fields - [x] `DOWNLOADABLE_SCORE_COLUMNS`: replace old fields with new fields - [x] `TILES_SCORE_COLUMNS`: replace old fields with new fields	2022-02-01 18:54:43 -05:00
Jorge Escobar	403a490985	Esfoobar usds/488 generate score per commit pr (#1211 ) * Score run on every commit to data PR * testing score run * source aws	2022-01-31 16:07:21 -05:00
dependabot[bot]	8b72f743e3	Bump pillow from 8.4.0 to 9.0.0 in /data/data-pipeline (#1136 ) * Bump pillow from 8.4.0 to 9.0.0 in /data/data-pipeline Bumps [pillow](https://github.com/python-pillow/Pillow) from 8.4.0 to 9.0.0. - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst) - [Commits](https://github.com/python-pillow/Pillow/compare/8.4.0...9.0.0) --- updated-dependencies: - dependency-name: pillow dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> * pillow bump Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>	2022-01-27 18:19:49 -05:00
dependabot[bot]	4a83ae458e	Bump ipython from 7.28.0 to 7.31.1 in /data/data-pipeline (#1169 ) Bumps [ipython](https://github.com/ipython/ipython) from 7.28.0 to 7.31.1. - [Release notes](https://github.com/ipython/ipython/releases) - [Commits](https://github.com/ipython/ipython/compare/7.28.0...7.31.1) --- updated-dependencies: - dependency-name: ipython dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-01-27 17:36:14 -05:00
Jorge Escobar	2b35a8937a	Hot fix for Score M (#1182 ) * fixes * pr feedback * tuple	2022-01-27 17:22:39 -05:00
Emma Nechamkin	4c7d729cf7	Issue 1140 loss rate rounding (#1170 ) * updated loss rate rounding * fixing a typo in variable name * fixing typo in variable name * oops, now ready to push * updated pickle with float for loss rate columns * updated a typo, now multiplies all loss rates by 100 consistent with other pcts * updated with final pickles, all tests passing * updated incorporating lucas pr comments * changed literal to field name	2022-01-26 13:57:45 -05:00
Lucas Merrill Brown	18f299c5f8	Issue 1141: Definition M (#1151 )	2022-01-18 14:56:55 -05:00
Saran Ahluwalia	a07bf752b0	Notebook investigating NHPD as a source for providing contemporary foreclosure data (#1012 ) Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>	2022-01-18 13:08:27 -05:00
Saran Ahluwalia	87e08f5fe1	CDC SVI Index: Additions to data-pipeline and comparison tool (#1096 ) * wip * working * working * rename * documentation * add link * add readme * update fieldnames * typo * add comparison tool * revise wording * variable change for FIPS * workding * wording in readme * cleanup wording * update comparison tool * final tune up * grammar and punctuation in the documentation * period * cleanup comments * added revisions * parallelism * PR feedback from Lucas * remove extraneous fields from comparison tool * style * updates * remove themes * formatting * remove referenes to percentile rank * remove referenes to percentile rank * typo in fieldnames * updates based on feedback from Lucas * fieldnames formatting * fix broken markdown link Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2022-01-14 14:52:37 -05:00
Saran Ahluwalia	95a14adb35	Added Census Tract Aggregated Micro-data from EPA Risk-Screening Environmental Indicators (RSEI) model (#1101 ) * added initial source code - todo is comparison tool * added values * rename fields * check geoid * added black * added revisions * added clean up to comments * more comments * formatting * cleanup and address PR feedback * fix changes * final path changes * style * PR feedback * added final PR comment * fix flake 8 * add revisions	2022-01-14 13:50:49 -05:00
Saran Ahluwalia	a98ea35f74	Maryland EJSCREEN Addition to comparison tool (#1143 ) * finalized * cleanup notebook * cleanup * run black	2022-01-14 13:26:48 -05:00
Saran Ahluwalia	2604b66cf7	Fix errors and improve code quality and readability in Health Scores (#1147 ) * run black on health_score.py * to_numpy() versus values - see https://pandas.pydata.org/pandas-docs/version/0.24.0rc1/api/generated/pandas.Series.to_numpy.html	2022-01-14 13:11:47 -05:00
Jorge Escobar	d686bb856e	Download column order completed (#1077 ) * Download column order completed * Kameron changes * Lucas and Beth column order changes * cdc_places update * passing score * pandas error * checkpoint * score passing * rounding complete - percentages still showing one decimal * fixing tests * fixing percentages * updating comment * int percentages! 🎉🎉 * forgot to pass back to df * passing tests Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2022-01-13 15:04:16 -05:00
Saran Ahluwalia	98ff4bd9d8	Add experimental Jupyter notebook with Health Scoring Methodology Example for Health Scores (#989 ) Co-authored-by: Saran Ahluwalia <sarahluw@cisco.com>	2022-01-13 14:43:27 -05:00
Shaun Verch	4cec1bb37e	Install and run pandas-vet (#1119 ) * Install and run pandas-vet This doesn't fix the errors, but it can give us a starting point for the discussion of which of these errors we care about. * Ignore the errors for now * Ignore eeoc.gov in link checker Sometimes it seems down from the perspective of github actions.	2022-01-13 13:17:30 -05:00
Shaun Verch	73d6aa937d	Add pyproject.toml to fix docker compose build (#1131 ) * Add pyproject.toml to fix docker compose build Even though we want to use locked dependencies, pyproject.toml is still required. * update Dockerfile Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>	2022-01-13 13:05:32 -05:00
Lucas Merrill Brown	114e6b765a	Issue 1129: remove deprecated field other_census_tract_fields_to_keep (#1130 )	2022-01-12 10:16:09 -05:00
Shaun Verch	0abf04d6c2	Remove requirements.txt as a dependency (#1111 ) * Remove requirements.txt as a dependency This converts both docker and tox to use poetry, eliminating usage of requirements.txt in both flows. - In tox, uses the tox-poetry package which installs dependencies from the lockfile. - In docker, uses https://stackoverflow.com/questions/53835198/integrating-python-poetry-with-docker as a reference. * Don't copy pyproject.toml * Remove obsoleted docs about requirements.txt * Add --full-trace option to pytest * Fix liccheck liccheck works with requirements.txt, not with poetry, so there needs to be an extra translation step. * TEMP: Add WIP fix for pandas issue This is just to see if the github actions would pass once this fix gets merged, but it's being reviewed separately. * Revert "TEMP: Add WIP fix for pandas issue" This reverts commit 06e38e8cc77f5f3105c6e7a9449901db67aa1c82.	2022-01-10 16:43:56 -05:00
Saran Ahluwalia	56644698ff	Address rounding issue in Pandas series to floor numerically unstable values (#1085 ) * wip - added tests - 1 failing * added check for empty series + added test * passing tests * parallelism in variable assingnment choice * resolve merge conflicts * variable name changes * cleanup logic and move comments out of main code execution + add one more test for an extreme example eith -np.inf * cleanup logic and move comments out of main code execution + add one more test for an extreme example eith -np.inf * revisions to handle type ambiguity * fixing tests * fix pytest * fix linting * fix pytest * reword comments * cleanup comments * cleanup comments - fix typo * added type check and corresponding test * added type check and corresponding test * language cleanup * revert * update picke fixture Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>	2022-01-05 17:03:37 -05:00
Shaun Verch	93595b7bb4	Re-export requirements.txt to fix version errors (#1099 ) * Re-export requirements.txt to fix version errors The version of lxml in this file had a known vulnerability that got caught by the "safety" checker, but it is updated in the poetry files. Regenerated using: https://github.com/usds/justice40-tool/tree/main/data/data-pipeline#miscellaneous * Fix lint error * Run lint on all envs and add comments * Ignore testst that fail lint because of dev deps * Ignore medium.com in link checker It's returning 403s to github actions...	2022-01-05 15:58:24 -05:00
Saran Ahluwalia	a4137fdc98	Add Michigan EJ Screen into data-pipeline's ETL and provide automated scoring and statistics outputs (#1091 ) * draft wip * initial commit * clear output from notebook * revert to `65ceb7900f` * draft wip * initial commit * clear output from notebook * revert to `65ceb7900f` * make michigan prefix for readable * standardize Michigan names and move all constants from class into field names module * standardize Michigan names and move all constants from class into field names module * include only pertinent columns for scoring comparison tool * michigan EJSCREEN standardization * final PR feedback * added exposition and summary of Michigan EJSCREEN * added exposition and summary of Michigan EJSCREEN * fix typo Co-authored-by: Saran Ahluwalia <ahlusar.ahluwalia@gmail.com>	2021-12-31 15:38:52 -05:00
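
The "FEMA data check" commit (1b76a68838) in the log above describes its agricultural loss indicator only in prose: the 90th percentile is taken over tracts that have an agricultural value, and the denominator of the rate (loss / total value) is floored at $408k. A minimal sketch of that calculation follows. It is not the repository's code: the function names, the tract IDs in the usage note, the reading of "have agvalue" as "value greater than zero", and the percentile-rank definition (share of rated tracts at or below a tract's rate) are all assumptions made for illustration.

```python
from bisect import bisect_right
from typing import Dict, Optional, Tuple

# Values taken from the commit message.
AG_VALUE_FLOOR = 408_000       # dollars; floor applied to the denominator
PERCENTILE_THRESHOLD = 0.90    # "90th percentile"


def ag_loss_rate(expected_loss: float, ag_value: Optional[float]) -> Optional[float]:
    """Return loss / max(ag_value, floor), or None if the tract has no ag value.

    Assumption: "tracts that have agvalue" means ag_value is present and > 0.
    """
    if ag_value is None or ag_value <= 0:
        return None
    return expected_loss / max(ag_value, AG_VALUE_FLOOR)


def flag_high_ag_loss(
    tracts: Dict[str, Tuple[float, Optional[float]]]
) -> Dict[str, bool]:
    """Flag tracts at or above the threshold percentile of ag loss rate.

    Percentile ranks are computed only among tracts that have a rate, so
    tracts without agricultural value neither get flagged nor dilute the ranking.
    """
    rates = {}
    for geoid, (loss, value) in tracts.items():
        rate = ag_loss_rate(loss, value)
        if rate is not None:
            rates[geoid] = rate

    ordered = sorted(rates.values())
    flags = {}
    for geoid in tracts:
        if geoid not in rates:
            flags[geoid] = False
            continue
        # Share of rated tracts whose rate is <= this tract's rate.
        rank = bisect_right(ordered, rates[geoid]) / len(ordered)
        flags[geoid] = rank >= PERCENTILE_THRESHOLD
    return flags
```

For example, calling `flag_high_ag_loss({"A": (1000, 100_000), "B": (50_000, 1_000_000), "C": (10, 0)})` rates tract "A" against the $408k floor rather than its own $100k value, ranks only "A" and "B", and leaves "C" out of the percentile entirely. Flooring the denominator keeps tracts with very small agricultural value from producing extreme rates.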

176 commits