j40-cejst-2

mirror of https://github.com/DOI-DO/j40-cejst-2.git synced 2025-02-23 10:04:18 -08:00

Author	SHA1	Message	Date
Emma Nechamkin	5c41c95764	Revert "Fast flag update (#1844 )" This reverts commit `d892bce6cf`.	2022-08-19 14:05:45 -04:00
Emma Nechamkin	d892bce6cf	Fast flag update (#1844 ) Added additional flags for the front end based on our conversation in stand up this morning.	2022-08-19 13:14:44 -04:00
Emma Nechamkin	3ba1c620f5	Update to use new FSF files (#1838 ) backend is partially done!	2022-08-18 15:54:44 -04:00
Emma Nechamkin	cb4866b93f	Adding eamlis and fuds data to legacy pollution in score (#1832 ) Update to add EAMLIS and FUDS data to score	2022-08-18 13:32:29 -04:00
Matt Bowen	6e41e0d9f0	Add donut hole calculation to score (#1828 ) Adds adjacency index to the pipeline. Requires thorough QA	2022-08-18 12:04:46 -04:00
Emma Nechamkin	88dc2e5a8e	updating to avoid conflicts	2022-08-17 14:28:02 -04:00
Emma Nechamkin	7d89d41e49	Adding NLCD data (#1826 ) Adding NLCD's natural space indicator end to end to the score.	2022-08-17 14:21:28 -04:00
Emma Nechamkin	2e05b1d60c	Merge branch 'emma-nechamkin/release/score-narwhal' of github.com:usds/justice40-tool into emma-nechamkin/release/score-narwhal	2022-08-17 11:34:37 -04:00
Matt Bowen	49623e4da0	Add abandoned mine lands data (#1824 ) * Add notebook to generate test data (#1780) * Add Abandoned Mine Land data (#1780) Using a similar structure but simpler approach compared to FUDS, add an indicator for whether a tract has an abandoned mine. * Adding some detail to dataset readmes Just a thought! * Apply feedback from review (#1780) * Fixup bad string that broke test (#1780) * Update a string that I should have renamed (#1780) * Reduce number of threads to reduce memory pressure (#1780) * Try not running geo data (#1780) * Run the high-memory sets separately (#1780) * Actually deduplicate (#1780) * Add flag for memory intensive ETLs (#1780) * Document new flag for datasets (#1780) * Add flag for new datasets fro rebase (#1780) Co-authored-by: Emma Nechamkin <97977170+emma-nechamkin@users.noreply.github.com>	2022-08-17 11:33:59 -04:00
Emma Nechamkin	981a36cfa3	first run -- adding NLCD data to the ETL, but not yet to the score	2022-08-17 11:11:11 -04:00
Emma Nechamkin	5e378aea81	Adding first street foundation data (#1823 ) Adding FSF flood and wildfire risk datasets to the score.	2022-08-17 10:14:23 -04:00
Emma Nechamkin	ebac552d75	Adding DOT composite to travel score (#1820 ) This adds the DOT dataset to the ETL and to the score. Note that currently we take a percentile of an average of percentiles.	2022-08-16 14:44:39 -04:00
Vim USDS	932179841f	Merge branch 'emma-nechamkin/release/score-narwhal' of https://github.com/usds/justice40-tool into emma-nechamkin/release/score-narwhal	2022-08-16 10:36:04 -07:00
Vim USDS	d6c04b1308	Disable markdown check for link	2022-08-16 10:35:57 -07:00
Matt Bowen	d5fbb802e8	Add FUDS ETL (#1817 ) * Add spatial join method (#1871) Since we'll need to figure out the tracts for a large number of points in future tickets, add a utility to handle grabbing the tract geometries and adding tract data to a point dataset. * Add FUDS, also jupyter lab (#1871) * Add YAML configs for FUDS (#1871) * Allow input geoid to be optional (#1871) * Add FUDS ETL, tests, test-data notebook (#1871) This adds the ETL class for Formerly Used Defense Sites (FUDS). This is different from most other ETLs since these FUDS are not provided by tract, but instead by geographic point, so we need to assign FUDS to tracts and then do calculations from there. * Floats -> Ints, as I intended (#1871) * Floats -> Ints, as I intended (#1871) * Formatting fixes (#1871) * Add test false positive GEOIDs (#1871) * Add gdal binaries (#1871) * Refactor pandas code to be more idiomatic (#1871) Per Emma, the more pandas-y way of doing my counts is using np.where to add the values I need, then groupby and size. It is definitely more compact, and also I think more correct! * Update configs per Emma suggestions (#1871) * Type fixed! (#1871) * Remove spurious import from vscode (#1871) * Snapshot update after changing col name (#1871) * Move up GDAL (#1871) * Adjust geojson strategy (#1871) * Try running census separately first (#1871) * Fix import order (#1871) * Cleanup cache strategy (#1871) * Download census data from S3 instead of re-calculating (#1871) * Clarify pandas code per Emma (#1871)	2022-08-16 13:28:39 -04:00
Emma Nechamkin	481a2a05f7	updated to fix linting errors (#1818 ) Cleans and updates base branch	2022-08-11 16:34:56 -04:00
Emma Nechamkin	94cdc47cce	Update etl_score_geo.py Yikes! Fixing merge messup!	2022-08-11 12:38:32 -04:00
Matt Bowen	97e17546cc	Refactor DOE Energy Burden and COI to use YAML (#1796 ) * added tribalId for Supplemental dataset (#1804) * Setting zoom levels for tribal map (#1810) * NRI dataset and initial score YAML configuration (#1534) * update be staging gha * NRI dataset and initial score YAML configuration * checkpoint * adding data checks for release branch * passing tests * adding INPUT_EXTRACTED_FILE_NAME to base class * lint * columns to keep and tests * update be staging gha * checkpoint * update be staging gha * NRI dataset and initial score YAML configuration * checkpoint * adding data checks for release branch * passing tests * adding INPUT_EXTRACTED_FILE_NAME to base class * lint * columns to keep and tests * checkpoint * PR Review * removing source url * tests * stop execution of ETL if there's a YAML schema issue * update be staging gha * adding source url as class var again * clean up * force cache bust * gha cache bust * dynamically set score vars from YAML * docstrings * removing last updated year - optional reverse percentile * passing tests * sort order * column ordering * PR review * class level vars * Updating DatasetsConfig * fix pylint errors * moving metadata hint back to code Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov> * Correct copy typo (#1809) * Add basic test suite for COI (#1518) * Update COI to use new yaml (#1518) * Add tests for DOE energy burden (#1518) * Add dataset config for energy burden (#1518) * Refactor ETL to use datasets.yml (#1518) * Add fake GEOIDs to COI tests (#1518) * Refactor _setup_etl_instance_and_run_extract to base (#1518) For the three classes we've done so far, a generic _setup_etl_instance_and_run_extract will work fine, for the moment we can reuse the same setup method until we decide future classes need more flexibility --- but they can also always subclass so... * Add output-path tests (#1518) * Update YAML to match constant (#1518) * Don't blindly set float format (#1518) * Add defaults for extract (#1518) * Run YAML load on all subclasses (#1518) * Update description fields (#1518) * Update YAML per final format (#1518) * Update fixture tract IDs (#1518) * Update base class refactor (#1518) Now that NRI is final I needed to make a small number of updates to my refactored code. * Remove old comment (#1518) * Fix type signature and return (#1518) * Update per code review (#1518) Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com> Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov> Co-authored-by: Vim <86254807+vim-usds@users.noreply.github.com>	2022-08-11 12:38:28 -04:00
Emma Nechamkin	baa591a6c6	first run through	2022-08-11 12:33:46 -04:00
Emma Nechamkin	4f6a1b5286	added indoor plumbing to score housing burden	2022-08-11 12:33:46 -04:00
Emma Nechamkin	15450cf91f	added indoor plumbing to score housing burden	2022-08-11 12:33:46 -04:00
Emma Nechamkin	8c7519063a	added indoor plumbing to chas	2022-08-11 12:33:46 -04:00
Emma Nechamkin	0d90ae563a	Changing LHE in tiles to a boolean (#1767 ) also includes merging / clean up of the release	2022-08-11 12:33:46 -04:00
Emma Nechamkin	b0a728437c	adds UST indicator (#1786 ) adds leaky underground storage tanks	2022-08-11 12:33:46 -04:00
Emma Nechamkin	f6efdd4e14	Rescaling linguistic isolation (#1750 ) Rescales linguistic isolation to drop Puerto Rico	2022-08-11 12:33:46 -04:00
Emma Nechamkin	2ab24c60fa	updating ejscreen data, try two (#1747 )	2022-08-11 12:33:46 -04:00
Shelby Switzer	3071815158	Do not drop Guam and USVI from ETL (#1681 ) * Remove code that drops Guam and USVI from ETL * Add back code for dropping rows by FIPS code We may want this functionality, so let's keep it and just make the constant currently be an empty array. Co-authored-by: Shelby Switzer <shelbyswitzer@gmail.com>	2022-08-11 12:33:46 -04:00
Shelby Switzer	05748c9fa2	Update backend for Puerto Rico (#1686 ) * Update PR threshold count to 10 We now show 10 indicators for PR. See the discussion on the github issue for more info: https://github.com/usds/justice40-tool/issues/1621 * Do not use linguistic iso for Puerto Rico Closes 1350. Co-authored-by: Shelby Switzer <shelbyswitzer@gmail.com>	2022-08-11 12:33:46 -04:00
Emma Nechamkin	1782d022a9	Adding HOLC indicator (#1579 ) Added HOLC indicator (Historic Redlining Score) from NCRC work; included 3.25 cutoff and low income as part of the housing burden category.	2022-08-11 12:33:46 -04:00
Emma Nechamkin	f047ca9d83	Imputing income using geographic neighbors (#1559 ) Imputes income field with a light refactor. Needs more refactor and more tests (I spot-checked). Next ticket will check and address but a lot of "narwhal" architecture is here.	2022-08-11 12:33:45 -04:00
Jorge Escobar	1c448a77f9	NRI dataset and initial score YAML configuration (#1534 ) * update be staging gha * NRI dataset and initial score YAML configuration * checkpoint * adding data checks for release branch * passing tests * adding INPUT_EXTRACTED_FILE_NAME to base class * lint * columns to keep and tests * update be staging gha * checkpoint * update be staging gha * NRI dataset and initial score YAML configuration * checkpoint * adding data checks for release branch * passing tests * adding INPUT_EXTRACTED_FILE_NAME to base class * lint * columns to keep and tests * checkpoint * PR Review * removing source url * tests * stop execution of ETL if there's a YAML schema issue * update be staging gha * adding source url as class var again * clean up * force cache bust * gha cache bust * dynamically set score vars from YAML * docstrings * removing last updated year - optional reverse percentile * passing tests * sort order * column ordering * PR review * class level vars * Updating DatasetsConfig * fix pylint errors * moving metadata hint back to code Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>	2022-08-09 16:37:10 -04:00
Jorge Escobar	781e08f559	added tribalId for Supplemental dataset (#1804 )	2022-08-08 17:42:14 -04:00
Jorge Escobar	8149ac31c5	Starting Tribal Boundaries Work (#1736 ) * starting tribal pr * further pipeline work * bia merge working * alaska villages and tribal geo generate * tribal folders * adding data full run * tile generation * tribal tile deploy	2022-07-30 01:13:10 -04:00
Vim	e1a61faf5d	Add a react component generator (#1745 ) * Add a react component generator * Update markdown links * Change commented code to block comment	2022-07-15 09:54:58 -07:00
Jorge Escobar	2af6fca98d	Column headers update (#1618 ) * Column headers update * passing tests * updated date stamp * js tests	2022-05-06 14:10:15 -04:00
Emma Nechamkin	ae725f0a3e	arcgis column name fix (#1581 ) eliminates duplicate column and ensures all column names are unique.	2022-04-22 14:09:12 -04:00
Jorge Escobar	fbd56e3bd5	Put the pdf back in the package and add TSD to pipeline (#1580 ) * Put the pdf back in the package and add TSD to pipeline * updated pdf with logo * wrong path	2022-04-21 13:42:04 -04:00
Emma Nechamkin	2ce4cfe80e	updated with codebook (#1573 )	2022-04-18 18:12:18 -04:00
Jorge Escobar	859177a877	Marshmallow Schemas for YAML files (#1497 ) * Marshmallow Schemas for YAML files * completed ticket * passing tests * lint * click dep * staging BE map * Pr review	2022-03-31 13:56:10 -04:00
Emma Nechamkin	2628afacf9	Creating a data dictionary for the download packet (#1469 ) Adding automated codebook creation. Future ticket to refactor.	2022-03-30 11:01:43 -04:00
Emma Nechamkin	dc981919f1	Adding booleans for FE to display (#1393 ) PR adds booleans for each individual threshold category for the front end to display.	2022-03-29 20:17:10 -04:00
Emma Nechamkin	0c07cdac55	Adding category count to BE signals (#1486 ) Added category count to downloadable data and backend signals.	2022-03-29 17:11:57 -04:00
Jorge Escobar	dd723b6c19	PyPi Packaging of Data Pipeline (#1464 ) * PyPi Packaging of Data Pipeline * package rename * adding python version * trigger data checks * print env vars * python version 2 * trigger data check * python version 3 * update caching for other GHAs	2022-03-21 18:55:15 -04:00
Katherine D. Mlika	68c882b3de	updating column E label to "Identified as disadvantaged" (#1406 ) * updating column E label to "Identified as disadvantaged" * passing tests * adding cached poetry flow * working dir Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>	2022-03-18 14:50:03 -04:00
Jorge Escobar	7b05ee9c76	S3 Parallel Upload and Deletions (#1410 ) * installation step * trigger action * installing to home dir * dry-run * pyenv * py 2.8 * trying s4cmd * removing pyenv * poetry s4cmd * num-threads * public read * poetry cache * s4cmd all around * poetry cache * poetry cache * install poetry packages * poetry echo * let's do this * s4cmd install on run * s4cmd * ad aws back * add aws back * testing census api key and poetry caching * census api key * census api * census api key #3 * 250 * poetry update * poetry change * check census api key * force flag * update score gen and tilefy; remove cached fips * small gdal update * invalidation * missing cache ids	2022-03-17 23:19:23 -04:00
Emma Nechamkin	e7c7c0abeb	Updating higher education to be reversed (#1387 ) Summary In this PR, we create a new variable so that the % college students is expressed as % not college students. This means that the front end can display % not college students. Includes old variables so that this will not break fe.	2022-03-15 16:43:32 -04:00
Jorge Escobar	7f91e2b06b	ArcGIS zipping (#1391 ) * ArcGIS zipping * lint * shapefile zip * removing space in GMT * adding shapefile to be staging gha	2022-03-09 18:00:20 -05:00
Emma Nechamkin	917b84dc2e	WY tracts are not showing up until zoom >7 (#1342 ) In order to solve an issue where states with few census tracts appear to have no DACs, we change the low-zoom for states with under some threshold of tracts to be the high-zoom for those states. Thus, WY now has DACs even in low zoom. Yay!	2022-03-08 17:33:11 -05:00
Jorge Escobar	6425beb9f4	YAML Config for Downloadable Assets (#1252 ) * starting yaml config load work * working version for downloadable file * yaml file update * checkpoint * sort if needed * refactoring * moving config * checkpoint * old files * skipping downloadable tests for now * more modularization * more refactor, new excel yml * pylint * completed tabs * Update excel.yml * removing obsolete tests * addressing PR feedback * addressing changes * confirmed change in yaml breaks tests * safety bump * PR review * adding tests back * pylint * Incorporating latest score fields from Emma * incorporating newest fields from Emma * passing tests * adding shapefile aws sync * missing test * passing tests	2022-03-04 15:02:09 -05:00
Emma Nechamkin	1f5633ef74	Adding constants for front end to display booleans (#1348 ) Added constants for the threshold categories and socioeconomic indicators for front end.	2022-03-02 17:12:28 -05:00

164 commits