j40-cejst-2

mirror of https://github.com/DOI-DO/j40-cejst-2.git synced 2025-02-23 10:04:18 -08:00

Author	SHA1	Message	Date
Shelby Switzer	ac62933d16	Initial refactor for Score ETL (#618) * WIP refactor * Extract score calculations into their own methods * do all initial df prep in single method * Fix error in docs for running etl for single dataset * WIP understanding HUD and linguistic iso data * Add comments from initial group review on PR Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>	2021-09-10 10:34:34 -04:00
Jorge Escobar	327e27e713	Add Score D to USA Low (#629) * added score D * Adding Score D to usa-low * rounding score d * small vscode update * last couple of vscode changes * uncommitted vscode changes	2021-09-08 16:44:26 -04:00
Billy Daly	f0900f7b69	Adds National Risk Index data to ETL pipeline (#549) * Adds dev dependencies to requirements.txt and re-runs black on codebase * Adds test and code for national risk index etl, still in progress * Removes test_data from .gitignore * Adds test data to nation_risk_index tests * Creates tests and ETL class for NRI data * Adds tests for load() and transform() methods of NationalRiskIndexETL * Updates README.md with info about the NRI dataset * Adds to dos * Moves tests and test data into a tests/ dir in national_risk_index * Moves tmp_dir for tests into data/tmp/tests/ * Promotes fixtures to conftest and relocates national_risk_index tests: The relocation of national_risk_index tests is necessary because tests can only use fixtures specified in conftests within the same package * Fixes issue with df.equals() in test_transform() * Files reformatted by black * Commit changes to other files after re-running black * Fixes unused import that caused lint checks to fail * Moves tests/ directory to app root for data_pipeline	2021-09-07 20:51:34 -04:00
Jorge Escobar	94298635c2	Add to decimal rounding (#623) * added score D * forgot to add decimal rounding	2021-09-07 14:30:45 -04:00
Jorge Escobar	99503a2541	added score D (#621)	2021-09-07 13:37:16 -04:00
Lucas Merrill Brown	65ceb7900f	Score F, testing methodology (#510) * fixing dependency issue * fixing more dependencies * including fraction of state AMI * wip * nitpick whitespace * etl working now * wip on scoring * fix rename error * reducing metrics * fixing score f * fixing readme * adding dependency * passing tests; * linting/black * removing unnecessary sample * fixing error * adding verify flag on etl/base Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>	2021-08-24 16:40:54 -04:00
Jorge Escobar	c24e13c930	Update GHA to push only client changes to S3 (#543)	2021-08-16 17:00:43 -04:00
Jorge Escobar	c19cd3ee55	hotfix on float cols (#526)	2021-08-13 15:48:31 -04:00
Vim	1dbb1018d6	sets column as percentiles (#525) * sets column as percentiles * adds trailing comma	2021-08-13 12:01:34 -07:00
Jorge Escobar	773c035493	AWS Sync Public Read (#508) * adding layer to mvts * small fix for GHA * AWS Sync Public Read * removed temp file * updated state median income ftp	2021-08-12 14:17:25 -04:00
Jorge Escobar	d259d97ba9	adding layer to mvts (#503) * adding layer to mvts * small fix for GHA	2021-08-12 10:56:54 -04:00
Jorge Escobar	6dc1283ee2	added comment	2021-08-10 15:37:36 -04:00
Jorge Escobar	3d8dbb293c	Tile-baking columns with floating rounds completed (#491) * Tile-baking columns with floating rounds completed * completed * correction on github workflow * tiles folder no longer needed * addressed comments * updating requirements.txt * poetry lock update * adding xlswriter * final poetrylock * updated requirements.txt * checkpoint * removed matplotlib * ignoring pylint too many statements * reinstated too many statements * converting data sync to generate score GHA UI-driven	2021-08-10 15:28:50 -04:00
lucasmbrown-usds	ebe6180f7c	wip	2021-08-09 22:24:14 -05:00
lucasmbrown-usds	cf13036d20	clearing output	2021-08-09 21:31:07 -05:00
lucasmbrown-usds	ce5e8c5351	including fraction of state AMI	2021-08-09 21:30:41 -05:00
lucasmbrown-usds	4ae7eff4c4	adding median income field and running black	2021-08-09 20:47:51 -05:00
Nat Hillard	9a9d5fdf7f	Backend change for Zipfile pt. 2 (#469) * Fixes #303: adding downloadable zip archive logic * linter recommendations * Pushes data directory to AWS. We'll want to move to use AWS for this ASAP, but this works for now * updating pattern	2021-08-09 10:39:59 -04:00
Nat Hillard	ec19d86f6f	Adding back census to list of potential datasets, but separating out from standard list (#484) Error this addresses: File "/Users/lucas/Documents/usds/repos/justice40-tool/data/data-pipeline/data_pipeline/etl/runner.py", line 71, in etl_runner f"data_pipeline.etl.sources.{dataset['module_dir']}.etl" TypeError: 'NoneType' object is not subscriptable	2021-08-09 09:52:06 -04:00
Jorge Escobar	f51b0d69d9	Poetry updates for application (#483)	2021-08-06 16:24:30 -04:00
Nat Hillard	6fb36ded9c	adding additional missed import (#477)	2021-08-06 11:48:11 -04:00
Nat Hillard	9d962eb5d9	Moving from relative imports to absolute to enable poetry run python data-pipeline/application.py [command] (#476)	2021-08-06 11:41:28 -04:00
Nat Hillard	45a8b1c026	Census ETL should use standard ETL form (#474) * Fixes #473 Census ETL should use standard ETL form * linter fixes	2021-08-06 11:01:51 -04:00
Nat Hillard	9f3b2f056b	Fixes #467: (#470) If the census download task is run more than once, us.csv doubles in size and all data is removed from dataframe	2021-08-05 16:20:18 -04:00
Nat Hillard	c1568e87c0	Data directory should adopt standard Poetry-suggested python package structure (#457) * Fixes #456 - Our data directory should adopt standard python package structure * a few missed references * updating readme * updating requirements * Running Black * Fixes for flake8 * updating pylint	2021-08-05 15:35:54 -04:00


175 commits