Commit graph

87 commits

Author SHA1 Message Date
Lucas Merrill Brown
a4108d24c0 Issue 919: Fix too many tracts issue (#922)
* Some cleanup, adding error warning to merge function

* Error handling around tract merge
2021-11-30 13:49:20 -05:00
Jorge Escobar
16eb29e429 EJScreen Tracts (#918)
* EJScreen Tracts

* EJScreen Tracts
2021-11-30 13:49:20 -05:00
Jorge Escobar
f915e20e91 Esfoobar usds/835 census tracts geojson (#916)
* Census Tracts instead of CBGs

* typo
2021-11-30 13:49:20 -05:00
Shelby Switzer
617f41526f Update Census AMI to ETL into tracts, not CBGs (#900)
* Update Census AMI to ETL into tracts, not CBGs

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
2021-11-30 13:49:20 -05:00
Lucas Merrill Brown
537844236a Update FEMA data to be tracts, not block groups (#906) 2021-11-30 13:49:20 -05:00
Shelby Switzer
893758f1d4 Use tract instead of block group when calling census API (#901)
* Use tract instead of block group when calling census API

* fixing merge conflicts

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
2021-11-30 13:49:20 -05:00
Shelby Switzer
0c8b32e679 Move Housing and Transportation Index to tracts (#903)
Update data download URL to use tract as focus, use tract field name,
and move this dataset to the tracts df list in etl_score.

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
2021-11-30 13:49:20 -05:00
Lucas Merrill Brown
776a52595f Switching island territories data to tracts (#879) 2021-11-30 13:49:20 -05:00
Lucas Merrill Brown
474d010bf4
Quick fix on island territories directory name (#877) 2021-11-16 14:31:11 -05:00
Jorge Escobar
0a21fc6b12
Add territory boundary data (#885)
* Add territory boundary data

* housing and transp

* lint

* lint

* lint
2021-11-16 10:05:09 -05:00
Lucas Merrill Brown
21834b4a91
Issue 883: Update FEMA risk index measure (#884)
* ETL updated

* Adding three fields to score
2021-11-13 11:32:15 -05:00
Lucas Merrill Brown
05ebf9b48c
Add median house value to Definition L (#882)
* Added house value to ETL

* Adding house value to score formula and comp tool
2021-11-13 10:29:23 -05:00
Vincent La
b0dbc90064
[ISS-723] Load Census Data for 4 Territories (#816)
* Adding census decennial data for island territories
2021-11-09 16:32:46 -05:00
Jorge Escobar
053dde0d40
Display score L on map (#849)
* updates to first docker run

* tile constants

* frontend changes

* updating pickles instructions

* pickles
2021-11-05 16:26:14 -04:00
Lucas Merrill Brown
03e59f2abd
Definition L updates (#862)
* Changing FEMA risk measure 

* Adding "basic stats" feature to comparison tool 

* Tweaking Definition L
2021-11-05 15:43:52 -04:00
Lucas Merrill Brown
1d541be447
Add EJSCREEN Areas of Concern (#843)
* Adding ej screen areas of concern

* Uses it where user has local files, but not otherwise

Co-authored-by: VincentLaUSDS <vincent.la@omb.eop.gov>
2021-11-02 15:38:42 -04:00
Jorge Escobar
1b17af84c8
Combine + Tilefy (#806)
* init

* score-post

* added score csv s3 download; remore poetry cmds from readme

* working census tile fetch

* PR review

* Github Actions Work
2021-11-01 18:05:05 -04:00
Jorge Escobar
3b04356fb3
Data sources from S3 (#769)
* Started 535

* Data sources from S3

* lint

* renove breakpoints

* PR comments

* lint

* census data completed

* lint

* renaming data source
2021-10-13 16:00:33 -04:00
Lucas Merrill Brown
b1a4d26be8
Adding persistent poverty tracts (#738)
* persistent poverty working

* fixing left-padding

* running black and adding persistent poverty to comp tool

* fixing bug

* running black and fixing linter

* fixing linter

* fixing linter error
2021-09-22 17:57:08 -04:00
Vincent La
7709836a12
Ticket 355: Adding map to Urban vs Rural Census Tracts (#696)
* Adding urban vs rural notebook

* Adding new code

* Adding settings

* Adding usa.csv

* Adding etl

* Adding etl

* Adding to etl_score

* quick changes to notebook

* Ensuring notebook can run

* Adding urban vs rural notebook

* Adding new code

* Adding settings

* Adding usa.csv

* Adding etl

* Adding etl

* Adding to etl_score

* quick changes to notebook

* Ensuring notebook can run

* adding urban to comparison tool

* renaming file

* adding urban rural to more comp tool outputs

* updating requirements and poetry

* Adding ej screen notebook

* removing ej screen notebook since it's in justice40-tool-iss-719

Co-authored-by: La <ryy0@cdc.gov>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
2021-09-22 12:31:03 -04:00
Lucas Merrill Brown
1c0d87d84b
Add FEMA risk index to score file (#687)
* Add to score file
2021-09-15 13:31:32 -05:00
Lucas Merrill Brown
e94d05882c
Issue 675 & 676: Adding life expectancy and DOE energy burden data (#683)
* Adding two new data sources.
2021-09-15 09:59:28 -05:00
Jorge Escobar
fc5ed37fca
dependabot bump pillow (#681)
* dependabot bump pillow

* updated poetry

* adding encoding to file open
2021-09-14 17:28:59 -04:00
Lucas Merrill Brown
1083e953da
Prototype G (#672)
* wip

* cleanup

* cleanup 2

* fixing import ordering linter error

* updating backend to use score G

* adding percentile to score output

* update tippeanoe compression

Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
2021-09-14 10:48:11 -04:00
Lucas Merrill Brown
7d13be7651
Ticket 492: Integrate Area Median Income and Poverty measures into ETL (#660)
* Loading AMI and poverty data
2021-09-13 15:36:35 -05:00
Shelby Switzer
ac62933d16
Initial refactor for Score ETL (#618)
* WIP refactor

* Exract score calculations into their own methods

* do all initial df prep in single method

* Fix error in docs for running etl for single dataset

* WIP understanding HUD and linguistic iso data

* Add comments from initial group review on PR

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
2021-09-10 10:34:34 -04:00
Billy Daly
f0900f7b69
Adds National Risk Index data to ETL pipeline (#549)
* Adds dev dependencies to requirements.txt and re-runs black on codebase

* Adds test and code for national risk index etl, still in progress

* Removes test_data from .gitignore

* Adds test data to nation_risk_index tests

* Creates tests and ETL class for NRI data

* Adds tests for load() and transform() methods of NationalRiskIndexETL

* Updates README.md with info about the NRI dataset

* Adds to dos

* Moves tests and test data into a tests/ dir in national_risk_index

* Moves tmp_dir for tests into data/tmp/tests/

* Promotes fixtures to conftest and relocates national_risk_index tests:
The relocation of national_risk_index tests is necessary because tests 
can only use fixtures specified in conftests within the same package

* Fixes issue with df.equals() in test_transform()

* Files reformatted by black

* Commit changes to other files after re-running black

* Fixes unused import that caused lint checks to fail

* Moves tests/ directory to app root for data_pipeline
2021-09-07 20:51:34 -04:00
Lucas Merrill Brown
65ceb7900f
Score F, testing methodology (#510)
* fixing dependency issue

* fixing more dependencies

* including fraction of state AMI

* wip

* nitpick whitespace

* etl working now

* wip on scoring

* fix rename error

* reducing metrics

* fixing score f

* fixing readme

* adding dependency

* passing tests;

* linting/black

* removing unnecessary sample

* fixing error

* adding verify flag on etl/base

Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
2021-08-24 16:40:54 -04:00
Jorge Escobar
773c035493
AWS Sync Public Read (#508)
* adding layer to mvts

* small fix for GHA

* AWS Sync Public Read

* removed temp file

* updated state media income ftp
2021-08-12 14:17:25 -04:00
lucasmbrown-usds
ce5e8c5351 including fraction of state AMI 2021-08-09 21:30:41 -05:00
lucasmbrown-usds
4ae7eff4c4 adding median income field and running black 2021-08-09 20:47:51 -05:00
Nat Hillard
ec19d86f6f
Adding back census to list of potential datasets, but separating out from standard list (#484)
Error this addresses:
  File "/Users/lucas/Documents/usds/repos/justice40-tool/data/data-pipeline/data_pipeline/etl/runner.py", line 71, in etl_runner
    f"data_pipeline.etl.sources.{dataset['module_dir']}.etl"
TypeError: 'NoneType' object is not subscriptable
2021-08-09 09:52:06 -04:00
Jorge Escobar
f51b0d69d9
Poetry updates for application (#483) 2021-08-06 16:24:30 -04:00
Nat Hillard
9d962eb5d9
Moving from relative imports to absolute to enable poetry run python data-pipeline/application.py [command] (#476) 2021-08-06 11:41:28 -04:00
Nat Hillard
45a8b1c026
Census ETL should use standard ETL form (#474)
* Fixes #473
Census ETL should use standard ETL form

* linter fixes
2021-08-06 11:01:51 -04:00
Nat Hillard
9f3b2f056b
Fixes #467: (#470)
If the census download task is run more than once,
us.csv doubles in size and all data is removed from dataframe
2021-08-05 16:20:18 -04:00
Nat Hillard
c1568e87c0
Data directory should adopt standard Poetry-suggested python package structure (#457)
* Fixes #456 - Our data directory should adopt standard python package structure
* a few missed references
* updating readme
* updating requirements
* Running Black
* Fixes for flake8
* updating pylint
2021-08-05 15:35:54 -04:00