Commit graph

275 commits

Author SHA1 Message Date
Jorge Escobar
6dc1283ee2 added comment 2021-08-10 15:37:36 -04:00
Jorge Escobar
3d8dbb293c
Tile-baking columns with floating rounds completed (#491)
* Tile-baking columns with floating rounds completed

* completed

* correction on github workflow

* tiles folder no longer needed

* addressed comments

* updating requirements.txt

* poetry lock update

* adding xlsxwriter

* final poetrylock

* updated requirements.txt

* checkpoint

* removed matplotlib

* ignoring pylint too many statements

* reinstated too many statements

* converting data sync to generate score GHA UI-driven
2021-08-10 15:28:50 -04:00
lucasmbrown-usds
ebe6180f7c wip 2021-08-09 22:24:14 -05:00
lucasmbrown-usds
cf13036d20 clearing output 2021-08-09 21:31:07 -05:00
lucasmbrown-usds
ce5e8c5351 including fraction of state AMI 2021-08-09 21:30:41 -05:00
lucasmbrown-usds
4ae7eff4c4 adding median income field and running black 2021-08-09 20:47:51 -05:00
Nat Hillard
6c986adfe4
Check in VSCode config for easier local debug (#487)
* Fixes #466 - Task: Check in VSCode config for easier local debug
2021-08-09 14:55:13 -04:00
Nat Hillard
9a9d5fdf7f
Backend change for Zipfile pt. 2 (#469)
* Fixes #303 : adding downloadable zip archive logic
* linter recommendations
* Pushes data directory to AWS. We'll want to move to use AWS for this ASAP, but this works for now
* updating pattern
2021-08-09 10:39:59 -04:00
Nat Hillard
ec19d86f6f
Adding back census to list of potential datasets, but separating out from standard list (#484)
Error this addresses:
  File "/Users/lucas/Documents/usds/repos/justice40-tool/data/data-pipeline/data_pipeline/etl/runner.py", line 71, in etl_runner
    f"data_pipeline.etl.sources.{dataset['module_dir']}.etl"
TypeError: 'NoneType' object is not subscriptable
2021-08-09 09:52:06 -04:00
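
Note on the traceback in ec19d86f6f: this is the error Python raises when a lookup returns `None` and the result is then subscripted. Below is a minimal sketch of that failure mode, with census kept in a separate list as the commit title describes. The names `STANDARD_DATASETS`, `CENSUS_DATASET`, `find_dataset`, and `module_path`, and the sample dataset entries, are hypothetical and not taken from the repository.

```python
# Hypothetical dataset registry; names and entries are assumptions for illustration.
STANDARD_DATASETS = [
    {"name": "tree_equity_score", "module_dir": "tree_equity_score"},
    {"name": "hud_recap", "module_dir": "hud_recap"},
]
# Census is kept apart from the standard list but can still be requested by name.
CENSUS_DATASET = {"name": "census", "module_dir": "census"}


def find_dataset(name):
    """Return the matching dataset dict, or None if the name is unknown."""
    for dataset in STANDARD_DATASETS + [CENSUS_DATASET]:
        if dataset["name"] == name:
            return dataset
    return None


def module_path(name):
    """Build the ETL module path, failing with a clear error for unknown names."""
    dataset = find_dataset(name)
    if dataset is None:
        # Without this check, dataset["module_dir"] raises
        # TypeError: 'NoneType' object is not subscriptable
        raise ValueError(f"Unknown dataset: {name}")
    return f"data_pipeline.etl.sources.{dataset['module_dir']}.etl"
```

Calling `module_path("census")` resolves because census is included in the lookup, while an unknown name fails with an explicit `ValueError` rather than the `TypeError` shown in the traceback.
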
Jorge Escobar
f51b0d69d9
Poetry updates for application (#483) 2021-08-06 16:24:30 -04:00
Nat Hillard
6fb36ded9c
adding additional missed import (#477) 2021-08-06 11:48:11 -04:00
Nat Hillard
9d962eb5d9
Moving from relative imports to absolute to enable poetry run python data-pipeline/application.py [command] (#476) 2021-08-06 11:41:28 -04:00
Nat Hillard
45a8b1c026
Census ETL should use standard ETL form (#474)
* Fixes #473
Census ETL should use standard ETL form

* linter fixes
2021-08-06 11:01:51 -04:00
Nat Hillard
9f3b2f056b
Fixes #467: (#470)
If the census download task is run more than once,
us.csv doubles in size and all data is removed from dataframe
2021-08-05 16:20:18 -04:00
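
Note on 9f3b2f056b: one common way a file doubles in size on a second run is writing to it in append mode. The sketch below contrasts appending with overwriting. It illustrates an assumed cause, not the repository's actual fix, and `write_rows` and the sample rows are hypothetical.

```python
import csv
import os
import tempfile


def write_rows(path, rows, mode):
    """Write rows to a CSV file; mode "a" appends, mode "w" replaces the file."""
    with open(path, mode, newline="") as f:
        csv.writer(f).writerows(rows)


# Made-up sample rows standing in for the downloaded data.
rows = [["fips", "state_abbreviation"], ["01", "AL"], ["02", "AK"]]

with tempfile.TemporaryDirectory() as tmp:
    appended = os.path.join(tmp, "appended.csv")
    overwritten = os.path.join(tmp, "overwritten.csv")
    for _ in range(2):  # simulate running the task twice
        write_rows(appended, rows, "a")
        write_rows(overwritten, rows, "w")
    with open(appended, newline="") as f:
        appended_count = len(list(csv.reader(f)))
    with open(overwritten, newline="") as f:
        overwritten_count = len(list(csv.reader(f)))

# The appended file holds every row twice; the overwritten file holds one copy.
print(appended_count, overwritten_count)
```

Overwriting makes the task idempotent: a second run leaves the file in the same state as the first.
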
Nat Hillard
c1568e87c0
Data directory should adopt standard Poetry-suggested python package structure (#457)
* Fixes #456 - Our data directory should adopt standard python package structure
* a few missed references
* updating readme
* updating requirements
* Running Black
* Fixes for flake8
* updating pylint
2021-08-05 15:35:54 -04:00
Jorge Escobar
4d7465c833
Hotfix for fips zip download location + added full-score-run command (#465)
* Hotfix for S3 locations of data sources

* updated README

* lint failures

Co-authored-by: Nat Hillard <Nathaniel.K.Hillard@omb.eop.gov>
2021-08-05 12:55:21 -04:00
Jorge Escobar
5cb00ef0ce
Tile Generation Script (#433) 2021-08-03 18:23:57 -04:00
Billy Daly
5504528fdf
Issue 308 python linting (#443)
* Adds flake8, pylint, liccheck, flake8 to dependencies for data-pipeline

* Sets up and runs black autoformatting

* Adds flake8 to tox linting

* Fixes flake8 error F541 f string missing placeholders

* Fixes flake8 E501 line too long

* Fixes flake8 F401 imported but not used

* Adds pylint to tox and disables the following pylint errors:
- C0114: module docstrings
- R0201: method could have been a function
- R0903: too few public methods
- C0103: name case styling
- W0511: fix me
- W1203: f-string interpolation in logging

* Adds utils.py to tox.ini linting, runs black on utils.py

* Fixes import related pylint errors: C0411 and C0412

* Fixes or ignores remaining pylint errors (for discussion later)

* Adds safety and liccheck to tox.ini
2021-08-02 12:16:38 -04:00
Billy Daly
55dabb2b57
Issue 379 tox setup (#405)
* Adds tox as a dev dependency to data/data-pipeline/pyproject.toml: Also updates poetry.lock and requirements.txt

* Adds tox.ini to test build of data/data-pipeline

* Sets up GitHub actions workflow for data/ directory

* Tries to get Data Checks GitHub action to run

* Fixes error with GitHub action

* Migrates data/data-roadmap from setuptools to poetry

* Sets up tox file for data/data-roadmap

* Adds github action for data/data-roadmap

* Fixes syntax error in data-checks.yml

* Second attempt at fixing data-checks.yml

* Export poetry requirements to requirements.txt

* Revert "Migrates data/data-roadmap from setuptools to poetry"

This reverts commit e8367652d43c1c9beee500f792c8f41e1c1fc462.

* Removes pyproject.toml and reverts requirements.txt as well
2021-07-29 14:00:20 -04:00
Shelby Switzer
387ee3a382
Update data documentation and some data steps (#407)
* Minor documentation updates, plus CalEnviroScreen S3 URL fix

* Update score comparison docs and code

* Add steps for running the comparison tool
* Update HUD recap ETL to ensure GEOID is imported as a string (if it is
imported as an integer by default it will strip the beginning "0" from
many IDs)

* Add note about execution time

* Move step from paragraph to list

* Update output dir in README for comp tool

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
2021-07-29 10:28:52 -04:00
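
Note on the GEOID change in 387ee3a382: the sketch below shows why an identifier column should stay a string. It uses only the standard library and made-up sample rows; it is not the HUD recap ETL code, which is not shown here.

```python
import csv
import io

# Made-up sample data; the leading "0" is part of each identifier.
sample = "GEOID,value\n01001020100,1\n06037101110,2\n"

rows = list(csv.DictReader(io.StringIO(sample)))

# Keeping the column as text preserves every character.
as_text = [row["GEOID"] for row in rows]

# Parsing as an integer silently drops the leading zero.
as_int = [int(row["GEOID"]) for row in rows]

print(as_text)
print(as_int)
```

If the file is read with pandas (an assumption about the ETL), the usual equivalent is passing `dtype={"GEOID": str}` to `read_csv` so the column is never inferred as an integer.
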
Jorge Escobar
b404fdcc43
Generate Geo-aware scores for all zoom levels (#391)
* generate Geo-aware scores for all zoom levels

* usa high progress

* testing dissolve

* checkpoint

* changing type

* removing breakpoint

* validation notebooks

* quick update

* score validation

* fixes for county merge

* code completed
2021-07-28 16:07:28 -04:00
Lucas Merrill Brown
67b39475f7
Analysis by region (#385)
* Adding regional comparisons

* Small ETL fixes
2021-07-26 10:02:25 -05:00
Rohit Musti
81290ce672
adding tree equity score to the data pipeline (#398)
* adding tree equity score to the downloading pipeline so it can be easily compared as a reference index!

* removed redundant dependencies
2021-07-26 08:00:57 -04:00
Nat Hillard
a7cdf1c021
Adding notebook to create score dissolve (#333) 2021-07-21 16:10:32 -04:00
Jorge Escobar
543d147e61
Data folder restructuring in preparation for 361 (#376)
* initial checkin

* gitignore and docker-compose update

* readme update and error on hud

* encoding issue

* one more small README change

* data roadmap restructure

* pyproject sort

* small update to score output folders

* checkpoint

* couple of last fixes
2021-07-20 14:55:39 -04:00