Commit graph

8 commits

Author SHA1 Message Date
Lucas Merrill Brown
a4108d24c0 Issue 919: Fix too many tracts issue (#922)
* Some cleanup, adding error warning to merge function

* Error handling around tract merge
2021-11-30 13:49:20 -05:00
Jorge Escobar
0a21fc6b12
Add territory boundary data (#885)
* Add territory boundary data

* housing and transp

* lint

* lint

* lint
2021-11-16 10:05:09 -05:00
Billy Daly
d1273b63c5
Add ETL Contract Checks (#619)
* Adds dev dependencies to requirements.txt and re-runs black on codebase

* Adds test and code for national risk index etl, still in progress

* Removes test_data from .gitignore

* Adds test data to nation_risk_index tests

* Creates tests and ETL class for NRI data

* Adds tests for load() and transform() methods of NationalRiskIndexETL

* Updates README.md with info about the NRI dataset

* Adds to dos

* Moves tests and test data into a tests/ dir in national_risk_index

* Moves tmp_dir for tests into data/tmp/tests/

* Promotes fixtures to conftest and relocates national_risk_index tests:
The relocation of national_risk_index tests is necessary because tests 
can only use fixtures specified in conftests within the same package

* Fixes issue with df.equals() in test_transform()

* Files reformatted by black

* Commit changes to other files after re-running black

* Fixes unused import that caused lint checks to fail

* Moves tests/ directory to app root for data_pipeline

* Adds new methods to ExtractTransformLoad base class:
- __init__() Initializes class attributes
- _get_census_fips_codes() Loads a dataframe with the fips codes for 
census block group and tract
- validate_init() Checks that the class was initialized correctly
- validate_output() Checks that the output was loaded correctly

* Adds test for ExtractTransformLoad.__init__() and base.py

* Fixes failing flake8 test

* Changes geo_col to geoid_col and changes is_dataset to is_census in yaml

* Adds test for validate_output()

* Adds remaining tests

* Removes is_dataset from init method

* Makes CENSUS_CSV a class attribute instead of a class global:
This ensures that CENSUS_CSV is only set when the ETL class is for a 
non-census dataset and removes the need to overwrite the value in 
mock_etl fixture

* Re-formats files with black and fixes broken tox tests
2021-10-13 15:54:15 -04:00
Lucas Merrill Brown
b1a4d26be8
Adding persistent poverty tracts (#738)
* persistent poverty working

* fixing left-padding

* running black and adding persistent poverty to comp tool

* fixing bug

* running black and fixing linter

* fixing linter

* fixing linter error
2021-09-22 17:57:08 -04:00
Shelby Switzer
d3a18352fc
Add pytest to tox run in CI/CD (#713)
* Add pytest to tox run in CI/CD

* Try fixing tox dependencies for pytest

* update poetry to get ci/cd passing

* Run poetry export with --dev flag to include dev dependencies such as pytest

* WIP updating test fixtures to include PDF

* Remove dev dependencies from reqs and add pytest to envlist to make build faster

* passing score_post tests

* Add pytest tox (#729)

* Fix failing pytest

* Fixes failing tox tests and updates requirements.txt to include dev deps

* pickle protocol 4

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
Co-authored-by: Billy Daly <williamdaly422@gmail.com>
Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>
2021-09-22 13:47:37 -04:00
Lucas Merrill Brown
7d13be7651
Ticket 492: Integrate Area Median Income and Poverty measures into ETL (#660)
* Loading AMI and poverty data
2021-09-13 15:36:35 -05:00
Lucas Merrill Brown
65ceb7900f
Score F, testing methodology (#510)
* fixing dependency issue

* fixing more dependencies

* including fraction of state AMI

* wip

* nitpick whitespace

* etl working now

* wip on scoring

* fix rename error

* reducing metrics

* fixing score f

* fixing readme

* adding dependency

* passing tests;

* linting/black

* removing unnecessary sample

* fixing error

* adding verify flag on etl/base

Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
2021-08-24 16:40:54 -04:00
Nat Hillard
c1568e87c0
Data directory should adopt standard Poetry-suggested python package structure (#457)
* Fixes #456 - Our data directory should adopt standard python package structure
* a few missed references
* updating readme
* updating requirements
* Running Black
* Fixes for flake8
* updating pylint
2021-08-05 15:35:54 -04:00
Renamed from data/data-pipeline/etl/base.py (Browse further)