Commit graph

32 commits

Author SHA1 Message Date
Lucas Merrill Brown
e2641fe2a6
Issue 1992: Do not impute income for null population tracts (#1993) 2022-10-07 11:28:11 -04:00
Lucas Merrill Brown
6e6223cd5e
Issue 105: Configure and run black and other pre-commit hooks (clean branch) (#1962)
* Configure and run `black` and other pre-commit hooks

Co-authored-by: matt bowen <matthew.r.bowen@omb.eop.gov>
2022-10-04 18:08:47 -04:00
Lucas Merrill Brown
9fb9874a15
Issue 1910: Do not impute income for 0 population tracts (#1918)
* should be working, has unnecessary loggers

* removing loggers and cleaning up

* updating ejscreen tests

* adding tests and responding to PR feedback

* fixing broken smoke test

* delete smoketest docs
2022-09-26 11:00:21 -04:00
Emma Nechamkin
1c4d3e4142
Score tests (#1847)
* update Python version on README; tuple typing fix

* Alaska tribal points fix (#1821)

* Bump mistune from 0.8.4 to 2.0.3 in /data/data-pipeline (#1777)

Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* poetry update

* initial pass of score tests

* add threshold tests

* added ses threshold (not donut, not island)

* testing suite -- stopping for the day

* added test for lead proxy indicator

* Refactor score tests to make them less verbose and more direct (#1865)

* Cleanup tests slightly before refactor (#1846)

* Refactor score calculations tests

* Feedback from review

* Refactor output tests like calculatoin tests (#1846) (#1870)

* Reorganize files (#1846)

* Switch from lru_cache to fixture scorpes (#1846)

* Add tests for all factors (#1846)

* Mark smoketests and run as part of be deply (#1846)

* Update renamed var (#1846)

* Switch from named tuple to dataclass (#1846)

This is annoying, but pylint in python3.8 was crashing parsing the named
tuple. We weren't using any namedtuple-specific features, so I made the
type a dataclass just to get pylint to behave.

* Add default timout to requests (#1846)

* Fix type (#1846)

* Fix merge mistake on poetry.lock (#1846)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
Co-authored-by: Jorge Escobar <83969469+esfoobar-usds@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matt Bowen <83967628+mattbowen-usds@users.noreply.github.com>
Co-authored-by: matt bowen <matthew.r.bowen@omb.eop.gov>
2022-08-26 15:23:20 -04:00
Jorge Escobar
e539db86ab tuple type 2022-08-26 13:11:51 -04:00
Lucas Merrill Brown
4bf7773797
Issue 1827: Add demographics to tiles and download files (#1833)
* Adding demographics for use in sidebar and download files
2022-08-22 10:05:23 -04:00
Emma Nechamkin
d892bce6cf
Fast flag update (#1844)
Added additional flags for the front end based on our conversation in stand up this morning.
2022-08-19 13:14:44 -04:00
Emma Nechamkin
481a2a05f7
updated to fix linting errors (#1818)
Cleans and updates base branch
2022-08-11 16:34:56 -04:00
Emma Nechamkin
f047ca9d83 Imputing income using geographic neighbors (#1559)
Imputes income field with a light refactor. Needs more refactor and more tests (I spotchecked). Next ticket will check and address but a lot of "narwhal" architecture is here.
2022-08-11 12:33:45 -04:00
Jorge Escobar
7b05ee9c76
S3 Parallel Upload and Deletions (#1410)
* installation step

* trigger action

* installing to home dir

* dry-run

* pyenv

* py 2.8

* trying s4cmd

* removing pyenv

* poetry s4cmd

* num-threads

* public read

* poetry cache

* s4cmd all around

* poetry cache

* poetry cache

* install poetry packages

* poetry echo

* let's do this

* s4cmd install on run

* s4cmd

* ad aws back

* add aws back

* testing census api key and poetry caching

* census api key

* census api

* census api key #3

* 250

* poetry update

* poetry change

* check census api key

* force flag

* update score gen and tilefy; remove cached fips

* small gdal update

* invalidation

* missing cache ids
2022-03-17 23:19:23 -04:00
Emma Nechamkin
e7c7c0abeb
Updating higher education to be reversed (#1387)
Summary In this PR, we create a new variable so that the % college students is expressed as % not college students. This means that the front end can display % not college students.

Includes old variables so that this will not break fe.
2022-03-15 16:43:32 -04:00
Lucas Merrill Brown
43e005cc10
Issue 1075: Add refactored ETL tests to NRI (#1088)
* Adds a substantially refactored ETL test to the National Risk Index, to be used as a model for other tests
2022-02-08 19:05:32 -05:00
Jorge Escobar
d686bb856e
Download column order completed (#1077)
* Download column order completed

* Kameron changes

* Lucas and Beth column order changes

* cdc_places update

* passing score

* pandas error

* checkpoint

* score passing

* rounding complete - percentages still showing one decimal

* fixing tests

* fixing percentages

* updating comment

* int percentages! 🎉🎉

* forgot to pass back to df

* passing tests

Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
2022-01-13 15:04:16 -05:00
Lucas Merrill Brown
0d57dd572b
Stop swallowing Census API errors (#1051) 2021-12-16 10:54:41 -05:00
Lucas Merrill Brown
5a6d6d8557
Issue 954: Add various data sources from Child Opportunity Index (#986)
* Adds four fields:
    * Summer days above 90F
    * Percent low access to healthy food
    * Percent impenetrable surface areas
    * Low third grade reading proficiency

* Each of these four gets added into Definition L in various factors.

* Additionally, I add college attendance fields to the ETL for Census ACS.

* This PR also introduces the notion of "reverse percentiles", relevant to ticket #970.
2021-12-07 11:33:49 -05:00
Lucas Merrill Brown
d705a8244c
adding demographics information to ETL source data (#982) 2021-12-05 17:56:45 -05:00
Lucas Merrill Brown
c5dff6e5f7
Issue 242: Add HOLC Grades to data inputs (#978)
* Add mapping inequality data to data inputs

* Add mapping inequality data to comparison tool
2021-12-04 12:23:01 -05:00
Lucas Merrill Brown
1d101c93d2
Issue 844: Add island areas to Definition L (#957)
This ended up being a pretty large task. Here's what this PR does:

1. Pulls in Vincent's data from island areas into the score ETL. This is from the 2010 decennial census, the last census of any kind in the island areas.
2. Grabs a few new fields from 2010 island areas decennial census.
3. Calculates area median income for island areas.
4. Stops using EJSCREEN as the source of our high school education data and directly pulls that from census (this was related to this project so I went ahead and fixed it).
5. Grabs a bunch of data from the 2010 ACS in the states/Puerto Rico/DC, so that we can create percentiles comparing apples-to-apples (ish) from 2010 island areas decennial census data to 2010 ACS data. This required creating a new class because all the ACS fields are different between 2010 and 2019, so it wasn't as simple as looping over a year parameter.
6. Creates a combined population field of island areas and mainland so we can use those stats in our comparison tool, and updates the comparison tool accordingly.
2021-12-03 15:46:10 -05:00
Lucas Merrill Brown
5c65eed28f
Issue 838: Update comparison tool to use tracts (#934)
* Updating comparison tool to use tracts, and rely more heavily on `field_names`
2021-11-30 18:46:29 -05:00
Lucas Merrill Brown
d2352c6217 Fix too many tracts in join error in ACS (#933) 2021-11-30 13:49:21 -05:00
Shelby Switzer
893758f1d4 Use tract instead of block group when calling census API (#901)
* Use tract instead of block group when calling census API

* fixing merge conflicts

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
Co-authored-by: lucasmbrown-usds <lucas.m.brown@omb.eop.gov>
2021-11-30 13:49:20 -05:00
Jorge Escobar
0a21fc6b12
Add territory boundary data (#885)
* Add territory boundary data

* housing and transp

* lint

* lint

* lint
2021-11-16 10:05:09 -05:00
Lucas Merrill Brown
05ebf9b48c
Add median house value to Definition L (#882)
* Added house value to ETL

* Adding house value to score formula and comp tool
2021-11-13 10:29:23 -05:00
Lucas Merrill Brown
e94d05882c
Issue 675 & 676: Adding life expectancy and DOE energy burden data (#683)
* Adding two new data sources.
2021-09-15 09:59:28 -05:00
Lucas Merrill Brown
1083e953da
Prototype G (#672)
* wip

* cleanup

* cleanup 2

* fixing import ordering linter error

* updating backend to use score G

* adding percentile to score output

* update tippeanoe compression

Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
2021-09-14 10:48:11 -04:00
Lucas Merrill Brown
7d13be7651
Ticket 492: Integrate Area Median Income and Poverty measures into ETL (#660)
* Loading AMI and poverty data
2021-09-13 15:36:35 -05:00
Shelby Switzer
ac62933d16
Initial refactor for Score ETL (#618)
* WIP refactor

* Exract score calculations into their own methods

* do all initial df prep in single method

* Fix error in docs for running etl for single dataset

* WIP understanding HUD and linguistic iso data

* Add comments from initial group review on PR

Co-authored-by: Shelby Switzer <shelby.switzer@cms.hhs.gov>
2021-09-10 10:34:34 -04:00
Lucas Merrill Brown
65ceb7900f
Score F, testing methodology (#510)
* fixing dependency issue

* fixing more dependencies

* including fraction of state AMI

* wip

* nitpick whitespace

* etl working now

* wip on scoring

* fix rename error

* reducing metrics

* fixing score f

* fixing readme

* adding dependency

* passing tests;

* linting/black

* removing unnecessary sample

* fixing error

* adding verify flag on etl/base

Co-authored-by: Jorge Escobar <jorge.e.escobar@omb.eop.gov>
2021-08-24 16:40:54 -04:00
Jorge Escobar
773c035493
AWS Sync Public Read (#508)
* adding layer to mvts

* small fix for GHA

* AWS Sync Public Read

* removed temp file

* updated state media income ftp
2021-08-12 14:17:25 -04:00
lucasmbrown-usds
ce5e8c5351 including fraction of state AMI 2021-08-09 21:30:41 -05:00
lucasmbrown-usds
4ae7eff4c4 adding median income field and running black 2021-08-09 20:47:51 -05:00
Nat Hillard
c1568e87c0
Data directory should adopt standard Poetry-suggested python package structure (#457)
* Fixes #456 - Our data directory should adopt standard python package structure
* a few missed references
* updating readme
* updating requirements
* Running Black
* Fixes for flake8
* updating pylint
2021-08-05 15:35:54 -04:00